When you encounter statistics in an article, in a report from your marketing team, or on social media, they often include a margin of error. For example, a political poll might estimate that one candidate will get 58 percent of the vote "plus or minus 2.8 percent." The margin of error is one of the most important—and least attended to—aspects of statistics.
In business, statistical analysis is used every day to inform decisions, conduct market research, and make predictions for the future. A specific type of market research that businesses conduct is A/B testing, in which a controlled mini-experiment is run to compare the outcomes of two different choices.
To draw accurate conclusions, the data needs to be correct and the margin of error low. Otherwise, your conclusions might be unreliable.
Here’s a look at what the margin of error is and how you can decrease it when running A/B tests.
What Is the Margin of Error and Why Is It Important?
The margin of error represents the degree of error in random sampling results.
A/B tests are conducted using a sample of the population, and the margin of error tells you how many percentage points your results may differ from the actual population value.
A low margin of error generally tells you that your results are likely to represent the entire population, whereas a high margin of error means that your results might not be accurate when extrapolated to the entire population.
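To make the idea concrete, here is a minimal Python sketch of how a margin of error is commonly approximated for a percentage, such as the poll result mentioned earlier. It assumes a simple random sample and a 95 percent confidence level. The sample size of 1,200 respondents and the function name are assumptions chosen for illustration, not figures from an actual poll.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat: float, n: int, confidence: float = 0.95) -> float:
    """Approximate margin of error for a sample proportion (normal approximation)."""
    # Two-sided critical value, about 1.96 for 95 percent confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 58 percent support among 1,200 respondents (sample size assumed).
moe = margin_of_error(0.58, 1200)
print(f"58% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions, the result comes out to roughly 2.8 percentage points, which is the kind of figure you see reported alongside poll results.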
Let’s look at an example of this in action for an A/B test.
Margin of Error in an A/B Test
A team at an online retail company might run an A/B test during which they randomly show a subset of their customers one version of a product page and the remaining customers a different version. The team is trying to see if specific aspects of the page affect how much money people spend on the product.
Perhaps people who see the first page spend an average of $28, and people who see the second page spend an average of $35. If we know that someone saw the first page, and we know nothing else about them, our best guess would be that they spent $28. The gap between what that person actually spent and $28 is the error in our guess, and the margin of error describes how large that gap is likely to be. Similarly, for people who see the second page, the gap between actual spending and $35 is the error in that guess.
Some variability is always expected across the individuals in a sample, so there is likely also some difference between the two sample groups in the A/B test, just by chance. If the errors are distributed in a predictable manner (usually in a bell-shaped curve, or normal distribution), you can estimate how much difference there should be between the two groups if the page had no effect. If the observed difference is greater than that estimate, you have reason to conclude that the difference is due to which page they saw.
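The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not a full statistical test: the spending data is simulated, and the spread of spending (a standard deviation of $10) and the group size (200 visitors per page) are assumptions. Only the $28 and $35 averages come from the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(7)  # fixed seed so the simulation is repeatable

# Simulated spending per visitor; the spread (10) and group size (200) are assumed.
page_a = [random.gauss(28, 10) for _ in range(200)]
page_b = [random.gauss(35, 10) for _ in range(200)]

diff = mean(page_b) - mean(page_a)

# Standard error of the difference between two independent sample means.
se = math.sqrt(stdev(page_a) ** 2 / len(page_a) + stdev(page_b) ** 2 / len(page_b))

# Largest difference you would expect from chance alone (95 percent level)
# if the page had no effect, assuming normally distributed errors.
z = NormalDist().inv_cdf(0.975)
chance_threshold = z * se

print(f"Observed difference: ${diff:.2f}")
print(f"Largest difference expected by chance: ${chance_threshold:.2f}")
print("Beyond chance" if abs(diff) > chance_threshold else "Within chance")
```

If the observed difference is larger than the chance threshold, the page version is a plausible explanation. If it is smaller, the test has not shown that the pages differ.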
Luckily, there are certain things you can do to minimize the margin of error in A/B tests so you can collect accurate data and maintain confidence in your results.
How to Minimize Margin of Error
1. Increase Sample Size
When you conduct an A/B test, the participants represent only a subset of the entire population. The smaller that sample, the larger the margin of error. As your sample size grows, your results become more likely to be representative of the entire population, although the gains taper off: the margin of error shrinks roughly with the square root of the sample size.
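A short Python sketch shows how sample size and margin of error relate for an average, such as average spending. The standard deviation of $10 is an assumed value for illustration, and the function name is hypothetical.

```python
import math

Z_95 = 1.96  # critical value for 95 percent confidence
SPEND_SD = 10.0  # assumed standard deviation of spending, in dollars


def margin_of_error_mean(sd: float, n: int, z: float = Z_95) -> float:
    """Approximate margin of error for a sample mean."""
    return z * sd / math.sqrt(n)


for n in (100, 400, 1600):
    print(f"n={n:>4}: plus or minus ${margin_of_error_mean(SPEND_SD, n):.2f}")
```

With these assumed numbers, the margin of error goes from $1.96 to $0.98 to $0.49. Each time the sample size quadruples, the margin of error is cut in half.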
2. Consider Missing Variables
In an A/B test, there are many variables you might not be taking into consideration. In the aforementioned example, there are a large number of variables that could influence customer spending, including time of year, economic climate, individual information (such as income), and computer-related issues (such as how the individual found the site and how fast their Internet connection is).
If these variables can be easily collected and considered, you can make a more informed prediction, decreasing your margin of error.
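As a rough illustration, the Python sketch below compares how far off your guesses are when you ignore income versus when you take it into account. The data is simulated, and every number in it (the income range, the strength of the relationship, and the amount of noise) is an assumption made up for the example.

```python
import random
from statistics import mean, stdev

random.seed(11)  # fixed seed so the simulation is repeatable

# Simulated customers: spending depends partly on income (all numbers are assumed).
incomes = [random.uniform(30, 120) for _ in range(500)]  # thousands of dollars
spending = [10 + 0.25 * x + random.gauss(0, 5) for x in incomes]

# Fit a least-squares line: spending is approximately intercept + slope * income.
x_bar, y_bar = mean(incomes), mean(spending)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(incomes, spending)) / sum(
    (x - x_bar) ** 2 for x in incomes
)
intercept = y_bar - slope * x_bar

# Errors left over once income is taken into account.
residuals = [y - (intercept + slope * x) for x, y in zip(incomes, spending)]

error_without_income = stdev(spending)  # guessing the overall average for everyone
error_with_income = stdev(residuals)    # guessing from each customer's income

print(f"Typical error ignoring income: ${error_without_income:.2f}")
print(f"Typical error using income:    ${error_with_income:.2f}")
```

In this simulated data, the guesses that use income are closer to actual spending. How much a variable helps in practice depends on how strongly it relates to the outcome you are measuring.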
3. Clean Up Mistakes and Remove Bad Data
Mistakes happen and can dramatically increase the margin of error. For instance, maybe a shopper wanted to buy two items, but accidentally added 22 to their cart. Maybe the analytics engine was configured incorrectly, or the dataset got corrupted somewhere along the way through human error or a technical problem.
You can minimize the effect of mistakes by taking time to review your data. This process is referred to as data wrangling and involves “cleaning” data to be more relevant and useful for its intended purpose. For your A/B test, removing outliers and any incorrect data is imperative to ensure the margin of error is slim.
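One simple way to spot suspicious values is the interquartile range (IQR) rule, sketched below in Python. The order data is made up for illustration, with the 22 standing in for the accidental cart entry described above. The IQR rule is a common rule of thumb rather than the only option, and a flagged value is not necessarily a mistake, so it is worth reviewing flagged values before removing them.

```python
from statistics import quantiles

# Items per order (made-up data); the 22 stands in for the accidental entry.
items_per_order = [1, 2, 1, 3, 2, 1, 2, 22, 1, 2]

# Quartiles of the data, then "fences" set 1.5 IQRs beyond the middle half.
q1, _, q3 = quantiles(items_per_order, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

flagged = [x for x in items_per_order if x < low or x > high]
cleaned = [x for x in items_per_order if low <= x <= high]

print("Flagged for review:", flagged)
print("Kept:", cleaned)
```

Here, only the order of 22 items falls outside the fences, while the order of three items is kept.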
4. Look for Misleading or False Information
Although it can be tricky to know the motivations of each individual in your A/B test, be on the lookout for any misleading or false information they may provide. For instance, maybe the person visiting your site is from a competing retailer and has no intention of buying the product—they’re just visiting the site to do research on the competition. While this source of error is relatively uncommon in behavioral data (such as purchasing a product), it’s common in self-reported data.
Survey respondents can lie about personal information, such as their behavior, political beliefs, age, and education. You may be able to correct for this by looking for strange or anomalous cases and data wrangling, just as you'd do to catch mistakes. You could also use a self-report scale that estimates various types of misleading information.
5. Eliminate Bias
Although individual variability is inevitable, there are some types of error you can control for. These non-random biases fall under the category of systematic error.
Systematic error leads to biased data, which will give you poor results. For example, if you decide to run one version of the product page for a month, and the other version the next month, the data may be biased based on time. If the first month is December and the second is January, or if there’s a major change to the stock market toward the end of the first month, the comparison won't be valid. That's because the people who see the two pages differ systematically.
Therefore, differences in spending between the pages are not due only to random chance or to the pages themselves; some of the difference is due to systematic bias. This makes it impossible to determine how much is due to the difference between the pages. The best way to address this is through good study design. In this example, simply run both versions of the product page simultaneously, assigning visitors to a version at random, to eliminate this systematic bias.
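Running both versions at the same time means each visitor needs to be assigned to a page in a way that has nothing to do with who they are or when they arrive. One common approach, sketched below in Python, is to hash a visitor ID so the assignment is effectively random but stays the same on repeat visits. The function name, experiment label, and visitor IDs are hypothetical.

```python
import hashlib


def assign_page(visitor_id: str, experiment: str = "product-page-test") -> str:
    """Assign a visitor to page 'A' or 'B' with roughly equal odds."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).digest()
    # An even first byte maps to page A, an odd first byte to page B.
    return "A" if digest[0] % 2 == 0 else "B"


# Hypothetical visitor IDs, all assigned during the same time period.
visitors = [f"visitor-{i}" for i in range(10_000)]
share_a = sum(assign_page(v) == "A" for v in visitors) / len(visitors)
print(f"Share assigned to page A: {share_a:.1%}")
```

Because the assignment depends only on the ID, both pages are shown during the same period and to similar mixes of people, so factors like time of year affect both groups equally.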
Maintaining Confidence in Your Results
Although it's impossible to completely eliminate error, well-designed research keeps the margin of error as small as possible and lets you know how confident you can be in the results. It's also important to know how to analyze your results so you can identify outliers and other factors that might be contributing to error.
Learning business analytics and data science skills can help you to conduct more accurate and effective A/B tests so that you can make more data-driven business decisions.
This post was updated on June 2, 2021. It was originally published on May 23, 2017.