by Jerry W. Thomas
The world is awash in data from surveys of all types. The rise of low-cost, do-it-yourself survey tools has added to the flood of survey data. We can scarcely buy a toothpick without a follow-on survey to measure how happy we are with the toothpick. All of these surveys and the data they generate (often using relatively large samples, n > 1,000) tend to create a false sense of accuracy, based on the calculated standard error.
The standard error is a widely accepted measure of sampling error, and it is typically the basis for the footnote “the accuracy of this survey is 5 percentage points, plus or minus, at a 95% level of confidence” found in research reports or survey research results in newspapers, magazines, or websites.
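The footnote quoted above can be reproduced with the usual textbook formula for a proportion from a simple random sample. The sketch below is illustrative only: the function name `margin_of_error`, the worst-case proportion of 50%, and the sample sizes are assumptions chosen for the example, not figures from any particular survey.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a proportion, assuming a simple random sample.

    n: number of completed surveys
    p: assumed proportion (0.5 is the worst case, giving the widest margin)
    z: z-score for the confidence level (1.96 is roughly 95%)
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    return z * standard_error


# Hypothetical sample sizes, for illustration only
for n in (384, 1000, 2000):
    points = 100 * margin_of_error(n)
    print(f"n={n}: plus or minus {points:.1f} percentage points")
```

Under these assumptions, a sample of roughly 384 yields the familiar "5 percentage points, plus or minus," and a sample of 1,000 yields about 3 points. Note that the calculation says nothing about who was eligible for the sample in the first place.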
The standard error assumes:
- A sample chosen by purely random methods from among the members of the target universe.
- All potential respondents do, in fact, respond to and participate in the survey (i.e., no nonresponse bias).
If these basic assumptions are met (and they rarely are), the standard error gives us a reasonably accurate measure of the sampling error in our survey data. However, there are other sources of survey error. Some of the most common are universe error and sample screening error.
If a mistake is made in defining the universe for a survey, the results can be very inaccurate. For example, if the universe for a liquor survey is defined as males 21 to 29 based on a belief that younger males account for the bulk of liquor sales, the survey results could be completely misleading. The truth is that people aged 50 and older account for a large share of liquor sales. This is a good example of a universe definition error.
Another example: For many years, a U.S. automotive manufacturer defined its sampling universe as U.S. drivers who did not prefer Japanese or German cars over American-produced cars. This misdefinition of the sampling universe made all of the survey results appear to be much more positive (compared to a random sample). The auto manufacturer’s executives could feel good about all the positive results coming in from their research department. It’s easy to envision the huge level of error introduced into this company’s surveys by omitting consumers who preferred Japanese or German cars. Universe definition errors, in most cases, will change survey results by large amounts (usually much larger changes than those related to sampling error).
Another example of universe error is what happens in a typical customer satisfaction tracking survey. People who are unhappy with the company's services or products stop buying, so they are no longer customers. The company's executives are happy because satisfaction survey results are gradually getting better month by month, as unhappy customers drop out of the sample by becoming noncustomers. It is not uncommon for a company with declining customer counts to see its customer satisfaction ratings going up. The universe is changing over time, as unhappy customers leave.
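The effect described above can be sketched with a tiny simulation. Everything in it is hypothetical: the 10-point ratings, the rule that customers rating below 5 are the ones who leave, and the pace of two departures per month are assumptions made only to show the mechanism.

```python
def track_satisfaction(ratings, months, churn_threshold=5, leavers_per_month=2):
    """Survey the remaining customers each month, then let the unhappiest leave.

    Returns a list of (month, customer_count, average_rating) tuples.
    """
    customers = sorted(ratings)  # lowest (unhappiest) ratings first
    history = []
    for month in range(1, months + 1):
        average = sum(customers) / len(customers)
        history.append((month, len(customers), round(average, 2)))
        # Hypothetical churn rule: only customers below the threshold leave,
        # at most `leavers_per_month` of them, unhappiest first.
        unhappy = [r for r in customers if r < churn_threshold]
        leaving = min(leavers_per_month, len(unhappy))
        customers = customers[leaving:]
    return history


# Hypothetical starting customer base, rated on a 10-point scale
ratings = [2, 3, 3, 4, 4, 6, 7, 7, 8, 8, 9, 9]
history = track_satisfaction(ratings, months=4)

for month, count, average in history:
    print(f"month {month}: {count} customers, average satisfaction {average}")
```

No individual customer in this sketch ever becomes happier. The average rises only because the universe of "current customers" shrinks from the bottom.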
Sample Screening Error
Sample screening errors are similar to universe definition errors in that we end up with a group that’s not representative of the target universe. Most surveys consist of two parts: a screener to filter out consumers who don’t qualify for the survey and a questionnaire for those who do qualify. Sample screening errors are common, and can introduce large errors into survey results. For example, let’s suppose a company wanted to survey people likely to buy a new refrigerator in the next three years. Such a survey might screen out consumers who purchased a new refrigerator in the past three years on the assumption these individuals are “out of the market.” Nothing could be further from the truth. That refrigerator buyer might decide to buy a second refrigerator for the home, or might buy a second home that needs a new refrigerator, or might buy a refrigerator for her adult daughter. That buyer might have a growing family that demands the purchase of a new, larger refrigerator, or Momma might just get tired of the old refrigerator and want a new one. A final sample that excluded these past buyers of refrigerators could produce inaccurate results.
Another example of sample screening error: Let's suppose a candy brand is completing 500 surveys a month to track its brand and advertising awareness. The brand decides to screen for people who have purchased candy in the past 30 days, and only these individuals are accepted into the survey. The month-to-month survey results look stable, but suddenly in late October and early November, brand awareness and advertising awareness decline. The brand manager is dismayed because this is the peak selling season for candy. The problem is the past-30-days screening requirement. During the time around Halloween (the peak candy season), a much larger share of households buy candy. As the infrequent buyers of candy flood into the market during Halloween (and flood into the tracking survey), brand and advertising awareness fall because these infrequent candy buyers don't know as much about candy and aren't as interested in candy as the regulars who buy candy every month of the year.
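The arithmetic behind the Halloween dip is a simple weighted average. In the sketch below, the awareness levels (60% among regular buyers, 30% among infrequent buyers) and the sample mixes (80/20 in a typical month, 50/50 around Halloween) are made-up numbers used only to illustrate the effect.

```python
def blended_awareness(regular_share, regular_awareness, infrequent_awareness):
    """Overall awareness in a sample that mixes regular and infrequent buyers."""
    infrequent_share = 1 - regular_share
    return (regular_share * regular_awareness
            + infrequent_share * infrequent_awareness)


# Hypothetical awareness levels, held constant within each group all year
REGULAR_AWARENESS = 0.60
INFREQUENT_AWARENESS = 0.30

# Hypothetical share of regular buyers among those who pass the 30-day screener
typical_month = blended_awareness(0.80, REGULAR_AWARENESS, INFREQUENT_AWARENESS)
halloween = blended_awareness(0.50, REGULAR_AWARENESS, INFREQUENT_AWARENESS)

print(f"typical month: {typical_month:.0%} aware")
print(f"Halloween:     {halloween:.0%} aware")
```

Awareness within each group never changes in this sketch. The tracked number falls only because the screener admits a different mix of people at Halloween.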
It’s generally a good idea to keep your target universe as broad and inclusive as possible. It’s wise to be very, very careful about who you screen out of a survey, because you might be biasing the final results in unexpected ways. It’s usually better to keep people in your survey. You can always run cross-tabs to look at each group separately.
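As a rough illustration of the cross-tab approach, the sketch below keeps recent refrigerator buyers in the sample and simply flags them, so each group can be examined separately afterward. The field names (`bought_recently`, `plans_to_buy`) and the eight responses are invented for the example.

```python
from collections import defaultdict

# Hypothetical survey responses: recent buyers are kept and flagged,
# rather than screened out.
responses = [
    {"bought_recently": True, "plans_to_buy": True},
    {"bought_recently": True, "plans_to_buy": False},
    {"bought_recently": True, "plans_to_buy": False},
    {"bought_recently": False, "plans_to_buy": True},
    {"bought_recently": False, "plans_to_buy": True},
    {"bought_recently": False, "plans_to_buy": True},
    {"bought_recently": False, "plans_to_buy": False},
    {"bought_recently": False, "plans_to_buy": False},
]


def crosstab(rows, group_key, answer_key):
    """Count answers to `answer_key` separately for each value of `group_key`."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        counts[row[group_key]][row[answer_key]] += 1
    return {group: dict(answers) for group, answers in counts.items()}


table = crosstab(responses, "bought_recently", "plans_to_buy")
for group, answers in table.items():
    total = sum(answers.values())
    planning = answers.get(True, 0)
    print(f"bought_recently={group}: {planning} of {total} plan to buy")
```

In this made-up data, one of the three recent buyers still plans to buy, which is exactly the kind of respondent a screener would have thrown away.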
About the Author
Jerry W. Thomas (firstname.lastname@example.org) is President/CEO of Decision Analyst. He may be reached at 1-800-262-5974 or 1-817-640-6166.
Copyright © 2017 by Decision Analyst, Inc.
This posting may not be copied, published, or used in any way without written permission of Decision Analyst.