Fraud in Online Surveys

by: Justin Thomas

Types of Fraud

The incidence of fraud in online surveys is a growing problem. Fraud rates can be as high as 15% to 25% in some sources of online samples.

The research industry must minimize respondent fraud to protect the integrity of its survey results and safeguard the public’s trust in survey research findings. This blog post explores some common types of fraud and outlines some approaches to minimizing it.

Fraudulent responses in online surveys generally fall into three categories: accidental, intentional, and intentional-but-it's-complicated.

Accidental “fraud” occurs when survey participants are not aware they are taking the same survey multiple times. This happens when online surveys do not block duplicate entries. It can also happen when a respondent belongs to multiple online panels that all send out the same survey. In all of these cases, the respondents are not actually committing fraud. They are simply good people accidentally participating more than once in the same survey.

Intentional fraud, on the other hand, occurs when a survey participant deliberately tries to complete a survey multiple times, or provides inaccurate answers to survey questions. The motive is either to alter the results of the survey or to reap extra financial incentives. These individuals routinely join and participate in as many online panels as they can. They set their browsers to clear cookies and history after each survey attempt, frequently use “incognito” modes, and use VPN services (commercial and free) to circumvent geographic restrictions. Online surveys that offer bigger incentives tend to attract more intentional fraud. These are individuals you definitely do not want in your online surveys.

Intentional-but-it's-complicated fraud occurs when survey participants attempt to complete as many surveys as possible, as accurately as possible. Unlike intentional fraudsters, these participants are not completing the same survey multiple times; rather, they are trying to complete as many different surveys as swiftly and efficiently as possible. These individuals are sometimes referred to as “professional” survey participants. It is still unclear whether “professional” survey participants are always a problem, but they should definitely be identified and undergo additional scrutiny.

Common Components of a Fraud Defense

The first component of a fraud defense is geolocation. Geolocation can be provided by the device itself (mobile devices, etc.), by a previous data point created by survey participants when signing up, or via their computer’s IP address. Geolocation results can generally be trusted at the country level, but they tend to become less accurate for smaller geographic areas (such as states).
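As a rough illustration of how the three geolocation sources above might be cross-checked, here is a minimal Python sketch. It assumes each source has already been resolved to a country code elsewhere (the IP lookup itself is not shown), and the function name and the "device", "signup", and "ip" keys are hypothetical. It compares only at the country level, since that is where the results can generally be trusted.

```python
from typing import Optional


def country_mismatch(signals: dict[str, Optional[str]]) -> bool:
    """Return True if the available geolocation sources disagree on country.

    `signals` maps a source name (hypothetical keys such as "device",
    "signup", "ip") to a country code, or None if that source is missing.
    """
    # Normalize case and whitespace, and ignore sources with no value.
    countries = {code.strip().upper() for code in signals.values() if code}
    return len(countries) > 1
```

A mismatch is a reason for a closer look rather than proof of fraud; a traveling respondent would trigger it too.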

Another widely used method for tracking survey participants is a device/browser fingerprint. This “digital” fingerprint is built up from components and properties of the browser, such as fonts installed, plugins registered, screen size, color depth, and many other variables that, taken together, can uniquely identify a computer, tablet, or smartphone.
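To make the idea concrete, here is a minimal Python sketch of turning a set of browser properties into a fingerprint and spotting repeats. It is a simplification under stated assumptions: the attribute names are hypothetical, the properties are assumed to have been collected in the browser already, and real fingerprinting tools use many more signals than this.

```python
import hashlib
import json


def device_fingerprint(attributes: dict) -> str:
    """Return a stable SHA-256 hex digest for a dict of browser properties."""
    # Sorting the keys makes the same properties serialize the same way
    # regardless of the order in which they were collected.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(responses: list[dict]) -> dict[str, list[str]]:
    """Group respondent IDs by fingerprint; keep fingerprints seen more than once.

    Each response is assumed to look like
    {"respondent_id": "...", "attributes": {...}} (hypothetical layout).
    """
    groups: dict[str, list[str]] = {}
    for response in responses:
        fingerprint = device_fingerprint(response["attributes"])
        groups.setdefault(fingerprint, []).append(response["respondent_id"])
    return {fp: ids for fp, ids in groups.items() if len(ids) > 1}
```

List-valued properties such as fonts should be sorted before hashing, since the sketch treats a different list order as a different device.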

Another strategy is to include trap questions (i.e., cheater traps) in an online survey. These questions offer nonsensical answer choices that can catch a participant who is rushing through the survey.
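Checking trap questions after the fact can be very simple. The sketch below assumes each trap has exactly one acceptable answer; the question IDs, the example wording in the comments, and the function name are all made up for illustration.

```python
# Hypothetical trap questions: question ID -> the only acceptable answer.
TRAP_ANSWERS = {
    # e.g. "Please select 'Strongly disagree' for this item."
    "q7_attention": "strongly disagree",
    # e.g. a usage question about a brand that does not exist.
    "q12_fake_brand": "never heard of it",
}


def failed_traps(response: dict) -> list[str]:
    """Return the IDs of trap questions this respondent got wrong or skipped."""
    failed = []
    for question_id, expected in TRAP_ANSWERS.items():
        # Compare case-insensitively and ignore stray whitespace.
        given = str(response.get(question_id, "")).strip().lower()
        if given != expected:
            failed.append(question_id)
    return failed
```

How many failed traps should disqualify a respondent is a judgment call this sketch leaves open.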

Cookies are also used to track respondents across and within online surveys. Cookies are widely used to prevent duplicate survey entries and are easy to implement.
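Here is one way a cookie-based duplicate check might look, using Python's standard http.cookies module. The cookie name and the "|"-separated list of survey IDs are assumptions made for this sketch, not a description of any particular survey platform.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "completed_surveys"  # hypothetical cookie name
ONE_YEAR_SECONDS = 60 * 60 * 24 * 365


def _completed_ids(cookie_header: str) -> list[str]:
    """Parse the request's Cookie header and return the survey IDs stored in it."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie:
        return []
    return cookie[COOKIE_NAME].value.split("|")


def already_completed(cookie_header: str, survey_id: str) -> bool:
    """Return True if this browser's cookie says the survey was already taken."""
    return survey_id in _completed_ids(cookie_header)


def mark_completed(cookie_header: str, survey_id: str) -> str:
    """Return a Set-Cookie header value that records survey_id as completed."""
    done = _completed_ids(cookie_header)
    if survey_id not in done:
        done.append(survey_id)
    out = SimpleCookie()
    out[COOKIE_NAME] = "|".join(done)
    out[COOKIE_NAME]["max-age"] = ONE_YEAR_SECONDS
    out[COOKIE_NAME]["path"] = "/"
    return out[COOKIE_NAME].OutputString()
```

A check like this mainly stops accidental duplicates. A respondent who clears cookies or uses an incognito mode will look like a new visitor.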

Less Common Components of a Fraud Defense

Two less common and harder-to-interpret components of a fraud defense are survey metadata and open-end verbatim analysis. Survey metadata includes details like how long survey participants took to complete a particular survey or question, how long they spent writing an answer to an open-end question, or how many times they changed their answers. What counts as normal for these measures tends to vary survey by survey, so determining what constitutes fraud from metadata requires some human legwork.
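One common way to use completion-time metadata is to flag "speeders" relative to the typical respondent in the same survey. The sketch below does that in Python. The one-third-of-median cutoff is an assumption chosen for illustration and would need tuning for each survey.

```python
from statistics import median


def flag_speeders(durations: dict[str, float], fraction: float = 0.33) -> list[str]:
    """Return respondent IDs whose completion time is below fraction * median.

    `durations` maps respondent ID -> seconds spent on the survey.
    The default fraction is an illustrative assumption, not a standard.
    """
    if not durations:
        return []
    # Using the survey's own median makes the cutoff adapt to its length.
    cutoff = fraction * median(durations.values())
    return sorted(rid for rid, seconds in durations.items() if seconds < cutoff)
```

For example, with completion times of 600, 540, 660, and 90 seconds, the median is 570 seconds, the cutoff is about 188 seconds, and only the 90-second respondent is flagged for review.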

The same is true of the analysis of open-end questions. While sentiment analysis is commonplace now, it is fraught with misclassifications for anything but the simplest of cases. Going a step further and performing more sophisticated analysis of open-ends to identify fraudulent responses is even more difficult. However, there are some simple things that can be done to help check for fraudulent open-ends. Checking for gibberish, repetitive answers, and off-topic responses can help identify fraud.
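The three simple checks just mentioned can be roughed out in a few lines of Python. These are deliberately crude heuristics under stated assumptions: the vowel-ratio threshold, the keyword list, and all function names are made up for illustration, and the gibberish check assumes English text.

```python
import re


def looks_like_gibberish(text: str, min_vowel_ratio: float = 0.2) -> bool:
    """Crude check: real English words rarely have very few vowels."""
    letters = re.findall(r"[a-z]", text.lower())
    if not letters:
        return True  # empty, or only digits and punctuation
    vowels = sum(ch in "aeiou" for ch in letters)
    return vowels / len(letters) < min_vowel_ratio


def repeated_answers(answers: dict[str, str]) -> dict[str, list[str]]:
    """Group respondent IDs that gave the same open-end text after normalizing."""
    groups: dict[str, list[str]] = {}
    for respondent_id, text in answers.items():
        # Lowercase and collapse whitespace so trivial variations still match.
        normalized = " ".join(text.lower().split())
        groups.setdefault(normalized, []).append(respondent_id)
    return {text: ids for text, ids in groups.items() if len(ids) > 1}


def is_off_topic(text: str, topic_keywords: set[str]) -> bool:
    """Return True if the answer mentions none of the expected keywords."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return words.isdisjoint(topic_keywords)
```

Each check will produce false positives (a short, valid answer may contain none of the keywords), so flagged responses are best treated as candidates for human review.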

Final Thoughts

Catching a fraudulent respondent requires multiple layers of defense; in the final analysis, it usually comes down to a judgment call. It’s a “cat and mouse” game that the research industry and all research practitioners must play to preserve the purity of survey responses and help maintain public trust in the data our industry provides. Caveat emptor!

Author

Justin W. Thomas

Head of Software Development

Justin is primarily responsible for architecting and implementing interactive data analysis platforms, client-facing dashboards, and automated reporting solutions. In a secondary capacity, Justin collaborates with the rest of the IT staff on implementing open-source software solutions at Decision Analyst. Prior to Decision Analyst, he worked as a Programmer Analyst at Perot Systems based in Plano, Texas. Justin holds a Bachelor of Science in Management Information Systems from the University of Texas at Arlington and has spent over 20 years developing software.
