Offering an incentive to survey respondents can provide a major boost in response rates. However, how can we be sure that respondents aren’t quickly clicking through the survey’s questions to get to the prize at the end? What if some respondents lie in order to qualify for the survey? Is there any way to identify respondents who are answering untruthfully instead of carefully reading the questions and providing honest answers?

Though there is no one method that will guarantee the removal of all inaccurate responses from a survey, there are a variety of controls used for improving data quality. It is a careful balancing act of removing as many “bad” responses as possible without removing too many “good” responses. Research suggests that the optimal strategy employs four quality control criteria, using a mix of trap questions and analysis of responses on the back end, and removes any responses that fail two or more of the criteria [1]. Even respondents with good intentions can fail one of the data quality checks, but the chance that they will fail multiple checks is much lower.
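
The "fail two or more checks" rule can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the check names, the respondent IDs, and the sample results are all hypothetical, and the sketch assumes each check has already produced a simple pass/fail flag.

```python
from typing import Dict, List

# Assumed rule from the text: drop a response that fails two or more checks.
FAIL_THRESHOLD = 2


def count_failures(checks: Dict[str, bool]) -> int:
    """Count failed checks; True means the respondent failed that check."""
    return sum(1 for failed in checks.values() if failed)


def should_remove(checks: Dict[str, bool], threshold: int = FAIL_THRESHOLD) -> bool:
    """Return True when the number of failed checks reaches the threshold."""
    return count_failures(checks) >= threshold


# Hypothetical results for three respondents across four checks.
results: Dict[str, Dict[str, bool]] = {
    "r1": {"contradiction": False, "rare_activities": False, "speeder": False, "open_ended": False},
    "r2": {"contradiction": False, "rare_activities": False, "speeder": True, "open_ended": False},
    "r3": {"contradiction": True, "rare_activities": False, "speeder": True, "open_ended": False},
}

kept: List[str] = [rid for rid, checks in results.items() if not should_remove(checks)]
print(kept)  # ['r1', 'r2']
```

Respondent r2 fails one check and is kept, which reflects the point that a well-intentioned respondent can trip a single control.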

Trap Questions
Trap questions come in many formats, and a few types stand out as more efficient than others [2]. One type is a matrix question that includes two seemingly opposite statements. If someone is straight-lining (selecting the same answer for every row), their answers will appear contradictory. For example, if a respondent chooses “Completely agree” for both “I am an avid fan of the team” and “I have no interest in watching the team’s games”, that suggests they are not carefully reading each statement.
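
As a rough sketch, the contradiction check might look like the following. It assumes a 5-point agreement scale coded 1 to 5, with 5 meaning “Completely agree”; the function name and the coding are illustrative assumptions, not part of any survey platform.

```python
def fails_contradiction_trap(fan_answer: int, no_interest_answer: int, top_of_scale: int = 5) -> bool:
    """Flag a respondent who completely agrees with two opposing statements.

    Assumes both answers are coded on the same scale, where `top_of_scale`
    is the code for "Completely agree".
    """
    return fan_answer == top_of_scale and no_interest_answer == top_of_scale


print(fails_contradiction_trap(5, 5))  # True: agrees with both opposing statements
print(fails_contradiction_trap(5, 1))  # False: answers are consistent
```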

Another type of trap question provides a list of uncommon activities and asks the participant to select which of them they have done in the past week; a respondent fails this control by choosing three or more activities. For example, it is extremely unlikely that someone has surfed, watched an Indian cricket match on TV, and attended a WNBA game all in one week. This type of trap question helps to identify respondents who are lying in order to qualify for a survey, which can be more of an issue for surveys that supplement their data with online panels.
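
One way to score this trap is sketched below. The activity list reuses the three examples from the text and is only illustrative; a real question would likely offer more options. The function name and the `max_allowed` parameter are assumptions made for this example.

```python
from typing import Iterable, Set

# Illustrative list built from the examples in the text.
RARE_ACTIVITIES: Set[str] = {
    "surfed",
    "watched an Indian cricket match on TV",
    "attended a WNBA game",
}


def fails_rare_activity_trap(
    selected: Iterable[str],
    rare: Set[str] = RARE_ACTIVITIES,
    max_allowed: int = 2,
) -> bool:
    """Fail the respondent if they pick three or more of the rare activities."""
    return len(set(selected) & rare) > max_allowed


print(fails_rare_activity_trap(["surfed"]))       # False
print(fails_rare_activity_trap(RARE_ACTIVITIES))  # True
```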

Although these two types of trap questions are the most efficient, a survey could also include a question that specifically tells the respondent which answer to select for quality control purposes. However, this type of control leads to a higher rate of incorrectly flagging honest responses as potentially inaccurate [3].

Analysis of Response Data
Once the data from a survey has been collected, analysis of response times and open-ended answers can serve as two quality control checks. A respondent who doesn’t answer any open-ended questions or enters nonsense answers (e.g. “xxxxx”) may be untrustworthy and could be deleted from the data set.
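
A first-pass screen for open-ended answers could look like the sketch below. The heuristic is an assumption chosen for illustration: it only catches blank answers and a single repeated character such as “xxxxx”, so a manual read of the flagged and unflagged answers is still worthwhile.

```python
import re
from typing import List

# One character repeated three or more times, e.g. "xxxxx" or "....".
_REPEATED_CHAR = re.compile(r"^(.)\1{2,}$")


def looks_like_nonsense(text: str) -> bool:
    """Return True for answers made of a single repeated character."""
    return bool(_REPEATED_CHAR.match(text.strip().lower()))


def fails_open_ended_check(answers: List[str]) -> bool:
    """Fail if every open-ended answer is blank, or if any answer is nonsense.

    Assumes the survey has at least one open-ended question; an empty list
    is treated as "answered nothing".
    """
    non_blank = [a for a in answers if a.strip()]
    if not non_blank:
        return True
    return any(looks_like_nonsense(a) for a in non_blank)


print(fails_open_ended_check(["", "  "]))                    # True
print(fails_open_ended_check(["xxxxx", "Great game"]))       # True
print(fails_open_ended_check(["I liked the halftime show"])) # False
```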

The same goes for respondents who take the survey in an unusually short time. In this case, consider deleting any responses with completion times that are less than half of the median time, or that fall in the shortest 5% of completion times. Keep in mind that skip logic can affect the response time, so removing “speeders” may not be the best method for all surveys.
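
Both speeder cutoffs can be computed with the standard library, as in the sketch below. The respondent IDs and completion times (in seconds) are made up for illustration, and the function names are assumptions. Ties at the 5% boundary are broken arbitrarily by sort order here, which may or may not suit a given survey.

```python
from statistics import median
from typing import Dict, Set


def speeders_by_median(times: Dict[str, float], fraction: float = 0.5) -> Set[str]:
    """Flag respondents whose time is below `fraction` of the median time."""
    cutoff = fraction * median(times.values())
    return {rid for rid, seconds in times.items() if seconds < cutoff}


def speeders_by_rank(times: Dict[str, float], share_percent: int = 5) -> Set[str]:
    """Flag the fastest `share_percent` percent of respondents (rounded down)."""
    n_flag = len(times) * share_percent // 100
    fastest_first = sorted(times, key=lambda rid: times[rid])
    return set(fastest_first[:n_flag])


# Hypothetical data: 19 typical respondents plus one very fast one.
times = {f"r{i}": 300 + 10 * i for i in range(19)}
times["fast"] = 60

print(speeders_by_median(times))  # {'fast'}
print(speeders_by_rank(times))    # {'fast'}
```

With very small samples the rank-based rule flags nobody (5% of 10 respondents rounds down to zero), so the median-based rule may be the more practical of the two in that case.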

Another quality control is to check for straight-lining, in which a respondent selects the same answer for every row in a matrix question, but this method is not quite as effective as the other checks.
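
Detecting straight-lining is simple to sketch: a matrix question is straight-lined when every row has the same answer. The function name and the numeric answer coding below are assumptions for illustration.

```python
from typing import Sequence


def is_straight_lining(row_answers: Sequence[int]) -> bool:
    """Return True when a multi-row matrix question has one answer repeated in every row."""
    return len(row_answers) > 1 and len(set(row_answers)) == 1


print(is_straight_lining([4, 4, 4, 4]))  # True
print(is_straight_lining([4, 2, 5, 4]))  # False
```

A single-row matrix is never flagged here, since one answer cannot show a pattern.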

Implementing a full data quality control system for survey data can seem like overkill, but even employing one or two methods can improve data quality and strengthen the conclusions drawn from your data.