How to improve your data quality

Last week we published a blog post detailing how Prolific verifies that our participants aren’t bots and that individuals aren't able to open multiple accounts. Thing is, when it comes to participant data, “not being a bot” is about the lowest bar you can set. This week, we want to say a bit more on the topic of data quality and give you some practical tips for minimising the amount of poor quality data in your dataset. We’ll also discuss some of the internal measures we have in place to catch participants who are dishonest.

Where does poor data come from?

Online data collection is revolutionising the way we do research, yet it brings new risks around data quality. In the lab we can meet our participants face-to-face and monitor them while they complete the study. Online, we cannot be so sure that our participants are human, are who they say they are, are completing the study properly, or are even paying attention.

Bots, liars, cheats and slackers…

At Prolific, we think of malicious participants as falling into roughly four groups: bots, liars, cheats and slackers.

Bots are autonomous or semi-autonomous pieces of software designed to complete online surveys with very little human intervention. They are often distinguished by random, very low-effort, or nonsensical free-text responses. Thankfully, because their answers are so obviously non-human, there are several methods for detecting bots in your data (see Dupuis et al., 2018). Unfortunately, bad-acting humans can prove trickier to detect...

Liars (or, to use the technical term, malingerers) submit false prescreening information in order to gain access to the largest number of studies possible, and consequently maximise their earning potential. The impact of liars on your data quality depends on two things:

  1. The importance of prescreeners to your study design: For example, in a study comparing male and female respondents, if some of each group aren’t who they claim to be, then this could invalidate the comparison and render the data unusable. In contrast, if sex is not a factor in your study design then it is irrelevant whether a respondent lied about their sex or not.

  2. The nicheness of your sample: Impostors are more common when recruiting rare populations (Chandler & Paolacci, 2017). This is because competition for study places on Prolific is fierce, and studies with broad eligibility fill up fast. Accordingly, a participant looking to maximise their earnings may try to gain access to studies that fill more slowly by claiming to be a member of one (or many) niche demographics.

The third group are cheats: participants who deliberately submit false information within your study. Importantly, cheats are not always intending to be dishonest: some may genuinely be confused about the data you’re trying to collect, or about whether they’ll get paid even if they don’t do “well”. They might fear your study’s rewards are tied to their performance (e.g., believing they will only be paid if they score 100% on a test, so they google the correct answers). Alternatively, they might think you only want a certain kind of response (and therefore always give very positive or enthusiastic answers), or use aids (pen and paper) to perform much better than they otherwise would. The final kind of cheat is the participant who doesn’t take your survey seriously, perhaps completing it with their friends, or while drunk. To clarify the distinction: liars provide false demographic information to gain access to your study, whereas cheats provide false information within the study itself. A participant can be both a liar and a cheat, but their effects on data quality are different.

The fourth group are slackers. These participants are inattentive and typically aren’t focused on maximising their earnings. Rather, they are simply unmotivated to provide any genuine data for the price you’re paying. Slackers encompass a broad group: from participants that don’t read instructions properly, to participants that are completing your study while watching TV. They may input random answers, gibberish or low-effort free text. Importantly, slackers are not always dishonest: some may just consider the survey reward too low to be worth their full attention.

It’s worth highlighting that these groups are not independent. A liar can use bots, slackers can cheat, etc. In fact, it’s likely there’s a lot of overlap, because most bad actors don’t really care what methods they use to earn rewards, so long as they’re maximising their income!

So, what can you do about it?

Boosting your data quality

We at Prolific have banned our fair share of malicious accounts, and we’ve learned a thing or two along the way. The list below is not exhaustive, but provides some practical advice for designing your study and screening your data that will boost your confidence in the responses you collect. If this seems overwhelming to you, then don’t worry! We are doing a lot of work on our side to improve the quality of the pool, and ultimately you shouldn’t encounter many untrustworthy participants.

Busting the bots

  1. Include a captcha at the start of your survey to prevent bots from even submitting answers. Equally, if your study involves an unusual interactive task (such as a cognitive task or a reaction-time task), then bots should be unable to complete it convincingly.
  2. Include open-ended questions in your study (e.g., “What did you think of this study?”). Check your data for low-effort and nonsensical answers to these questions. Typical bot answers are incoherent, and you may see the same words being used in several submissions (see this blog post for more information and examples; a minimal screening sketch follows this list).
  3. Check your data for random answering patterns. There are several techniques for this, such as response coherence indices or long-string analysis (see Dupuis et al., 2018).
  4. If you’re looking for a simpler solution: try including a few duplicate questions at different points in the study. A human respondent will provide consistent answers, whereas a bot answering randomly is unlikely to give the same answer twice.
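
If you collect responses as a table (one row per submission), a few lines of analysis code can automate this kind of screening. The sketch below is a minimal, illustrative example in Python/pandas, not a definitive detector: the column name open_feedback and the thresholds are assumptions you would replace with your own.

```python
# Flag open-ended answers that look bot-like: very short responses, or
# identical free text shared across several submissions.
# "open_feedback" is a hypothetical column name; adjust it to your export.
import pandas as pd

def flag_suspect_free_text(df: pd.DataFrame, column: str,
                           min_words: int = 3, max_shared: int = 2) -> pd.Series:
    """Return True for rows whose free-text answer looks bot-like."""
    text = df[column].fillna("").astype(str).str.strip().str.lower()
    too_short = text.str.split().str.len() < min_words       # one- or two-word answers
    shared = text.map(text.value_counts()) > max_shared      # same answer reused by many accounts
    return too_short | shared

# Example usage with made-up file and column names:
# responses = pd.read_csv("responses.csv")
# print(responses[flag_suspect_free_text(responses, "open_feedback")])
```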

As we’ve already said, we have technological measures in place to prevent bots, so you’re extremely unlikely to find any in your dataset.

Lassoing the liars

  1. Do not reveal in your study title or description what participant demographics you are looking for – it may give malingerers the information they need to lie their way into your study.
  2. Re-ask your prescreening questions at the beginning of your study (and at the end too, if it’s not burdensome). This allows you to confirm that your participants’ prescreening answers are still current and valid, and may reveal liars who have forgotten their original answers (a simple cross-checking sketch follows this list).
  3. Ask questions relating to your prescreeners that are difficult to answer unless the participant is being truthful. For example, if you need participants taking anti-depressants, ask them the name of their medication and the dosage. In most cases it would be possible for a liar to look up an answer to these questions… but in our experience, most liars can’t be bothered to invest the extra effort!
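
If you re-ask prescreening questions (tip 2), a short cross-check of the answers can surface likely liars automatically. The sketch below is a rough illustration and assumes you have merged your in-study responses with the prescreening answers exported from Prolific; the column names ("prescreen_sex", "survey_sex") are hypothetical placeholders.

```python
# Cross-check re-asked prescreener answers against the answers given on the
# platform. Normalisation (lower-casing, trimming) is deliberately simple;
# adapt it to your actual questions and response formats.
import pandas as pd

def flag_prescreen_mismatches(df: pd.DataFrame, pairs: dict[str, str]) -> pd.DataFrame:
    """Return rows where any prescreener answer disagrees with the in-study answer."""
    mismatch = pd.Series(False, index=df.index)
    for prescreen_col, survey_col in pairs.items():
        left = df[prescreen_col].astype(str).str.strip().str.lower()
        right = df[survey_col].astype(str).str.strip().str.lower()
        mismatch |= left != right
    return df[mismatch]

# Example usage with made-up column names:
# data = pd.read_csv("merged_responses.csv")
# print(flag_prescreen_mismatches(data, {"prescreen_sex": "survey_sex"}))
```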

Again, we constantly analyse the answer sets of our participants looking for unusual combinations, impossible answers and other tell-tale signs of malingering.

Cracking down on cheats

  1. Use speeded tests or questionnaires to prevent participants from having time to google for answers.
  2. Ask participants a few questions clarifying the instructions of the task at the end of the experiment (to check they understood the task properly and didn’t cheat inadvertently).
  3. Develop precise data-screening criteria to classify unusual behaviour (a minimal sketch of the first two follows this list). These will be specific to your experiment, but may include:
     - Variable cutoffs based on the inter-quartile range.
     - Fixed cutoffs based on ‘reasonable responses’ (e.g., consistent reaction times faster than 150ms, or test scores of 100%).
     - Non-convergence of an underlying response model.
  4. Simple as it seems, it’s been suggested you include a free-text question at the end of your study: “Did you cheat?”
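
Here is a minimal sketch of the first two screening criteria, assuming a per-participant summary table with hypothetical columns median_rt_ms and score_pct. The thresholds (1.5 × IQR, 150ms, 100%) follow the examples above and should be tuned to your own task, and ideally preregistered.

```python
# Variable cutoffs (inter-quartile range) and fixed cutoffs ("reasonable
# responses") for flagging suspicious performance. Column names are assumptions.
import pandas as pd

def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Variable cutoffs: anything outside [Q1 - k*IQR, Q3 + k*IQR] is flagged."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_unreasonable(summary: pd.DataFrame) -> pd.Series:
    """Fixed cutoffs: consistently implausible reaction times or perfect scores."""
    too_fast = summary["median_rt_ms"] < 150   # consistently faster than a plausible human response
    perfect = summary["score_pct"] >= 100      # suspiciously perfect performance
    return too_fast | perfect

# Example usage with hypothetical columns:
# summary = pd.read_csv("participant_summary.csv")
# low, high = iqr_bounds(summary["median_rt_ms"])
# outliers = summary[(summary["median_rt_ms"] < low) | (summary["median_rt_ms"] > high)]
# cheats = summary[flag_unreasonable(summary)]
```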

However you clean your data, we strongly recommend that you preregister your data-screening criteria to increase reviewer confidence that you have not p-hacked.

Sniffing out slackers

  1. Use speeded tasks and questionnaires to prevent participants from having time to be distracted by the TV or the rest of the internet.
  2. Ask participants a few questions clarifying the instructions of the task at the end of the experiment (to check they read them properly).
  3. Collect timing and page-view data:
     - Record the time of page load and timestamp every question answered.
     - Record the number of times the page is hidden or minimised.
     - Monitor the time spent reading instructions.
     Then look for unusual patterns of timing behaviour: who took 3 seconds to read your instructions? Who took 35 minutes to answer your questionnaire, with a 3-minute gap between each question?
  4. Implement attention checks (aka Instructional Manipulation Checks, or IMCs). These are best kept super simple and fair. “Memory tests” are not a good attention check, nor is hiding one errant attention check among a list of otherwise identical instructions!
  5. Include open-ended questions that require more than a single-word answer. Check these for low-effort responses.
  6. Check your data using careless responding measures such as consistency indices or response pattern analysis (see Meade & Craig, 2012, and Dupuis et al., 2018). A minimal long-string analysis sketch follows this list.
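
One of the simplest careless-responding measures mentioned above is long-string analysis: counting the longest run of identical consecutive answers a participant gave across a block of items. The sketch below is a minimal version; the item column names and the cutoff of 10 are assumptions for illustration, not recommendations from the literature.

```python
# Long-string analysis: a long streak of identical answers across many Likert
# items suggests a participant clicking the same option without reading.
import pandas as pd

def longest_identical_run(row: pd.Series) -> int:
    """Length of the longest streak of identical consecutive answers in a row."""
    answers = row.tolist()
    longest = current = 1
    for previous, answer in zip(answers, answers[1:]):
        current = current + 1 if answer == previous else 1
        longest = max(longest, current)
    return longest

# Example usage with hypothetical item columns and cutoff:
# responses = pd.read_csv("responses.csv")
# items = responses[[f"item_{i}" for i in range(1, 21)]]
# responses["long_string"] = items.apply(longest_identical_run, axis=1)
# flagged = responses[responses["long_string"] >= 10]   # e.g. 10+ identical answers in a row
```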

Getting the best out of the good guys

One of the most important factors in determining data quality is the study’s reward. A recent study of Mechanical Turk participants concluded that fair pay and realistic completion times had a large impact on the quality of data they were willing to provide. On Prolific, it’s vital that trust goes both ways, and properly rewarding participants for their time is a large part of that. We enforce a minimum hourly reward of 5.00 GBP, but depending on the effort required by your study, this may not be sufficient to foster high levels of engagement and good data quality. Consider the following (a quick per-submission reward calculation is sketched after this list):

  1. The participant reimbursement guidelines of your institution. Some universities have set a minimum and maximum hourly rate (to avoid undue coercion). You might also consider the national minimum wage as a guideline (in the UK, this is currently £7.83 for adults).
  2. The amount of effort required to take part in your study: is it a simple online study, or do participants need to make a video recording or complete a particularly arduous task? If your study is effortful, consider paying more.
  3. How niche your population is: if you’re searching for particularly unusual participants (or participants in well-paid jobs), then you will find it easier to recruit these participants if you are paying well for their time.
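
To turn these considerations into a concrete reward, a quick calculation converts a target hourly rate and your estimated completion time into a per-submission amount. The sketch below simply applies the figures mentioned above (Prolific’s £5.00/hour minimum, the UK minimum wage of £7.83); pick whichever guideline applies to you.

```python
# Convert a target hourly rate and estimated completion time into a
# per-submission reward, rounded to the nearest penny.
def reward_per_submission(estimated_minutes: float, hourly_rate_gbp: float) -> float:
    """Reward (GBP) for one submission at the given hourly rate."""
    return round(hourly_rate_gbp * estimated_minutes / 60, 2)

# Example: a 20-minute study paid at the UK minimum wage
# print(reward_per_submission(20, 7.83))   # -> 2.61
```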

That said, paying more isn’t always a good idea! Consider that:

  1. Studies with particularly high rewards may bias your sample, as participants may feel ‘forced’ to choose that study over others they would otherwise have taken part in. This may particularly apply to participants with low socio-economic status.
  2. Participants sometimes share study information on external websites. If word gets out about a particularly well-paid study with niche inclusion criteria, you may attract liars.
  3. Bonus payments contingent on performance may make participants nervous about being paid, and lead to cheating.

Error-Free, Confusion-Clear and Engaged

While participants are ultimately responsible for the quality of data they provide, you as the researcher need to set them up to do their best.

  1. Pilot, pilot, pilot your study's technology. Run test studies, double and triple check your study URL, ensure your study isn’t password-protected or inaccessible. If participants encounter errors they will, more often than not, plough on and try to do their best regardless. This may result in unusable or missing data. Don’t expect your participants to debug your study for you!
  2. Make sure you use the ‘device compatibility’ flags on the study page if your study requires (or excludes) a specific type of device. Note that currently our device flags do not block participants from entering your study on ineligible devices (detecting devices automatically is somewhat unreliable and may exclude eligible participants). If you need stricter device blocking, we recommend implementing it in your survey/experimental software (a rough sketch follows this list).
  3. Keep your instructions as clear and simple as possible. If you have a lot to say, split it across multiple pages: use bullet points and diagrams to aid understanding. Make sure you explicitly state what a participant is required to do in order to be paid. This will increase the number of participants that actually do what you want them to!
  4. If participants message you with questions, aim to respond quickly and concisely. Be polite and professional (it’s easy to forget when 500 participants are messaging you at once that each one is an individual!). Ultimately participants will respond much better when treated as valuable scientific co-creators. 🙂
  5. If you can, make your study interesting and approachable. Keep it easy on the eye and break long questionnaires down into smaller chunks.
  6. If you can, explain the rationale of your study to your participants. There is evidence that participants are willing to put more effort into a task when its purpose is made clear, and that participants with higher intrinsic motivation towards a task provide higher quality data.
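
On the device-blocking point above: if your survey software lets you run your own checks, a very rough heuristic is to inspect the browser’s User-Agent string. This is only a sketch under the assumption that a simple keyword match is good enough for your study; user agents can be spoofed or ambiguous, and the keyword list and the show_message helper below are hypothetical placeholders.

```python
# Heuristic mobile-device check based on the User-Agent string. Not a
# definitive method: user agents can be spoofed, and some devices are ambiguous.
MOBILE_HINTS = ("mobi", "android", "iphone", "ipad", "ipod")

def looks_like_mobile(user_agent: str) -> bool:
    """Return True if the User-Agent string appears to belong to a mobile device."""
    ua = user_agent.lower()
    return any(hint in ua for hint in MOBILE_HINTS)

# Example usage (request_user_agent and show_message are placeholders for
# whatever your web framework or survey software provides):
# if looks_like_mobile(request_user_agent):
#     show_message("Sorry, this study requires a desktop browser.")
```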

What should you do if you think your data quality is compromised?

Firstly, talk to us. We will always ban participants using bots or lying in their prescreeners. You should reject submissions where you believe this to have occurred, and send us any evidence you have gathered. We’re on high alert right now in light of recent data quality issues on MTurk, and data quality is our top priority, so please reach out to us if you have any concerns, queries or suggestions.

In cases of cheating or slacking, we ask that you give participants some initial leeway. If they’ve clearly made some effort or attempted to engage with the task for a significant period of time but their data is not of sufficient quality, then consider approving them, but excluding them from your analysis. If the participant has clearly made little effort, failed multiple attention checks or has lied their way into your study, then rejection is appropriate. Please read our article on valid and invalid rejection reasons for more guidance.

Finally, if you found this blog post helpful, then watch this space over the next few months for more advice on how to make the most out of your research on Prolific.
