What are missing data?

Missing data are data that a study planned to collect about its participants but that are not available. Studies do their best to prevent and minimise the problem, but missing data are still common.

In clinical trials, missing data might arise in different ways. For example:

  • Study participants might not attend all scheduled visits, so their health outcome data are not available for some visits.
  • Participants might not complete all fields of a case record form, so some data on participant characteristics are not available.
  • Participants might become uncontactable at some point during the trial and so be lost to follow-up.

Why do missing data matter?

Missing data can substantially reduce the reliability and interpretability of results from trials. Having high levels of missing data can also be an indicator of poor trial conduct, affecting the integrity of the trials.

In randomised clinical trials, certain types of missing data can undermine the advantages of randomisation. For example, if a treatment is effective but people whose health is worst are more likely to withdraw from the study, the treatment may appear less effective than it really is. However, because we do not see the missing data, we cannot tell whether those who withdrew had worse outcomes or not. This poses a challenge for statistical analysis.
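
The withdrawal example above can be illustrated with a small simulation. This is a minimal sketch, not a method from our trials: the function name `simulate_trial`, the normally distributed health scores, the true effect of 1.0, and the withdrawal probabilities (0.8 for participants with a score below 0, 0.1 otherwise) are all assumptions chosen only for illustration.

```python
import random
from statistics import mean


def simulate_trial(n_per_arm=20000, true_effect=1.0, seed=2024):
    """Return (full_data_estimate, complete_case_estimate) of the treatment effect."""
    rng = random.Random(seed)

    def arm(shift):
        # Health score, higher is better (illustrative normal outcomes).
        return [rng.gauss(shift, 1.0) for _ in range(n_per_arm)]

    def observed(outcomes):
        # Hypothetical withdrawal rule: participants in poor health
        # (score < 0) withdraw with probability 0.8, others with 0.1.
        kept = []
        for y in outcomes:
            p_withdraw = 0.8 if y < 0 else 0.1
            if rng.random() >= p_withdraw:
                kept.append(y)
        return kept

    control = arm(0.0)
    treated = arm(true_effect)

    # Difference in mean outcome if every participant had been observed.
    full = mean(treated) - mean(control)
    # Difference in mean outcome using only participants who stayed in the study.
    complete_case = mean(observed(treated)) - mean(observed(control))
    return full, complete_case


full, complete_case = simulate_trial()
print(f"Estimate with all outcomes: {full:.2f}")
print(f"Estimate after withdrawals: {complete_case:.2f}")
```

Under these assumptions, more people withdraw from the control arm because more of them are in poor health, so the participants who remain in that arm look healthier than the arm as a whole. The comparison based only on observed outcomes therefore understates the treatment effect. In a real trial the withdrawal rule is unknown, which is why the analysis has to rely on assumptions about the missing data.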

Our work on missing data

We have a long-standing history of developing and implementing methods for handling missing data in clinical trials as well as observational studies.

We have:

  • Worked with key stakeholders, including patient and public research partners and clinicians, to co-develop guidelines on how to reduce, handle, and report missing data in trials.
  • Contributed to a collaborative project on developing principles for handling end of participation events in trials (PeRSEVERE).
  • Provided software for different implementations of multiple imputation of missing data (ice and mimix in Stata; jomo in R) and published tutorials on how to use the method in different settings.
  • Published research articles and book chapters on missing data methodology, motivated by real issues faced in our trials.

In 2023, members of our team co-authored the 2nd edition of the book ‘Multiple imputation and its application’.

We also run a yearly short course on multiple imputation that is popular among trial statisticians, epidemiologists, and PhD students.



Publications: Tutorials and guidelines