'25 at 25': Software to help people implement our methods

13 Jun 2024

The MRC Clinical Trials Unit at UCL has a proud history of methodology research, which aims to find new solutions to challenges faced in clinical research. Over the past 25 years, we have pioneered novel clinical trial designs, overcome challenges to the statistical analysis of trial data, and found new ways to improve the quality of meta-analyses.

We want clinical trials and studies within and beyond the MRC CTU at UCL to benefit from our methodology research. To encourage the wider research community to implement our new methods, we make statistical code openly available to use in software packages like Stata and R.

nstage

One of the Unit’s biggest methodological achievements was the development of the multi-arm multi-stage (MAMS) trial design. Unlike traditional designs, MAMS trials allow researchers to test multiple new treatments at the same time. Only those treatments which show sufficient promise at the end of each stage will progress to the next stage, and additional arms can be added to the trial as new treatments become available.

Our researchers developed the nstage software package to help researchers calculate the required sample size for MAMS trials testing time-to-event outcome measures (i.e. how long it takes for an event, such as death or disease progression, to occur).

A paper describing the software and statistical methodology was published in 2011 in the journal Trials. The package is available for Stata.

Using details of the trial design and operating characteristics supplied by the user, nstage calculates the required sample size for each stage of the trial. It also tells the user how many events need to occur in the control arm to trigger an analysis of the data from that stage.

More recent updates to nstage include the ability to calculate a single, headline sample size that meets the desired operating characteristics for the full MAMS trial. This is particularly helpful for triallists at the planning stage and when applying for funding.

The team have also expanded the nstage suite to include nstagebin, which calculates the required sample size of MAMS trials with binary outcomes instead of time-to-event outcomes, and nstagebinopt, which finds the most efficient MAMS designs.

In total, researchers have downloaded nstage packages almost 9,000 times, with a current rate of 150 downloads per month.

ART

For other trial designs besides MAMS, the ART (Analysis of Resources for Trials) software can compute sample size or statistical power for either time-to-event (artsurv) or binary (artbin) outcome measures.

ART is very flexible, and was designed to account for the complexities encountered in the real world. The package allows for staggered participant entry, loss of participants to follow-up, and changes in treatment, including crossover between different trial arms. It can use a variety of statistical tests and also supports non-inferiority designs, where the trial aims to find out if an experimental treatment is not unacceptably less effective than the control.
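To give a feel for the kind of calculation involved, here is a minimal Python sketch of the textbook sample size formula for the simplest possible case: a two-arm superiority trial with a binary outcome, equal allocation, a two-sided test and none of the complexities listed above. It is not ART's algorithm and does not reproduce artbin's options; the function name and its parameters are hypothetical and chosen purely for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_experimental, alpha=0.05, power=0.80):
    """Approximate participants needed per arm to compare two proportions.

    Uses the standard normal-approximation formula with unpooled variances.
    Assumes equal allocation, a two-sided test, no loss to follow-up and
    no crossover (illustrative simplifications, not ART's method).
    """
    if p_control == p_experimental:
        raise ValueError("the two event probabilities must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance_sum = (p_control * (1 - p_control)
                    + p_experimental * (1 - p_experimental))
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_experimental) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical design: 50% event rate on control, 40% on experimental.
    print(sample_size_per_arm(0.50, 0.40))
```

Real trials rarely meet these assumptions, which is why a dedicated package that allows for staggered entry, loss to follow-up and treatment changes is useful.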

The mathematics underpinning ART were developed in the mid-1990s. The team released the package for Stata in 1997 and since then packages from the ART family have been downloaded nearly 8,500 times. A methods paper followed in 2006, published in Statistics in Medicine and selected as part of a series celebrating the journal’s 25th anniversary.

metan

Meta-analysis is a statistical method which combines results from multiple independent studies that test the same treatment or intervention, aiming to estimate an overall treatment effect.

The metan package allows users to conduct a meta-analysis either from summary binary or continuous data, or from estimates of the treatment effect in each independent study. It was originally released in 1998 by Michael Bradburn, Jon Deeks and Doug Altman, and has since become the main meta-analysis command in Stata contributed by the research community.

In 2018, as the original metan code had not been updated for several years, the MRC CTU at UCL team released an updated version called admetan, which fixed longstanding bugs and introduced new features. These include random effects options, which allow the observed treatment effects to vary across studies because of real differences in the treatment effect in each study, on top of any differences due to sampling. The package also introduced a separate program for drawing forest plots to present meta-analysis results.
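The difference between a fixed-effect and a random-effects analysis can be sketched in a few lines of Python. The example below uses inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which is one common random-effects method; it is an illustration only, not the code used by metan or admetan. The function names and the data are made up.

```python
def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, 1.0 / total


def pool_random(effects, variances):
    """DerSimonian-Laird random-effects estimate, its variance and tau^2."""
    if len(effects) < 2:
        raise ValueError("at least two studies are needed")
    weights = [1.0 / v for v in variances]
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    # Each study's weight now reflects sampling variance plus tau^2.
    re_weights = [1.0 / (v + tau2) for v in variances]
    total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / total
    return pooled, 1.0 / total, tau2


if __name__ == "__main__":
    # Made-up log odds ratios and their variances from four studies.
    log_or = [-0.40, -0.10, -0.65, 0.05]
    var = [0.04, 0.02, 0.09, 0.03]
    print("fixed: ", pool_fixed(log_or, var))
    print("random:", pool_random(log_or, var))
```

When the studies disagree by more than sampling error would explain, the estimated between-study variance is positive, the weights become more equal across studies and the pooled estimate becomes less precise.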

In 2020, with the blessing of the original authors, these features of admetan were merged into the previous code to form metan version 4. In total, admetan plus metan version 4+ have been downloaded over 165,000 times, and metan currently sees around 3,500 downloads per month.

If individual participant data (IPD) are available from each study, a wider range of analysis becomes possible. In particular, the meta-analyst has greater control over data quality and choice of statistical model. “Two-stage” IPD meta-analysis is an approach whereby a statistical model is fitted to the data from each study in turn, and then the results are combined using standard meta-analysis methods for summary data.
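The two stages described above can be sketched in Python as follows. In this deliberately simple illustration, the "model" fitted to each study is just a difference in mean outcome between arms, and the second stage is an inverse-variance fixed-effect combination. The function names and data are hypothetical, and this is not how ipdmetan is implemented.

```python
from statistics import mean, variance


def study_estimate(treated, control):
    """Stage 1: difference in means and its variance for a single study."""
    diff = mean(treated) - mean(control)
    var = variance(treated) / len(treated) + variance(control) / len(control)
    return diff, var


def two_stage(studies):
    """Stage 2: inverse-variance pooling of the per-study estimates.

    `studies` is a list of (treated_outcomes, control_outcomes) pairs.
    Returns the pooled difference and its variance.
    """
    estimates = [study_estimate(t, c) for t, c in studies]
    weights = [1.0 / v for _, v in estimates]
    total = sum(weights)
    pooled = sum(w * d for w, (d, _) in zip(weights, estimates)) / total
    return pooled, 1.0 / total


if __name__ == "__main__":
    # Made-up participant-level outcomes from three small studies.
    studies = [
        ([5.1, 6.0, 5.5, 6.2], [4.8, 5.0, 4.4, 5.1]),
        ([5.9, 6.4, 5.2], [5.0, 5.3, 4.9, 5.6]),
        ([6.1, 5.7, 6.6, 5.8, 6.0], [5.2, 5.5, 4.7, 5.1]),
    ]
    print(two_stage(studies))
```

In practice the first stage could be any model suited to the data, such as a regression adjusted for baseline characteristics, as long as it yields an effect estimate and a standard error for each study.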

In 2015, researchers from the MRC CTU at UCL released the ipdmetan package to implement this approach. It has been downloaded over 75,000 times in total, and currently sees around 1,500 downloads per month.

jomo

A frequent challenge for analysis of any medical data is missing data. This can occur if trial participants do not attend all scheduled visits or do not complete case record forms in full. Research teams may also lose contact with some participants during the trial and so they become lost to follow-up.

Most studies have some missing data, and this can reduce the reliability and interpretability of their results.

Multiple imputation is a tool for replacing missing data with plausible values. Our researchers developed the jomo package for multi-level joint modelling multiple imputation of missing data.

jomo fills in missing values by drawing from the distribution of the missing data given the observed data, generating multiple sets of plausible values. It can handle continuous, binary or categorical data. Its key feature is allowing for multi-level data, such as patients grouped in general practices.
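The core idea of generating several plausible completed datasets can be illustrated with a much simpler stand-in: stochastic regression imputation, repeated several times, for one partially observed variable. The Python sketch below is single-level and ignores uncertainty in the fitted model parameters, so it is not the joint modelling approach that jomo uses. All names and data are made up for illustration.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD of y on x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sigma


def impute(xs, ys, m=5, seed=0):
    """Return m completed copies of ys, with None entries replaced by draws.

    Each missing y is drawn from a normal distribution centred on the value
    predicted from x, with the residual SD estimated from complete cases.
    """
    rng = random.Random(seed)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, s = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [
        [y if y is not None else rng.gauss(a + b * x, s) for x, y in zip(xs, ys)]
        for _ in range(m)
    ]


if __name__ == "__main__":
    # Made-up data: x fully observed, y missing for two participants.
    xs = [1, 2, 3, 4, 5, 6]
    ys = [1.1, 2.3, None, 3.9, None, 6.2]
    for completed in impute(xs, ys, m=3):
        print([round(v, 2) for v in completed])
```

Each completed dataset would then be analysed separately and the results combined, so that the final estimate reflects the uncertainty about the values that were never observed.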

jomo is available for R. Since 2014 it has been a required dependency of a few other R packages, including mice, probably the most popular multiple imputation package. This has led to increasing use and downloads: since 2015, the jomo package has been downloaded almost 2 million times.

Further information: