The approach to handling missing data in clinical trials has evolved over the past twenty years, particularly regarding methods for incorporating incomplete data to produce more comprehensive results. Issues surrounding missing data are of particular importance due to the risks of introducing bias, losing statistical power (creating inefficiencies) and inflating the Type I error rate (false positives).
The International Conference on Harmonisation (ICH) E9 guideline (1998) addresses the complexities of missing data and acknowledges that there is no ‘gold standard’ for handling it, owing to unique study designs and varying measurement characteristics. The guideline recommends that sensitivity analyses and the handling of missing data be pre-defined in the protocol, and that reasons for study withdrawal be recorded, to protect the robustness of the trial against missing data over its course.
The complexities surrounding missing data in clinical trials have been highlighted in ICH guidelines in recent years. An example is the R1 Addendum to E9 (2017), which sets out the precise definition of estimands and the handling of intercurrent events, both of which greatly affect drug submissions to regulators. The addendum highlights that missing data challenges and the proposed methodologies must be understood and addressed according to the chosen estimand at the time of trial design, to protect the validity of the results obtained from the trial.
Furthermore, the approach taken to handle missing data should reflect the estimand, which in turn depends on the study objective, as per the definition above. Statisticians can provide input into trial design and conduct, including patient retention strategies and the overall analysis, to help prevent missing data in clinical trials. A deeper understanding of these factors will in turn improve the design of future studies. Additionally, the use of historical data through Bayesian study designs can help identify patterns in missing data and suggest plausible approaches.
For the repeated measures analyses commonly used in clinical trials, many techniques for populating missing values can be defined at the time of writing the Statistical Analysis Plan (SAP). Examples include:
In clinical research, the Last Observation Carried Forward (LOCF) technique is often employed to populate missing values. This technique is simple to perform with little chance of error, but it has been criticised for painting an incomplete picture of each patient's safety and efficacy profile, which can make the final results ambiguous.
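LOCF amounts to carrying each patient's most recent observed value forward into any later missing visits. A minimal sketch in Python, with made-up lesion counts purely for illustration:

```python
# Minimal sketch of Last Observation Carried Forward (LOCF).
# The visit values and the use of None for missing data are illustrative.

def locf(values):
    """Replace each missing value (None) with the last observed value."""
    filled, last = [], None
    for v in values:
        if v is None:
            v = last          # carry the last observation forward
        else:
            last = v          # remember the most recent observed value
        filled.append(v)
    return filled

# One patient's lesion counts at Weeks 2, 4, 6 and 12; the patient
# withdrew after Week 6, so the Week 6 value is carried to Week 12.
print(locf([34, 28, 21, None]))  # -> [34, 28, 21, 21]
```

The simplicity is clear, but so is the criticism: the carried value pretends the patient's response stopped changing at withdrawal.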
A more promising but complex approach for handling missing data which has been suggested to produce more reliable results, therefore enhancing trial validity, is Multiple Imputation.
Multiple imputation is a statistical procedure for handling missing data in clinical research, with the aim of reducing the bias and complications associated with missing data. It involves the creation of multiple datasets in which the missing values are imputed with realistic values derived from the non-missing data. Because the imputed values are drawn randomly from a distribution, the uncertainty about what the true value might be is carried through into the analysis rather than ignored.
Rubin (1987) developed a method for multiple imputation whereby multiple datasets are created and individually analysed using standard statistical methods, after which the results are combined to produce a final estimate and confidence interval. Analyses based on multiple imputation should produce a result which reflects the true population estimate while adjusting for the uncertainty introduced by the missing data.
Rubin suggests the following method: create m imputed datasets, analyse each with the standard analysis, then pool the m estimates and their standard errors using his combining rules.
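Rubin's combining rules can be sketched in a few lines of Python. The pooled point estimate is the mean of the per-dataset estimates, and the total variance adds the average within-imputation variance W to an inflated between-imputation variance B. The estimates and variances below are invented for illustration:

```python
# Sketch of Rubin's rules for combining m imputed-dataset analyses.
# Each analysis yields an estimate (e.g. a treatment difference) and
# its squared standard error; the numbers below are made up.

from statistics import mean, variance

def pool(estimates, variances):
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    w = mean(variances)                # average within-imputation variance (W)
    b = variance(estimates)            # between-imputation variance (B)
    t = w + (1 + 1 / m) * b            # total variance (Rubin, 1987)
    return q_bar, t

est, total_var = pool([2.1, 2.4, 1.9, 2.2, 2.0],
                      [0.25, 0.30, 0.28, 0.26, 0.27])
se = total_var ** 0.5                  # pooled standard error
```

The (1 + 1/m) factor inflates B to account for using a finite number of imputations; it is this extra component that makes MI standard errors honest about the missingness.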
Standard Multiple Imputation performs the imputations such that the results for a patient with missing data trend towards the mean of their treatment group, due to the weakening of the within-subject correlation. This also results in an increase in variance as time progresses, as is expected in clinical trials.
The realisation that subjects who withdraw are no longer on randomised treatment led to developments that allow imputation based on a clinically plausible post-withdrawal path. One of these is Multiple Imputation under the Copy Reference (CR) assumption, in which post-withdrawal data are modelled as if the subject were a member of the reference group, so the outcome tends towards the mean of the reference group.
In a case study examined to investigate Multiple Imputation in clinical research, active and placebo treatments were compared (at Weeks 2, 4, 6 and 12 of the trial) in adolescents with acne. The primary endpoint was the number of lesions at Week 12. In this study, drop-outs and withdrawals were common. Factors believed to affect the propensity to have missing data were identified as age, experiencing side effects, and experiencing a lack of efficacy. From this information, it was expected that missing data patterns would differ between the groups.
It is common for datasets of this type to be analysed using an Analysis of Covariance (ANCOVA) of last observation carried forward data. MI methods can be programmed using PROC MI in SAS (Version 9.3), offering an alternative way to deal with missing data.
Below, we explore the Multiple Imputation process, compare results with LOCF ANCOVA and Mixed Models Repeated Measures (MMRM) methodologies and ask: is Multiple Imputation worth the effort?
A simulation of 1000 datasets was carried out by removing data randomly from a completer dataset (N=131) using propensity scores based on the pattern of missing data observed in the full dataset (N=153). Least Squares (LS) means and differences were estimated, along with the Standard Error (SE). Boxplots present the bias and relative SE from Multiple Imputation compared to LOCF ANCOVA and an MMRM approach without imputation of data; these are relative to the ANCOVA on the completer dataset.
The least biased of the several MI methods tested was Predictive Mean Matching (PMM), which imputes values by sampling from the k observed data points closest to a regression-predicted value, where the regression parameters are sampled from a posterior distribution. The total variance of the combined ANCOVA results (see Figure 1) is calculated from the average within-imputation variance (W) and the between-imputation variance (B). [1], [2]
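The donor-matching idea behind PMM can be sketched in plain Python. This is a simplified single-predictor version: the posterior draw of the regression parameters that full PMM uses is omitted, and all data and the function name are illustrative:

```python
import random

# Simplified sketch of Predictive Mean Matching (PMM) for a single
# variable with one predictor, using ordinary least squares. A full
# implementation would draw the regression parameters from their
# posterior distribution before each imputation.

def pmm_impute(x_obs, y_obs, x_mis, k=5, rng=random.Random(1)):
    # Fit y = a + b*x by least squares on the observed cases.
    n = len(x_obs)
    mx, my = sum(x_obs) / n, sum(y_obs) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(x_obs, y_obs))
         / sum((x - mx) ** 2 for x in x_obs))
    a = my - b * mx
    pred_obs = [a + b * x for x in x_obs]

    imputed = []
    for x in x_mis:
        p = a + b * x
        # Find the k observed cases whose predicted values are closest
        # to this case's prediction...
        donors = sorted(range(n), key=lambda i: abs(pred_obs[i] - p))[:k]
        # ...and sample an *observed* value from among them, so the
        # imputation is always a previously seen, plausible value.
        imputed.append(y_obs[rng.choice(donors)])
    return imputed
```

Because each imputed value is an actually observed value, PMM never produces negative or fractional lesion counts, which is exactly the "no bounds, rounding or post-imputation manipulation" advantage discussed below.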
Figure 1: Flow chart of Multiple Imputation Process
MMRM was identified as the least biased and LOCF the most biased of the three methods (Figure 2). Relative SEs were demonstrated to be greatest for PMM (Figure 3).
Figure 2: Bias in LS Means of Estimate
Figure 3: Relative standard error of difference in treatment means
Both figures show the distribution from 1000 simulations (data were removed randomly based on propensity scores; the propensity model included age, side effect of pain after treatment and efficacy measured by lesion counts). Bias and relative standard errors are relative to the completer dataset.
The Food and Drug Administration (FDA) has been critical of the use of LOCF in Phase 3 clinical trials, as this method assumes no trend in response over time, resulting in bias and a distorted covariance structure. All methods in PROC MI, and MMRM, assume that data are Missing at Random (MAR). However, PROC MI has useful functionality for summarising missing data patterns.
Multiple Imputation can be complex to define a priori, as there are many details to consider (see Figure 1) and additional data processing steps are necessary. The PMM method has the greatest advantage over alternative MI methods in that no bounds, rounding or post-imputation manipulation are required to obtain plausible imputed lesion counts. Sensitivity analyses can investigate a range of delta (δ) values added to imputed values to explore the robustness of conclusions to the imputation.
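The delta-adjustment idea is mechanically simple: shift only the imputed values by δ, re-run the analysis, and see whether the conclusion survives. A sketch with invented lesion counts and flags:

```python
# Sketch of a delta-adjustment sensitivity analysis: each imputed
# value is shifted by delta before re-running the analysis, to probe
# how strongly conclusions depend on the imputation assumptions.
# The lesion counts and imputation flags below are illustrative.

def apply_delta(values, imputed_flags, delta):
    return [v + delta if was_imputed else v
            for v, was_imputed in zip(values, imputed_flags)]

week12 = [21, 18, 25, 30]             # Week 12 lesion counts
imputed = [False, True, False, True]  # which values came from imputation
for delta in (0, 2, 4):               # worsen imputed outcomes stepwise
    adjusted = apply_delta(week12, imputed, delta)
    # ...re-run the ANCOVA on `adjusted` and record the treatment estimate
```

If the treatment effect remains significant as δ grows, the conclusion is robust to pessimistic assumptions about the missing outcomes.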
Relative SEs were generally greater than 1 for all methods, which is to be expected given the loss of approximately 15% of data from the completer dataset by using the propensity scores in the simulation of missing values. The SEs from MI techniques incorporate an additional component (B) to account for the uncertainty in the imputation, whereas LOCF ignores this uncertainty. However, the resulting SE from MI is appreciably larger than that from MMRM, and thus this Multiple Imputation method has less statistical power.
Multiple Imputation is complex to define and computationally intensive, so it requires substantial benefits to be worth the effort of including in the primary analysis. PMM was found to have less power than MMRM without reducing bias.
We therefore recommend MMRM as the primary analysis, with PROC MI used to investigate sensitivity (the delta method), and we recommend avoiding LOCF. Further work could investigate scenarios such as data that are not MAR, varying k, and whether the default burn-in of 20 iterations in PMM is sufficient.
In terms of implementation, a number of resources are available, including the MICE (Multiple Imputation by Chained Equations) R package and missingdata.org, which offers SAS macros, test datasets and imputation techniques for a wide variety of endpoints from the Drug Information Association Scientific Working Group (DIA SWG). For a more in-depth academic resource, the 2010 book “The Prevention and Treatment of Missing Data in Clinical Trials” has been made freely available by its publishers.
Quanticate's statistical consultants are among the leaders in their respective areas, enabling clients to choose expertise from a range of consultants to match their needs. If you have a need for biostatistical consultancy, please submit an RFI to speak to a member of our team.
References
[1] SAS/STAT(R) 12.1 User's Guide, "The MIANALYZE Procedure, Combining Inferences from Imputed Data Sets," [Online]. Available: http://support.sas.com/documentation/cdl/en/statug/65328/HTML/default/viewer.htm#statug_mianalyze_details08.htm.
[2] D. Rubin, Multiple Imputation for Nonresponse in Surveys, New York: John Wiley & Sons, 1987.