A Comparison of Statistical Methods for Time-To-Event Analyses in Randomized Controlled Trials Under Non-Proportional Hazards.

Journal: Statistics in Medicine
Published:
Abstract

While well-established methods for time-to-event data are available when the proportional hazards assumption holds, there is no consensus on the best inferential approach under non-proportional hazards (NPH), although a wide range of parametric and non-parametric methods for testing and estimation in this setting have been proposed. To provide recommendations on the statistical analysis of clinical trials where NPH are expected, we conducted a simulation study under different NPH scenarios, including delayed onset of treatment effect, crossing hazard curves, subgroups with different treatment effects, and changing hazards after disease progression. We assessed type I error rate control, power, and, where applicable, confidence interval coverage for a wide range of methods, including weighted log-rank tests, the MaxCombo test, summary measures such as the restricted mean survival time (RMST), average hazard ratios, and milestone survival probabilities, as well as accelerated failure time regression models. We found a trade-off between interpretability and power when choosing an analysis strategy under NPH scenarios. While analysis methods based on weighted log-rank tests were typically favorable in terms of power, they do not provide an easily interpretable treatment effect estimate. Also, depending on the weight function, they test a narrow null hypothesis of equal hazard functions, and rejection of this null hypothesis may not allow for a direct conclusion of treatment benefit in terms of the survival function. In contrast, non-parametric procedures based on well-interpretable measures such as the RMST difference had lower power in most scenarios. Model-based methods that assume specific survival distributions had larger power but often gave biased estimates and lower-than-nominal confidence interval coverage.
The application of the studied methods is illustrated in a case study with reconstructed data from a phase III oncologic trial.
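
To make the RMST difference mentioned above concrete, the following is a minimal sketch and not the implementation used in the study. It assumes the standard definition of the RMST as the area under the Kaplan-Meier survival curve up to a truncation time tau. The function names (kaplan_meier, rmst), the choice of tau, and the toy data are hypothetical and chosen only for illustration.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier step function as a list of (time, survival).

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if the time is censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        # Events and censored subjects both leave the risk set.
        n_at_risk -= removed
    return steps


def rmst(times, events, tau):
    """Area under the Kaplan-Meier curve on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last (possibly partial) interval up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical toy data, not taken from any trial.
control_times, control_events = [1, 2, 3, 4], [1, 1, 1, 1]
treated_times, treated_events = [2, 3, 5, 6], [1, 0, 1, 1]
tau = 4.0  # assumed truncation time

rmst_difference = (rmst(treated_times, treated_events, tau)
                   - rmst(control_times, control_events, tau))
```

The resulting difference is read as the average event-free time gained up to tau, which is the kind of interpretability the abstract contrasts with weighted log-rank tests. Inference (standard errors, confidence intervals) is deliberately left out of this sketch.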