1 Intro
- Quick intro
- Example R code for common survival analysis methods. Covers three of the most common methods: the log-rank test and Kaplan-Meier plots for comparing survival curves between groups, and the Cox proportional hazards regression model.
- Log-rank test for comparing survival curves between groups
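To make the two non-regression methods above concrete, here is a minimal sketch of the Kaplan-Meier estimator and the two-group log-rank test. It is written in plain Python with no survival library, so it is not the R code the resources cover. The function names and the toy data are made up for illustration.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times  -- follow-up time per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        surv *= 1 - deaths[t] / at_risk            # product-limit step
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    observed_minus_expected, variance = 0.0, 0.0
    for t in sorted({u for u, e in pooled if e == 1}):
        n_a = sum(1 for u in times_a if u >= t)  # group A at risk at t
        n_b = sum(1 for u in times_b if u >= t)  # group B at risk at t
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d = sum(1 for u, e in pooled if u == t and e == 1)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 degree of freedom
    return chi2, p_value


# Toy data (made up for illustration)
group_a = ([1, 2, 3], [1, 1, 1])
group_b = ([4, 5, 6], [1, 1, 1])
print(kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]))
print(logrank_test(*group_a, *group_b))
```

Each survival curve drops only at observed event times, while censored subjects simply leave the at-risk count. The log-rank statistic adds up, over event times, the gap between the events seen in one group and the events expected if both groups shared the same hazard.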
2 How to interpret hazard ratios
- https://journals.asm.org/doi/full/10.1128/aac.48.8.2787-2792.2004
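- The “hazard ratio” is the most common effect size used in survival analysis. Briefly, it compares between groups the risk of the event occurring in the next instant, among those who have not yet had the event. A hazard ratio of 2 means the event rate in one group is twice that in the other at any given time.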
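As a rough illustration of what a hazard ratio measures, the sketch below computes a crude hazard ratio in plain Python. It assumes the hazard is constant over time in each group, so that the hazard is simply events divided by total follow-up time. This is a simplification for intuition only: a Cox model needs only the ratio to be constant, not the hazards themselves. The function names and data are hypothetical.

```python
def crude_hazard(times, events):
    """Events per unit of follow-up time (assumes a constant hazard)."""
    return sum(events) / sum(times)


def crude_hazard_ratio(times_1, events_1, times_2, events_2):
    """Hazard in group 1 divided by hazard in group 2."""
    return crude_hazard(times_1, events_1) / crude_hazard(times_2, events_2)


# Toy data (made up): 2 events in 20 time units vs. 4 events in 10 time units
treated = ([4, 6, 5, 5], [1, 0, 1, 0])
control = ([2, 3, 1, 4], [1, 1, 1, 1])
hr = crude_hazard_ratio(*treated, *control)
print(round(hr, 2))
```

Here the treated group has a hazard of 0.1 events per time unit and the control group 0.4, so the hazard ratio is about 0.25: at any moment, a treated subject still at risk has roughly a quarter of the event rate of a control subject.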
3 Regression for time-to-event outcomes
- https://www.nature.com/articles/s41592-022-01689-8
- Intro to the Cox proportional hazards model (the most common regression model for survival data)
- Also introduces the accelerated failure time (AFT) model. Although less commonly used, this model has a nice interpretation in terms of “time ratios”: the ratio of the average time to event in group 1 to the average time to event in group 2.
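The time-ratio interpretation can be checked numerically. The sketch below assumes a Weibull AFT model, in which group 1's time scale is group 2's multiplied by the time ratio while the shape stays the same. All parameter values are hypothetical. It is plain Python, not an R model fit.

```python
import math


def weibull_quantile(p, scale, shape):
    """Time by which a fraction p of subjects has had the event,
    for the survival function S(t) = exp(-(t / scale) ** shape)."""
    return scale * (-math.log(1 - p)) ** (1 / shape)


# Hypothetical parameters chosen for illustration
shape = 1.5
scale_group2 = 10.0
time_ratio = 2.0
scale_group1 = time_ratio * scale_group2  # AFT: group 1's time scale is stretched

# The ratio of event times is the same at every quantile, not just the median
for p in (0.25, 0.5, 0.9):
    t1 = weibull_quantile(p, scale_group1, shape)
    t2 = weibull_quantile(p, scale_group2, shape)
    print(p, round(t1 / t2, 6))

# For a Weibull model the same effect can be restated as a hazard ratio
hazard_ratio = time_ratio ** (-shape)
print(round(hazard_ratio, 4))
```

A time ratio of 2 means it takes group 1 twice as long as group 2 to reach any given fraction of events. Under the Weibull assumption this corresponds to a hazard ratio below 1, which shows that the two effect sizes point in opposite directions for the same effect.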
4 Machine learning
Here are 2 popular machine learning models used for time-to-event outcomes:
- Regularized Cox model (i.e. a penalized version of the Cox model; with a lasso or elastic-net penalty it performs automatic feature selection)
- Random survival forest. Perhaps you’ve heard of “random forests” before: it is one of the most popular machine learning methods and has been shown to work well across a wide variety of applications. The random survival forest adapts it to time-to-event outcomes.
The C-index (concordance index) can be used to assess the predictive accuracy of a survival model. Like the AUROC, it ranges from 0 to 1, with 0.5 indicating no predictive ability and 1 indicating perfect predictive ability. Most R functions for fitting survival models will calculate the C-index for you; if not, a few R packages can calculate it and compare it between models: survcomp, compareC, survC1.
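To show what the C-index actually counts, here is a minimal sketch of Harrell's concordance index in plain Python. It is not the implementation used by any of the R packages named above. It assumes a higher risk score means an earlier expected event, and it ignores pairs with tied times for simplicity. The function name and toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject with the
    shorter observed time also has the higher predicted risk."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier one in a pair
        for j in range(n):
            if times[i] < times[j]:  # subject i had the event before time j
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


# Toy data (made up for illustration)
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [4, 3, 2, 1]))  # risk ordering matches outcomes
print(concordance_index(times, events, [1, 1, 1, 1]))  # uninformative predictions
```

A pair is comparable only when the subject with the shorter time had an observed event, which is how censoring is handled: if the earlier subject was censored, it is unknown who had the event first.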