Introduction to survival (time-to-event) analysis

survival analysis

This is a collection of introductory resources on survival (time-to-event) analysis that I often share with students and colleagues who are new to the topic.

Author

Jaron Arbet

Published

January 4, 2025

Intro

Quick intro
Example R code for common survival analysis methods. Covers three of the most common methods used in survival analysis: the log-rank test and Kaplan-Meier plots for comparing survival curves between groups, and the Cox proportional hazards regression model.
Log-rank test for comparing survival curves between groups
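To make the mechanics behind these two methods concrete, here is a minimal sketch of the Kaplan-Meier estimator and the two-group log-rank statistic. It is written in plain Python (standard library only) rather than R, purely to expose the arithmetic; the function names and the toy data are my own illustrations and do not come from the linked resources. For real analyses, use an established package such as R's survival.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv, curve = 1.0, []
    for t in sorted(deaths):
        # Subjects still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


def logrank_chi2(times1, events1, times2, events2):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    all_times = list(times1) + list(times2)
    all_events = list(events1) + list(events2)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for u in times1 if u >= t)        # at risk in group 1
        n = sum(1 for u in all_times if u >= t)      # at risk overall
        d1 = sum(1 for u, e in zip(times1, events1) if u == t and e == 1)
        d = sum(1 for u, e in zip(all_times, all_events) if u == t and e == 1)
        # Observed minus expected events in group 1 under "no difference"
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical toy data: (times, events) for two groups
group_a = ([1, 2], [1, 1])
group_b = ([3, 4], [1, 1])

print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
stat = logrank_chi2(*group_a, *group_b)
print(stat, chi2_1df_pvalue(stat))
```

The survival curve only drops at observed event times, and censored subjects leave the risk set without causing a drop. The log-rank test then compares, at each event time, the observed number of events in one group with the number expected if both groups shared the same survival curve.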

How to interpret hazard ratios

https://journals.asm.org/doi/full/10.1128/aac.48.8.2787-2792.2004
The “hazard ratio” is the most common effect size used in survival analysis. Briefly, it compares between groups the risk of the event occurring in the next instant, among subjects who have not yet experienced it.
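As a rough numerical illustration of the idea, the sketch below computes a hazard ratio under the simplifying assumption that the hazard is constant over time in each group (an exponential model). In that case each group's hazard can be estimated as events divided by total follow-up time. This is not how a Cox model estimates the hazard ratio, and the function names and data are hypothetical.

```python
def event_rate(times, events):
    """Estimated constant hazard: observed events per unit of follow-up time."""
    return sum(events) / sum(times)


def hazard_ratio_constant(times1, events1, times2, events2):
    """Hazard ratio of group 1 vs. group 2, ASSUMING a constant hazard in each group."""
    return event_rate(times1, events1) / event_rate(times2, events2)


# Hypothetical toy data: follow-up time and event indicator (1 = event, 0 = censored)
treated_times, treated_events = [5, 8, 12, 15], [1, 0, 1, 0]   # 2 events in 40 time units
control_times, control_events = [2, 4, 6, 8], [1, 1, 1, 1]     # 4 events in 20 time units

hr = hazard_ratio_constant(treated_times, treated_events,
                           control_times, control_events)
print(round(hr, 3))  # roughly 0.25: the treated hazard is a quarter of the control hazard
```

A hazard ratio below 1 means the event happens at a lower rate in the first group, and a value above 1 means a higher rate. Censored subjects still contribute their follow-up time to the denominator even though they add no event to the numerator.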

Regression for time-to-event outcomes

https://www.nature.com/articles/s41592-022-01689-8
Intro to the Cox proportional hazards model (the most common regression model for survival data)
Also introduces the accelerated failure time regression (AFTR) model. Although less commonly used, this model has a nice interpretation based on “time ratios”: the ratio of the average time to event in group 1 to the average time to event in group 2.
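For readers who like to see the two models side by side, here they are in their standard textbook form. The notation is my own choice and may differ from the linked article: x is the covariate vector, h_0(t) is an unspecified baseline hazard, and \varepsilon is an error term whose distribution determines the specific accelerated failure time model.

```latex
% Cox proportional hazards model: covariates act multiplicatively on the hazard
h(t \mid x) = h_0(t)\,\exp(\beta^\top x),
\qquad \mathrm{HR}_j = \exp(\beta_j)

% Accelerated failure time model: covariates act multiplicatively on the event time
\log T = \mu + \gamma^\top x + \sigma\,\varepsilon,
\qquad \mathrm{TR}_j = \exp(\gamma_j)
```

Exponentiating a Cox coefficient gives a hazard ratio, while exponentiating an accelerated failure time coefficient gives a time ratio. A time ratio of 2 for a binary covariate means event times are stretched by a factor of 2 in that group relative to the reference group.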

Machine learning

Here are 2 popular machine learning models used for time-to-event outcomes:

Regularized Cox model (i.e., a modified version of the Cox model that performs automatic feature selection)
Random survival forest. You may have heard of “random forests” before: they are among the most popular machine learning methods and have been shown to work well across a wide variety of applications.
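To sketch what "regularized" means for the Cox model in the first item, one common formulation adds a lasso (L1) penalty to the Cox partial log-likelihood. The lasso is assumed here for illustration; other penalties such as the elastic net are also used. The notation is mine: \delta_i is the event indicator, t_i the observed time, and tied event times are ignored for simplicity.

```latex
% Cox partial log-likelihood (no tied event times)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Bigl[ \beta^\top x_i - \log \sum_{j:\, t_j \ge t_i} \exp(\beta^\top x_j) \Bigr]

% Lasso-penalized estimate: larger \lambda shrinks more coefficients exactly to zero
\hat{\beta} = \arg\max_{\beta}
  \Bigl\{ \ell(\beta) - \lambda \sum_{k} \lvert \beta_k \rvert \Bigr\}
```

Coefficients shrunk exactly to zero drop out of the model, which is where the automatic feature selection comes from. The penalty strength \lambda is typically chosen by cross-validation.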

The C-index can be used to assess the predictive accuracy of a survival model. It is similar to the AUROC in that it ranges from 0 to 1, with 0.5 indicating no predictive ability and 1 indicating perfect predictive ability. Most R functions for fitting survival models will calculate the C-index for you, but if not, there are a few R packages for calculating it and for comparing C between different models: survcomp, compareC, and survC1.
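To show what the C-index actually measures, here is a minimal sketch of Harrell's version in plain Python. It is not the implementation used by any of the packages above. The function name and toy data are hypothetical, and tied event times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs that the model orders correctly.

    times       : follow-up time for each subject
    events      : 1 if the event was observed, 0 if censored
    risk_scores : model output, where a HIGHER score means higher predicted risk
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1        # earlier event had the higher risk score
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5      # tied scores count as half
    return concordant / comparable


# Hypothetical toy data
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]

print(concordance_index(times, events, [4, 3, 2, 1]))  # risk falls as time rises
print(concordance_index(times, events, [1, 1, 1, 1]))  # uninformative constant score
```

A pair of subjects is only usable when we know which one had the event first, which is why a pair whose earlier time is censored is skipped. Among the usable pairs, the C-index is the proportion in which the subject with the earlier event also received the higher risk score.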