Stratified Survival Analysis

Introduction

singleEventSurvival() supports two stratifiers, selected through the strata argument: "gender" and "age_group".

If both are requested, the package fits each stratifier separately. It does not build joint strata such as Female, 65+.
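
The difference between separate stratifiers and joint strata can be sketched in base R. This is only an illustration and does not call the package. The cut() breaks are an assumption chosen to reproduce the 18-49, 50-64, and 65+ groups for whole-year ages.

```r
# Illustration only: separate stratifiers versus joint cells (base R).
gender <- rep(c("Female", "Male"), times = 6)
age_years <- c(34, 37, 41, 48, 52, 58, 61, 66, 71, 73, 77, 82)

# Assumed grouping: [18, 50), [50, 65), [65, Inf)
ageGroup <- cut(
  age_years,
  breaks = c(18, 50, 65, Inf),
  right = FALSE,
  labels = c("18-49", "50-64", "65+")
)

# Separate stratifiers: two gender groups and three age groups.
table(gender)
table(ageGroup)

# Joint strata (not built by the package): one cell per combination.
jointCells <- interaction(gender, ageGroup, sep = ", ")
table(jointCells)
```

Separate stratification yields 2 + 3 groups, while joint stratification would yield 2 x 3 cells, each with fewer subjects.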

Example Data

library(OdysseusSurvivalModule)

survivalData <- data.frame(
  subject_id = 1:12,
  time = c(20, 35, 42, 50, 63, 70, 74, 85, 91, 105, 118, 140),
  status = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0),
  age_years = c(34, 37, 41, 48, 52, 58, 61, 66, 71, 73, 77, 82),
  gender = c("Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

Gender Stratification

fitGender <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "km",
  strata = "gender"
)

names(fitGender)
fitGender[["overall"]]$summary
fitGender[["gender=Female"]]$summary
fitGender[["gender=Male"]]$summary
fitGender$logrank_test_gender

The log-rank output is stored in logrank_test_gender.
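
For comparison, a standard log-rank test on the same example data can be run with survdiff() from the survival package. This sketch assumes status == 1 marks an event. Whether logrank_test_gender has the same layout as the survdiff object is not stated here, so treat it as a cross-check rather than a description of the package internals.

```r
library(survival)

# Same times, statuses, and genders as the example data above.
checkData <- data.frame(
  time = c(20, 35, 42, 50, 63, 70, 74, 85, 91, 105, 118, 140),
  status = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0),
  gender = rep(c("Female", "Male"), times = 6)
)

# Two-group log-rank test (assumes status == 1 is the event).
logrank <- survdiff(Surv(time, status) ~ gender, data = checkData)

# p-value from the chi-square statistic, df = number of groups - 1.
pValue <- pchisq(logrank$chisq, df = length(logrank$n) - 1, lower.tail = FALSE)

logrank
pValue
```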

Age-Group Stratification

fitAge <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "km",
  strata = "age_group",
  ageBreaks = list(c(18, 49), c(50, 64), c(65, Inf))
)

names(fitAge)
fitAge[["age_group=18-49"]]$summary
fitAge[["age_group=50-64"]]$summary
fitAge[["age_group=65+"]]$summary
fitAge$logrank_test_age_group

Age-group labels are generated automatically from ageBreaks.
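
One way such labels could be derived from ageBreaks is sketched below. labelAgeGroups() is a hypothetical helper, not a package function, and it assumes both bounds of each range are inclusive. The package's actual rule may differ.

```r
# Hypothetical helper, not part of OdysseusSurvivalModule.
# Assumes each range c(lower, upper) is inclusive on both ends.
labelAgeGroups <- function(age, ageBreaks) {
  # Build "lower-upper" labels, or "lower+" for an open-ended range.
  labels <- vapply(ageBreaks, function(b) {
    if (is.infinite(b[2])) paste0(b[1], "+") else paste0(b[1], "-", b[2])
  }, character(1))

  # Assign each age to the first matching range; unmatched ages stay NA.
  group <- rep(NA_character_, length(age))
  for (i in seq_along(ageBreaks)) {
    inRange <- is.na(group) & age >= ageBreaks[[i]][1] & age <= ageBreaks[[i]][2]
    group[inRange] <- labels[i]
  }
  factor(group, levels = labels)
}

ageBreaks <- list(c(18, 49), c(50, 64), c(65, Inf))
age_years <- c(34, 37, 41, 48, 52, 58, 61, 66, 71, 73, 77, 82)

ageGroup <- labelAgeGroups(age_years, ageBreaks)
table(ageGroup)
```

Under this assumption a fractional age such as 49.5 falls between ranges and is left as NA, so it is worth checking how the package handles ages outside every range.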

Using Both Stratifiers

fitBoth <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "km",
  strata = c("gender", "age_group"),
  ageBreaks = list(c(18, 49), c(50, 64), c(65, Inf))
)

names(fitBoth)
fitBoth$logrank_test_gender
fitBoth$logrank_test_age_group

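In this case the result includes the gender entries (such as gender=Female), the age-group entries (such as age_group=65+), and one log-rank test per stratifier (logrank_test_gender and logrank_test_age_group).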

Extract Stratum-Specific Curves

femaleCurve <- fitGender[["gender=Female"]]$data
olderCurve <- fitAge[["age_group=65+"]]$data

head(femaleCurve)
head(olderCurve)
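
A curve with time and survival columns can be queried at arbitrary time points with a step-function lookup. The sketch below rebuilds the female curve with survfit() from the survival package so that it runs without the module. survivalAt() is a hypothetical helper, and the real $data table may carry additional columns.

```r
library(survival)

# Female subjects from the example data (assumes status == 1 is the event).
female <- data.frame(
  time = c(20, 42, 63, 74, 91, 118),
  status = c(1, 1, 0, 0, 0, 0)
)

# Stand-in for fitGender[["gender=Female"]]$data with the two columns used here.
kmFit <- survfit(Surv(time, status) ~ 1, data = female)
curve <- data.frame(time = kmFit$time, survival = kmFit$surv)

# Hypothetical helper: survival probability at times t, read off the step curve.
# Requires curve$time to be sorted in increasing order.
survivalAt <- function(curve, t) {
  idx <- findInterval(t, curve$time)
  # Before the first recorded time nobody has had the event yet.
  ifelse(idx == 0, 1, curve$survival[pmax(idx, 1)])
}

survivalAt(curve, c(10, 30, 100))
```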

Plot Separate Strata

plot(
  fitGender[["gender=Female"]]$data$time,
  fitGender[["gender=Female"]]$data$survival,
  type = "s",
  col = "firebrick",
  xlab = "Time (days)",
  ylab = "Survival probability",
  ylim = c(0, 1),
  main = "Gender-specific Kaplan-Meier curves"
)

lines(
  fitGender[["gender=Male"]]$data$time,
  fitGender[["gender=Male"]]$data$survival,
  type = "s",
  col = "steelblue"
)

legend(
  "topright",
  legend = c("Female", "Male"),
  col = c("firebrick", "steelblue"),
  lty = 1,
  bty = "n"
)

Stratified Cox and Parametric Fits

The same stratification behavior applies to Cox and parametric models: each stratum is returned under its own named entry.

fitCox <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "cox",
  covariates = c("age_years"),
  strata = "gender"
)

fitWeibull <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "weibull",
  covariates = c("age_years"),
  strata = "age_group",
  ageBreaks = list(c(18, 49), c(50, 64), c(65, Inf))
)

fitCox[["gender=Female"]]$summary
fitWeibull[["age_group=65+"]]$summary

Summary

For stratified analysis, the main points are:

  1. Use strata = "gender", strata = "age_group", or both.
  2. Read stratum-specific results from named list entries.
  3. Use logrank_test_gender and logrank_test_age_group for between-group tests.
  4. Treat gender and age-group results as separate stratified analyses, not joint cells.