Olivier Grisel - Predictive survival analysis with scikit-learn, scikit-survival and lifelines

Discover predictive survival analysis techniques with the scikit-learn, scikit-survival, and lifelines libraries, covering time-to-event regression, the Kaplan-Meier estimator, and more.

Key takeaways
  • Predictive survival analysis estimates the survival function, that is, the probability of surviving beyond each point in time, given a set of predictive features.
  • The concept of time-to-event regression is used to model the time until an event occurs, and is applicable to various fields such as medicine, insurance, and finance.
  • The Kaplan-Meier estimator is a non-parametric estimate of the survival function over time, computed from possibly censored observations, and is widely used in survival analysis.
  • Censoring is a common problem in survival analysis: for some individuals the event of interest is not observed during the observation period, so only a lower bound on their time-to-event is known. The Kaplan-Meier estimator accounts for such individuals instead of discarding them.
  • The hazard rate is the instantaneous rate at which events occur at a given time among individuals who have survived up to that time, and is used to model the time-to-event.
  • The concordance index is a ranking measure of model quality: the fraction of comparable pairs of individuals for which the model predicts the correct order of events.
  • Non-linear feature engineering can be used to mitigate some of the limitations of traditional survival analysis methods.
  • The Brier score is a proper scoring rule for probabilistic classification, and can be evaluated at a chosen time horizon to assess the quality of predicted survival probabilities.
  • Predictive survival analysis can be used in various fields such as medicine, insurance, and finance to model the likelihood of an event occurring over time.
  • The lifelines library is a Python library that provides a variety of tools for survival analysis, including the Kaplan-Meier estimator and the concordance index.
  • The scikit-survival library is a Python library that provides a variety of tools for survival analysis, including the Kaplan-Meier estimator and the concordance index.
  • The Cox proportional hazards model is a widely used model in survival analysis, and can be used to model the time-to-event while accounting for the effects of covariates.
  • Enriched feature sets can be used to improve the performance of a model, and can be created by combining different types of features.
  • Time-dependent ROC curves can be used to evaluate the performance of a model over time.
  • Harrell's C-index is a concordance measure of model quality, and is used to evaluate the ability of a model to predict the correct order of events.
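
To make the Kaplan-Meier estimator, censoring, and the concordance index from the takeaways above concrete, here is a minimal from-scratch sketch in plain Python. It is not the implementation used by lifelines or scikit-survival: the function names, the toy data, and the simplified handling of tied times are assumptions made for illustration only.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct observed event time.

    durations: time to the event, or to censoring, for each individual.
    observed:  True if the event was seen, False if the individual was censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    survival, curve = 1.0, []
    for t in sorted(events):
        # Censored individuals still count as "at risk" until they drop out.
        at_risk = sum(1 for d in durations if d >= t)
        survival *= 1.0 - events[t] / at_risk
        curve.append((t, survival))
    return curve


def concordance_index(durations, observed, risk_scores):
    """Fraction of comparable pairs that the risk scores rank correctly.

    A pair (i, j) is comparable when i has an observed event and
    durations[i] < durations[j]. A higher risk score should mean an
    earlier event. Simplification: pairs with tied times are ignored.
    """
    concordant, comparable = 0.0, 0
    n = len(durations)
    for i in range(n):
        if not observed[i]:
            continue  # a censored individual cannot be the earlier one in a pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data: six individuals, two of them censored.
durations = [2, 3, 3, 5, 8, 9]
observed = [True, True, False, True, False, True]
risk_scores = [0.9, 0.7, 0.6, 0.5, 0.2, 0.1]  # higher score = earlier event expected

curve = kaplan_meier(durations, observed)
c_index = concordance_index(durations, observed, risk_scores)

for t, s in curve:
    print(f"S({t}) = {s:.3f}")
print(f"C-index = {c_index:.3f}")
```

The censored individuals never trigger a drop in the curve, but they do enlarge the at-risk set until their censoring time, which is how the estimator uses partial information instead of discarding it. In practice, the library implementations would normally be preferred over a sketch like this.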