Hajime Takeda - Introduction to Causal Inference with Machine Learning | SciPy 2024

Learn how causal machine learning reveals true cause-effect relationships in your data. Discover meta-learners, uplift modeling, and practical applications across industries.

Key takeaways

Causal inference with machine learning aims to identify cause-effect relationships, whereas traditional ML focuses on correlation-based prediction
Randomized controlled trials (RCTs) are the gold standard for causal inference but aren’t always feasible because of ethical, time, or cost constraints
Two main techniques for causal machine learning:
- Meta-learners (S-learner, T-learner, X-learner, DR-learner (doubly robust), DML (double machine learning))
- Uplift modeling (decision tree-based approach)
Meta-learners use machine learning models to:
- Predict outcomes for treated and untreated groups
- Measure treatment effects at individual and group levels
- Handle complex, high-dimensional data
Key treatment effect metrics:
- Average Treatment Effect (ATE) - average effect across the whole population
- Conditional Average Treatment Effect (CATE) - average effect within a specific segment
- Individual Treatment Effect (ITE) - effect for a single individual
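
The difference between ATE and CATE can be illustrated with a small worked example. This sketch assumes a randomized experiment, so a plain difference in means is a reasonable estimator; the eight records and the segment names are invented for illustration.

```python
from statistics import mean

# Hypothetical randomized experiment: (segment, treated, outcome).
data = [
    ("new", 1, 12.0), ("new", 1, 14.0), ("new", 0, 8.0), ("new", 0, 10.0),
    ("loyal", 1, 21.0), ("loyal", 1, 23.0), ("loyal", 0, 20.0), ("loyal", 0, 22.0),
]


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


# ATE: one number for the whole population.
ate = diff_in_means(data)

# CATE: the same estimator applied within each segment.
cate = {
    seg: diff_in_means([r for r in data if r[0] == seg])
    for seg in ("new", "loyal")
}

print(ate)   # 2.5
print(cate)  # {'new': 4.0, 'loyal': 1.0}
```

The ITE cannot be computed this way, because each individual is observed under only one of the two conditions; it has to be estimated with a model.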
Model validation approaches:
- Predictive accuracy checks (RMSE, cross-validation)
- Refutation methods (random common cause, placebo treatment)
- Sensitivity analysis
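
As a rough illustration of a placebo-treatment refutation, the sketch below replaces the real treatment with a randomly permuted copy and re-runs the estimator. If the estimator is sound, the placebo effect should be close to zero. The simulated data, the true effect of 2.0, and the helper `estimate_effect` are assumptions for illustration; dedicated libraries implement more complete versions of this check.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical randomized data with a true treatment effect of 2.0.
treatment = [rng.random() < 0.5 for _ in range(4000)]
outcome = [5.0 + 2.0 * t + rng.gauss(0, 1) for t in treatment]


def estimate_effect(t, y):
    """Difference in mean outcome between treated and untreated units."""
    treated = [yi for ti, yi in zip(t, y) if ti]
    control = [yi for ti, yi in zip(t, y) if not ti]
    return mean(treated) - mean(control)


original = estimate_effect(treatment, outcome)

# Placebo treatment: shuffle the labels so they carry no causal information,
# then re-estimate. A sound estimator should now report roughly zero.
placebo = treatment[:]
rng.shuffle(placebo)
placebo_effect = estimate_effect(placebo, outcome)

print(f"original estimate: {original:.2f}")
print(f"placebo estimate:  {placebo_effect:.2f}")
```

A placebo estimate far from zero would suggest that the estimator is picking up something other than the treatment.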
Two major libraries for causal ML:
- EconML (Microsoft) - broader range of algorithms
- CausalML (Uber) - focused on marketing and uplift modeling
Common applications include:
- Marketing (coupon effectiveness)
- Healthcare (treatment outcomes)
- Economics (job training programs)
- Social interventions
Confounding variables must be identified and controlled for to avoid biased results; Simpson’s Paradox shows how an uncontrolled confounder can reverse an apparent effect
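
A small numeric sketch shows how a confounder can flip the sign of a naive comparison. The counts below are invented for a hypothetical coupon campaign in which coupons were sent mostly to occasional buyers, who purchase less often regardless of coupons.

```python
# Hypothetical counts: (segment, group) -> (customers, purchases).
counts = {
    ("occasional", "coupon"): (200, 40),
    ("occasional", "no coupon"): (50, 5),
    ("frequent", "coupon"): (50, 30),
    ("frequent", "no coupon"): (200, 100),
}
segments = ("occasional", "frequent")


def rate(segment, group):
    """Purchase rate within one segment."""
    customers, purchases = counts[(segment, group)]
    return purchases / customers


def pooled_rate(group):
    """Purchase rate ignoring the segment (the confounder)."""
    customers = sum(counts[(s, group)][0] for s in segments)
    purchases = sum(counts[(s, group)][1] for s in segments)
    return purchases / customers


# Naive comparison: coupons look harmful.
naive_effect = pooled_rate("coupon") - pooled_rate("no coupon")

# Within each segment: coupons help.
segment_effects = {s: rate(s, "coupon") - rate(s, "no coupon") for s in segments}

# Adjusted estimate: weight each segment's effect by its share of customers.
total = sum(c for c, _ in counts.values())
adjusted_effect = sum(
    segment_effects[s]
    * (counts[(s, "coupon")][0] + counts[(s, "no coupon")][0]) / total
    for s in segments
)

print(f"naive effect:    {naive_effect:+.2f}")
print(f"segment effects: { {s: round(e, 2) for s, e in segment_effects.items()} }")
print(f"adjusted effect: {adjusted_effect:+.2f}")
```

The pooled comparison is negative while every segment-level comparison is positive, which is the reversal that Simpson's Paradox describes. Stratifying on the confounder recovers the positive effect.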
Recommended approach: start with simpler methods (S-learner/T-learner) before moving to more complex approaches
