Raphael Tamaki - Almost Perfect: A Benchmark on Algorithms for Quasi-Experiments | PyData Amsterdam

Explore how common quasi-experiment algorithms perform in real-world scenarios, examining bias, confidence intervals, and key factors for successful implementation.

Key takeaways
  • Most quasi-experiment algorithms perform similarly in terms of bias, with estimated treatment effects often off by more than 20%

  • Combining multiple models shows limited benefit because the models' predictions are highly correlated

  • Using more granular data and features significantly improves model performance, reducing errors by approximately 30%

  • The doubly robust estimator performed best at capturing the true treatment effect within its confidence intervals (see the sketch below)

  • Synthetic control with linear regression showed strong performance on granular data, but required careful attention to weight constraints (see the constrained-weights sketch below)

  • Model performance is highly dataset-dependent: no single algorithm consistently outperforms the others across all scenarios

  • Confidence intervals are crucial: some models can appear unbiased yet have impractically wide uncertainty ranges

  • Key algorithmic approaches evaluated included (minimal sketches of several follow below):

    • Difference-in-differences
    • Synthetic control
    • Meta-learners (S-learner, T-learner)
    • Doubly robust estimator
    • Graphical causal models
  • When implementing quasi-experiments, focus should be on:

    • Using granular data where possible
    • Carefully considering model constraints and assumptions
    • Evaluating both bias and confidence-interval coverage (see the simulation sketch below)
    • Validating results across multiple approaches
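
The sketches below are minimal, self-contained illustrations of the approaches named above, run against simulated data with a known treatment effect. They are not the code benchmarked in the talk; the dataset shapes, model choices, and variable names are all assumptions made for illustration. First, difference-in-differences in its simplest two-period, two-group form:

```python
# Minimal two-period difference-in-differences sketch on simulated data
# (illustrative only; not the talk's benchmark code).
import numpy as np

rng = np.random.default_rng(2)
n = 2_000
treated = rng.binomial(1, 0.5, size=n)

# Shared time trend (+1.0), group baseline gap (+0.5), true effect (+2.0)
y_pre = 5.0 + 0.5 * treated + rng.normal(size=n)
y_post = y_pre + 1.0 + 2.0 * treated + rng.normal(size=n)

# DiD: change in the treated group minus change in the control group
did = (
    (y_post[treated == 1].mean() - y_pre[treated == 1].mean())
    - (y_post[treated == 0].mean() - y_pre[treated == 0].mean())
)
print(f"DiD estimate: {did:.3f}")  # close to 2.0 under parallel trends
```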
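Synthetic control builds a counterfactual for the treated unit as a weighted combination of donor units. The weight constraints flagged in the takeaways (non-negative weights summing to one) are what keep the counterfactual from extrapolating; a sketch contrasting constrained and unconstrained fits, again on simulated series:

```python
# Minimal synthetic control sketch: constrained vs. unconstrained donor weights
# (illustrative only; donor series are simulated random walks).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n_pre, n_donors = 30, 8
donors = rng.normal(size=(n_pre, n_donors)).cumsum(axis=0)  # donor pre-period series
true_w = np.array([0.6, 0.4] + [0.0] * (n_donors - 2))
treated = donors @ true_w + rng.normal(scale=0.1, size=n_pre)

def sse(w):
    # Pre-period fit error between treated unit and weighted donors
    return np.sum((treated - donors @ w) ** 2)

# Classic synthetic control constraints: w >= 0 and sum(w) == 1
res = minimize(
    sse,
    x0=np.full(n_donors, 1 / n_donors),
    bounds=[(0, 1)] * n_donors,
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1},
    method="SLSQP",
)
print("constrained weights:   ", np.round(res.x, 2))

# Unconstrained OLS can assign negative weights and extrapolate,
# which often hurts the post-period counterfactual
ols_w, *_ = np.linalg.lstsq(donors, treated, rcond=None)
print("unconstrained weights: ", np.round(ols_w, 2))
```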
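For the meta-learners: the S-learner fits a single outcome model with treatment as an extra feature, while the T-learner fits one model per treatment arm. Gradient boosting here is an arbitrary illustrative choice, not necessarily what the talk used:

```python
# Minimal S-learner and T-learner sketch on simulated data (illustrative only).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 4_000
X = rng.normal(size=(n, 3))
T = rng.binomial(1, 0.5, size=n)
Y = X[:, 0] + 2.0 * T + rng.normal(size=n)  # true ATE = 2

# S-learner: one model, treatment appended as a feature
s = GradientBoostingRegressor().fit(np.column_stack([X, T]), Y)
ate_s = (
    s.predict(np.column_stack([X, np.ones(n)]))
    - s.predict(np.column_stack([X, np.zeros(n)]))
).mean()

# T-learner: separate outcome models per treatment arm
t1 = GradientBoostingRegressor().fit(X[T == 1], Y[T == 1])
t0 = GradientBoostingRegressor().fit(X[T == 0], Y[T == 0])
ate_t = (t1.predict(X) - t0.predict(X)).mean()

print(f"S-learner ATE: {ate_s:.3f}, T-learner ATE: {ate_t:.3f}")
```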
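The doubly robust (AIPW) estimator combines an outcome model with a propensity model and remains consistent if either one is correctly specified, which is one plausible reason for its strong confidence-interval coverage in the benchmark. A sketch assuming simple sklearn nuisance models:

```python
# Minimal doubly robust (AIPW) ATE sketch on simulated data (illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(0)
n = 5_000
X = rng.normal(size=(n, 3))
e = 1 / (1 + np.exp(-X[:, 0]))  # true propensity depends on X (confounding)
T = rng.binomial(1, e)
Y = X @ np.array([1.0, 0.5, -0.5]) + 2.0 * T + rng.normal(size=n)  # true ATE = 2

# Nuisance models: propensity e(X) and per-arm outcome models m0(X), m1(X)
ps = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
m0 = LinearRegression().fit(X[T == 0], Y[T == 0]).predict(X)
m1 = LinearRegression().fit(X[T == 1], Y[T == 1]).predict(X)

# AIPW estimate: outcome-model contrast plus inverse-propensity-weighted residuals
ate = np.mean(m1 - m0 + T * (Y - m1) / ps - (1 - T) * (Y - m0) / (1 - ps))
print(f"Doubly robust ATE estimate: {ate:.3f}")  # close to 2.0
```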
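Finally, evaluating both bias and confidence-interval coverage amounts to repeatedly simulating data with a known effect and recording how often the interval contains it. A minimal sketch, with a simple difference-in-means estimator standing in for any of the methods above:

```python
# Minimal sketch for evaluating bias and 95% CI coverage over many simulations.
import numpy as np

rng = np.random.default_rng(4)
true_effect, n, n_sims = 2.0, 500, 1_000
covered, errors = 0, []

for _ in range(n_sims):
    T = rng.binomial(1, 0.5, size=n)
    Y = true_effect * T + rng.normal(size=n)
    diff = Y[T == 1].mean() - Y[T == 0].mean()
    # Normal-approximation CI for a difference in means
    se = np.sqrt(
        Y[T == 1].var(ddof=1) / (T == 1).sum()
        + Y[T == 0].var(ddof=1) / (T == 0).sum()
    )
    lo, hi = diff - 1.96 * se, diff + 1.96 * se
    covered += lo <= true_effect <= hi
    errors.append(diff - true_effect)

print(f"mean bias:       {np.mean(errors):+.4f}")
print(f"95% CI coverage: {covered / n_sims:.1%}")  # should be near 95%
```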