Ankur Ankan - Introduction to Causal Inference using pgmpy | PyData Amsterdam 2024

Python

An introduction to causal inference with pgmpy from a PyData talk, covering the DAG and potential outcomes frameworks, causal discovery algorithms, evaluation metrics, and real-world applications.

Key takeaways

There are two main frameworks for causal inference: the potential outcomes framework and the directed acyclic graph (DAG) framework
Causal discovery is challenging because multiple causal graphs can imply the same conditional independencies and therefore fit the same observed data equally well, making it difficult to determine the true causal relationships
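
The point about several graphs fitting the same data can be made concrete with three variables. The following is a minimal sketch in plain Python, not pgmpy's API; the helper names `skeleton`, `v_structures`, and `markov_equivalent` are hypothetical. It relies on the standard result that two DAGs over the same nodes imply the same conditional independencies exactly when they share the same skeleton (undirected edges) and the same v-structures (colliders whose parents are not adjacent).

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG: every edge as an unordered pair."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All colliders a -> child <- b in which a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges_1, edges_2):
    """Same skeleton and same v-structures (assumes both graphs share one node set)."""
    return (skeleton(edges_1) == skeleton(edges_2)
            and v_structures(edges_1) == v_structures(edges_2))


chain = [("X", "Y"), ("Y", "Z")]          # X -> Y -> Z
reverse_chain = [("Z", "Y"), ("Y", "X")]  # X <- Y <- Z
fork = [("Y", "X"), ("Y", "Z")]           # X <- Y -> Z
collider = [("X", "Y"), ("Z", "Y")]       # X -> Y <- Z

print(markov_equivalent(chain, fork))           # chain and fork cannot be told apart
print(markov_equivalent(chain, reverse_chain))  # neither can the reversed chain
print(markov_equivalent(chain, collider))       # the collider can
```

The chain, the reversed chain, and the fork all imply only that X and Z are independent given Y, so observational data alone cannot choose between them; the collider implies a different independence and is distinguishable.
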
The DAG framework requires significant manual intervention and expert knowledge to build accurate models, especially for identifying confounders and colliders
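
Why the distinction between confounders and colliders matters can be illustrated with simulated data. This is a minimal sketch in plain Python under assumed linear Gaussian relationships with unit coefficients; the variable names and the helpers `pearson` and `partial_corr` are hypothetical and are not part of pgmpy. Adjusting for a confounder removes a spurious association, while adjusting for a collider creates one.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, given):
    """Correlation of a and b after adjusting for a single variable."""
    r_ab, r_ag, r_bg = pearson(a, b), pearson(a, given), pearson(b, given)
    return (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))


rng = random.Random(42)
n = 5000

# Confounder: Z -> X and Z -> Y, with no edge between X and Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x_conf = [zi + rng.gauss(0, 1) for zi in z]
y_conf = [zi + rng.gauss(0, 1) for zi in z]

# Collider: X -> C <- Y, with X and Y generated independently.
x_col = [rng.gauss(0, 1) for _ in range(n)]
y_col = [rng.gauss(0, 1) for _ in range(n)]
c = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x_col, y_col)]

confounded_raw = pearson(x_conf, y_conf)               # spurious association
confounded_adjusted = partial_corr(x_conf, y_conf, z)  # adjustment removes it
collider_raw = pearson(x_col, y_col)                   # no association
collider_adjusted = partial_corr(x_col, y_col, c)      # adjustment induces one

print(confounded_raw, confounded_adjusted, collider_raw, collider_adjusted)
```

Under these assumptions the population values are 0.5, 0, 0, and -0.5 respectively, so the sample estimates should land near them. Which variable plays which role cannot be read off the correlations alone, which is where expert knowledge enters.
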
pgmpy provides tools for:
- Causal discovery algorithms (PC, Hill Climb)
- Expert knowledge integration
- Parameter estimation
- Testing implied conditional independencies
- Simulation capabilities
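
To show what testing an implied conditional independence involves, here is a minimal sketch in plain Python. It does not use pgmpy's API; it assumes continuous, linear Gaussian data and a hypothetical helper `fisher_z_pvalue` that applies the Fisher z-transform to a partial correlation with a single conditioning variable. The data are simulated from the chain X -> M -> Y, which implies that X and Y are independent given M.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fisher_z_pvalue(a, b, given):
    """Two-sided p-value for 'a independent of b given one variable'."""
    r_ab, r_ag, r_bg = pearson(a, b), pearson(a, given), pearson(b, given)
    r = (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))
    # Fisher z-transform; the conditioning set has size 1, hence n - 1 - 3.
    z = math.atanh(r) * math.sqrt(len(a) - 1 - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# Simulate the chain X -> M -> Y.
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
m = [xi + rng.gauss(0, 1) for xi in x]
y = [mi + rng.gauss(0, 1) for mi in m]

p_implied = fisher_z_pvalue(x, y, m)      # independence implied by the DAG
p_not_implied = fisher_z_pvalue(m, y, x)  # M and Y stay dependent given X

print(p_implied, p_not_implied)
```

A small p-value for an independence that the DAG implies is evidence against the proposed graph; a large one only means the data do not contradict it.
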
Common evaluation metrics include:
- Fisher’s C-test
- Correlation score
- Structure score
- F1 score-based metrics
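
As an illustration of an F1-based metric, the sketch below scores a learned graph against a known reference graph by treating edges as the items being retrieved. It is plain Python with a hypothetical function name `edge_f1`, and it is not taken from pgmpy or from the talk. It presumes the reference graph is known, which in practice usually means simulated data.

```python
def edge_f1(true_edges, learned_edges, directed=True):
    """F1 score of learned edges against reference edges.

    With directed=False, edge orientation is ignored and only the
    skeleton (which pairs of nodes are connected) is compared.
    """
    if directed:
        true_set, learned_set = set(true_edges), set(learned_edges)
    else:
        true_set = {frozenset(e) for e in true_edges}
        learned_set = {frozenset(e) for e in learned_edges}

    true_positives = len(true_set & learned_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(learned_set)
    recall = true_positives / len(true_set)
    return 2 * precision * recall / (precision + recall)


true_dag = [("A", "B"), ("B", "C"), ("A", "D")]
learned_dag = [("A", "B"), ("C", "B"), ("D", "E")]  # one reversed edge, one wrong edge

print(edge_f1(true_dag, learned_dag))                  # orientation counts
print(edge_f1(true_dag, learned_dag, directed=False))  # skeleton only
```

Comparing the directed and undirected scores separates two kinds of error: edges that are missing or spurious, and edges that are present but oriented the wrong way.
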
When choosing between the potential outcomes and DAG frameworks:
- Use potential outcomes for estimating single causal effects
- Use DAGs for broader causal discovery and understanding mechanisms
- Consider combining both approaches when possible
Key challenges in causal inference:
- Lack of ground truth data
- Difficulty in handling reverse causality
- Sensitivity to algorithm parameters
- Need for large datasets
- Problems with highly correlated variables
The field is actively evolving with new methods and approaches being developed regularly
Applications span multiple domains including epidemiology, economics, social sciences, and machine learning
Combining expert knowledge with automated methods (such as LLMs) can help improve model accuracy
