Hans Korving - SHAP beyond the standard graphics: co-design of ML-models in earth sciences

Learn how combining SHAP values with UMAP, clustering & stakeholder-friendly visualizations can make ML models in earth sciences more interpretable & impactful.

Key takeaways
  • Co-design with stakeholders is essential when developing ML models to ensure they are both accurate and trusted, especially in earth sciences applications

  • Standard SHAP visualizations, while useful for ML experts, often fail to resonate with stakeholders and domain experts who struggle to connect them with real-world context

  • Combining UMAP dimensionality reduction with SHAP values helps create more interpretable visualizations by projecting high-dimensional data into 2D space

  • HDBSCAN clustering on UMAP-projected SHAP values creates meaningful groups that stakeholders understand more easily than clusters built from raw feature values

  • Converting model insights into human-readable if/then rules based on raw feature values (rather than SHAP values) makes the results more relatable to stakeholders’ domain knowledge

  • Incorporating stakeholder feedback and domain constraints into model development leads to greater trust and wider adoption, even if it means sacrificing some model performance

  • Visualizing results through maps and relating clusters to physical locations helps stakeholders validate model behavior against their real-world experience

  • Simple, interpretable models (like linear regression with interactions) are often preferable to complex black-box models that require extensive explanation

  • Iterative refinement and continuous stakeholder feedback throughout the modeling process help bridge the gap between ML capabilities and practical applications

  • The approach was successfully applied to wildfire susceptibility prediction in Italy and ecological quality assessment of streams in the Netherlands
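
The takeaway about if/then rules can be made concrete with a small sketch. It is not the speaker's implementation. It assumes that cluster labels already exist (in the workflow above they would come from HDBSCAN run on UMAP-projected SHAP values) and shows only the last step: describing each cluster by ranges of raw feature values. The feature names `slope_deg` and `dist_to_road_m`, the helper `cluster_rules`, and all numbers are hypothetical toy values chosen for illustration. Using the plain min/max range per feature is also an assumption; quantile ranges or a shallow decision tree would be reasonable alternatives.

```python
# Toy, made-up samples: raw feature values plus one cluster label per sample.
# Labels are hard-coded here; in practice they would come from a clustering
# step on SHAP values. HDBSCAN marks noise points with the label -1.
samples = [
    {"slope_deg": 2.0, "dist_to_road_m": 100.0, "cluster": 0},
    {"slope_deg": 5.0, "dist_to_road_m": 150.0, "cluster": 0},
    {"slope_deg": 8.0, "dist_to_road_m": 300.0, "cluster": 0},
    {"slope_deg": 20.0, "dist_to_road_m": 900.0, "cluster": 1},
    {"slope_deg": 25.0, "dist_to_road_m": 1200.0, "cluster": 1},
    {"slope_deg": 32.5, "dist_to_road_m": 1500.0, "cluster": 1},
    {"slope_deg": 12.0, "dist_to_road_m": 600.0, "cluster": -1},
]


def cluster_rules(rows, features, label_key="cluster"):
    """Build one if/then rule per cluster from raw-feature min/max ranges."""
    rules = {}
    for label in sorted({row[label_key] for row in rows}):
        if label == -1:
            continue  # skip noise points, they do not form a cluster
        members = [row for row in rows if row[label_key] == label]
        conditions = []
        for feature in features:
            values = [row[feature] for row in members]
            conditions.append(f"{min(values):g} <= {feature} <= {max(values):g}")
        rules[label] = "IF " + " AND ".join(conditions) + f" THEN cluster {label}"
    return rules


rules = cluster_rules(samples, ["slope_deg", "dist_to_road_m"])
for rule in rules.values():
    print(rule)
```

Because the rules are written in terms of raw features rather than SHAP values, a domain expert can check them directly against field knowledge, for example by looking up where on a map the samples matching a rule are located.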