Nina van Diermen - BERTopic to accelerate Ukrainian aid by the Red Cross

Learn how the Red Cross used BERTopic to automate the analysis of social media messages from Ukrainian refugees, cutting analysis time from 20 hours to 2 minutes.

Key takeaways
  • The Red Cross applied BERTopic to social media messages from Ukrainian refugees, cutting the time needed to analyze them from 20 hours to 2 minutes

  • Key requirements for the system:

    • Quick initialization for emergency response
    • Dynamic adaptation to changing situations
    • Ability to handle unstructured text data
  • Technical implementation includes:

    • Transformation of messages into numerical embeddings
    • Dimensionality reduction using UMAP
    • Clustering with HDBSCAN
    • Topic label generation from the words most characteristic of each cluster, scored by class-based TF-IDF (c-TF-IDF)
  • System benefits:

    • No predefined labels required
    • Handles outliers: messages that fit no cluster are set aside instead of being forced into a topic
    • Provides topic hierarchies
    • Can generate topic summaries with an OpenAI language model
    • Allows tracking topic evolution over time
  • Evaluation methods:

    • Topic coherence measurement
    • Density-based cluster validation
    • Soft reformulation accuracy
    • Human interpretability of results
  • Customization options:

    • Replaceable embedding models
    • Adjustable clustering methods
    • Configurable vectorizer settings
    • Optional stop word removal
    • Custom summary generation
  • Practical applications:

    • Social media monitoring
    • Self-help FAQ generation
    • Aid resource allocation
    • Trend analysis
    • Document categorization
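
The label-generation step in the takeaways above can be illustrated with a small sketch. It assumes the messages have already been embedded, reduced, and clustered, and it only shows how word frequencies can turn a cluster into a label. The scoring follows the class-based TF-IDF idea used by BERTopic in simplified form; the function name top_words_per_cluster, the toy messages, and the cluster ids are hypothetical and are not taken from the Red Cross project.

```python
import math
import re
from collections import Counter


def top_words_per_cluster(messages, cluster_ids, top_n=3):
    """Rank words per cluster with a simplified class-based TF-IDF."""
    # Merge all messages of one cluster into a single bag of words.
    cluster_counts = {}
    for text, cid in zip(messages, cluster_ids):
        tokens = re.findall(r"[a-z]+", text.lower())
        cluster_counts.setdefault(cid, Counter()).update(tokens)

    # Frequency of each word across all clusters together.
    total_counts = Counter()
    for counts in cluster_counts.values():
        total_counts.update(counts)

    # Average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(cluster_counts)

    labels = {}
    for cid, counts in cluster_counts.items():
        n_words = sum(counts.values())
        # Term frequency inside the cluster, weighted down for words
        # that are frequent across all clusters.
        scores = {
            word: (count / n_words) * math.log(1 + avg_words / total_counts[word])
            for word, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda word: (-scores[word], word))
        labels[cid] = ranked[:top_n]
    return labels


# Hypothetical toy data: six messages already assigned to two clusters.
messages = [
    "where can I find housing in Amsterdam",
    "looking for housing near the station",
    "housing for a family with children",
    "how do I register for a bank account",
    "need a bank account to receive money",
    "which bank opens an account quickly",
]
cluster_ids = [0, 0, 0, 1, 1, 1]

labels = top_words_per_cluster(messages, cluster_ids, top_n=2)
for cid, words in sorted(labels.items()):
    print(cid, words)
```

On this toy data the top word for the first cluster is "housing", but the common word "for" also ranks highly, which is one reason stop word removal is offered as a vectorizer option. In a real BERTopic setup this scoring is handled by the library itself, so the sketch is only meant to make the idea concrete.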