Yannis Moudere - Enhancing Event Analysis at Scale: Leveraging Tracking Data in Sports

Learn how to combine event and tracking data in sports analytics using a scalable architecture, machine learning models, and efficient processing for deeper game insights.

Key takeaways
  • Football data comes from two main sources: event data (discrete actions) and tracking data (continuous player and ball positions, sampled at 25 frames per second)

  • Event data alone lacks context; enriching it with tracking data yields meaningful insights into off-ball movements and pressure on players

  • Key analytical models include:

    • Pitch control (player influence based on position/velocity)
    • Pass selection probability
    • Pass success probability
    • Expected possession value (EPV)
    • Pressure analysis
  • Technical architecture leverages:

    • Horizontal scaling with spot instances for cost optimization
    • Message queues for asynchronous processing
    • Dead letter queues for error handling
    • Continuous integration/deployment pipeline
  • Processing requirements:

    • About 250 MB of tracking data per game
    • About 150k frames per game
    • About 8 GB of RAM and roughly 150 seconds of processing per game
    • Total storage of ~95 GB per season
    • Processing cost of approximately $10 per 10,000 games
  • Data applications include:

    • Pre-game analysis
    • Post-game reports
    • Scouting dashboards
    • Performance metrics visualization
    • Player comparison tools
  • Machine learning integrates:

    • Convolutional neural networks
    • Kernel density estimation
    • Success probability modeling
    • Possession value calculations
  • System is designed to be:

    • Cost-effective through spot instance usage
    • Fault-tolerant, backed by message queues
    • Scalable based on processing demand
    • Asynchronous for efficient processing
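
The processing figures in the takeaways can be sanity-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check, not something shown in the talk: the 380-game season (a 20-team double round-robin) and the 100 minutes of recorded play per match are assumptions chosen because they reproduce the quoted totals, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the quoted processing figures.
FPS = 25                  # frames per second (from the takeaways)
MB_PER_GAME = 250         # tracking data per game in MB (from the takeaways)
SECONDS_PER_GAME = 150    # processing time per game (from the takeaways)
COST_USD = 10.0           # quoted cost for GAMES_COSTED games
GAMES_COSTED = 10_000

GAMES_PER_SEASON = 380    # ASSUMPTION: 20-team league, double round-robin
MATCH_MINUTES = 100       # ASSUMPTION: 90 minutes plus stoppage time


def frames_per_game(fps: int, minutes: int) -> int:
    """Number of tracking frames recorded over a match."""
    return fps * minutes * 60


def season_storage_gb(mb_per_game: float, games: int) -> float:
    """Season storage in GB, using 1 GB = 1000 MB."""
    return mb_per_game * games / 1000


def cost_per_game(total_cost: float, games: int) -> float:
    """Average processing cost per game in USD."""
    return total_cost / games


def total_compute_hours(seconds_per_game: float, games: int) -> float:
    """Total sequential compute time in hours."""
    return seconds_per_game * games / 3600


if __name__ == "__main__":
    print("frames per game:", frames_per_game(FPS, MATCH_MINUTES))
    print("season storage (GB):", season_storage_gb(MB_PER_GAME, GAMES_PER_SEASON))
    print("cost per game (USD):", cost_per_game(COST_USD, GAMES_COSTED))
    print("compute hours:", total_compute_hours(SECONDS_PER_GAME, GAMES_COSTED))
```

Under these assumptions, 25 frames per second over 100 minutes gives 150,000 frames, and 250 MB across 380 games gives 95 GB, matching the figures above.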
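
To make the idea of pitch control (player influence based on position and velocity) more concrete, here is a deliberately simplified sketch. It is not the model presented in the talk: it assumes every player reacts for a fixed time while continuing at their current velocity, then runs at a constant top speed, and it turns the difference in arrival times into a probability with a logistic function. The constants and function names are illustrative assumptions.

```python
import math

# ASSUMPTIONS for illustration only; none of these values come from the talk.
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # pitch size in metres
MAX_SPEED = 5.0        # constant running speed in m/s
REACTION_TIME = 0.7    # seconds spent continuing at the current velocity
SHARPNESS = 1.0        # steepness of the logistic function


def arrival_time(player, target):
    """Estimate the time for a player (x, y, vx, vy) to reach target (x, y)."""
    x, y, vx, vy = player
    # Where the player ends up after drifting at current velocity while reacting.
    rx = x + vx * REACTION_TIME
    ry = y + vy * REACTION_TIME
    distance = math.hypot(target[0] - rx, target[1] - ry)
    return REACTION_TIME + distance / MAX_SPEED


def home_control(home, away, target):
    """Probability-like score in (0, 1) that the home team controls target."""
    t_home = min(arrival_time(p, target) for p in home)
    t_away = min(arrival_time(p, target) for p in away)
    # Equal arrival times give 0.5; the earlier the home team arrives,
    # the closer the score gets to 1.
    return 1.0 / (1.0 + math.exp(-SHARPNESS * (t_away - t_home)))


def control_grid(home, away, nx=21, ny=14):
    """Evaluate home_control on an nx-by-ny grid covering the pitch."""
    grid = []
    for j in range(ny):
        ty = PITCH_WIDTH * j / (ny - 1)
        row = [
            home_control(home, away, (PITCH_LENGTH * i / (nx - 1), ty))
            for i in range(nx)
        ]
        grid.append(row)
    return grid


if __name__ == "__main__":
    home = [(30.0, 34.0, 3.0, 0.0)]   # one home player moving towards x = 105
    away = [(75.0, 34.0, 0.0, 0.0)]   # one stationary away player
    print(home_control(home, away, (52.5, 34.0)))
```

Evaluating one frame like this is cheap, but repeating it for every grid cell across the roughly 150k frames of a game is the kind of workload that motivates the horizontally scaled processing described above.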
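
The combination of message queues and dead letter queues can be illustrated with an in-memory stand-in. This is a minimal sketch under stated assumptions, not the system from the talk: a real deployment would use a managed queue service, and the retry limit, the `drain` and `process_game` functions, and the game identifiers are all hypothetical.

```python
from collections import deque

MAX_ATTEMPTS = 3  # ASSUMPTION: failures allowed before a message is set aside


def drain(queue, dead_letters, handler, max_attempts=MAX_ATTEMPTS):
    """Process (game_id, attempts) messages until the main queue is empty.

    A failed message is re-queued for another attempt; once it has failed
    max_attempts times it moves to dead_letters so it stops blocking
    the rest of the work and can be inspected later.
    """
    processed = []
    while queue:
        game_id, attempts = queue.popleft()
        try:
            handler(game_id)
            processed.append(game_id)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(game_id)
            else:
                queue.append((game_id, attempts))
    return processed


def process_game(game_id):
    """Hypothetical worker: fails for ids that simulate corrupt tracking files."""
    if game_id.startswith("corrupt"):
        raise ValueError(f"cannot parse tracking data for {game_id}")


if __name__ == "__main__":
    work = deque([("game-001", 0), ("corrupt-002", 0), ("game-003", 0)])
    dead = deque()
    done = drain(work, dead, process_game)
    print("processed:", done)
    print("dead letters:", list(dead))
```

Because a message only leaves the queue once it has been handled or dead-lettered, a worker that disappears mid-job (for example, a reclaimed spot instance) does not lose work, which is the property that makes cheap, interruptible instances practical.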