František Kaláb - How Data Improves Your Life in Prague [PyData Prague #21]

Learn how Golemio, Prague's data platform, aggregates city data on transport, waste, and climate to enable smarter decisions, improve services, and increase transparency for citizens.

Key takeaways
  • Prague Data Platform (Golemio) serves as a central data hub aggregating data from various city systems, including public transport, parking, waste management, and climate measurements

  • The platform aims to make data-driven city decisions more transparent through:

    • Real-time data integration
    • Open APIs for developers
    • Public data portal (data.praha.eu)
    • Open source code and tools
  • Key data sources include:

    • Real-time public transport positions and delays
    • 7,000 smart waste bins with ultrasonic sensors
    • Parking spot occupancy
    • Bicycle and pedestrian traffic counts
    • Microclimate measurements
  • Technical stack consists of:

    • PostgreSQL as the main database
    • DuckDB for efficient data processing
    • dbt for data transformations
    • Parquet files for historical data storage
    • Azure Blob Storage as the data lake
    • Node.js and RabbitMQ for real-time integration
  • Platform serves multiple stakeholder groups:

    • City decision makers
    • Public transport operators
    • Third-party developers
    • Citizens and general public
  • The initiative has helped break down data silos between city institutions, encouraging data sharing and collaboration

  • Data is used for both operational purposes (real-time monitoring) and strategic planning (schedule optimization, infrastructure changes)

  • The platform keeps costs down by relying on open source technologies and public cloud storage
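
The split between operational and strategic use of the same data can be illustrated with a small sketch. This is not Golemio's code or API: the `DelayRecord` shape, the function names, the 180-second threshold, and the sample values are all hypothetical, chosen only to show how one stream of delay records could feed both a real-time view and a planning view.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DelayRecord:
    """Hypothetical record shape; the real platform's schema may differ."""
    line: str     # line identifier
    hour: int     # hour of day, 0-23
    delay_s: int  # observed delay in seconds


def late_now(records, threshold_s=180):
    """Operational view: lines with at least one delay above the threshold."""
    return sorted({r.line for r in records if r.delay_s > threshold_s})


def mean_delay_by_line_hour(records):
    """Strategic view: average delay per (line, hour) over a history of records."""
    buckets = defaultdict(list)
    for r in records:
        buckets[(r.line, r.hour)].append(r.delay_s)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up sample data for illustration only.
records = [
    DelayRecord("A", 8, 240),
    DelayRecord("A", 8, 120),
    DelayRecord("B", 8, 30),
    DelayRecord("B", 17, 200),
]

print(late_now(records))
print(mean_delay_by_line_hour(records))
```

In a stack like the one described, the first function would correspond to a check on live data, while the second resembles an aggregation that could run over historical Parquet files to inform schedule changes.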