A Short Summary of the Last Decades of Data Management • Hannes Mühleisen • GOTO 2024

Hannes Mühleisen

Explore the evolution of data management, from SQL's resilience to NoSQL's transformation, OLTP vs OLAP architectures, and emerging trends in database technology and AI integration.

Key takeaways
  • Relational databases and SQL have proven remarkably resilient and continue to dominate data management after 50+ years

  • Key-value stores, document databases, graph databases and other “NoSQL” alternatives are increasingly being absorbed back into relational systems

  • The fundamental split in database architectures is between transactional (OLTP) and analytical (OLAP) workloads, which have different optimization requirements

  • Distributed “big data” systems often introduce more complexity and cost than running on a single powerful modern node; scaling up can be better than scaling out

  • Tables as a data organization concept predate written text by roughly 1,000 years and remain a fundamental way to structure information

  • MongoDB, Cassandra and other NoSQL databases have gradually re-added SQL, schemas and ACID properties as developers struggled without them

  • DuckDB represents a new generation of analytical databases optimized for modern hardware and in-process execution

  • Vector databases and AI embeddings are likely to be absorbed into relational systems rather than remaining separate specialized databases

  • Application developers should generally not have to deal directly with storage, schemas and consistency - that complexity belongs in the database

  • SQL and relational databases continue to evolve and add capabilities while maintaining their core strengths of declarative queries and ACID guarantees
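
The takeaways about declarative queries, ACID guarantees, and keeping consistency logic inside the database can be illustrated with a minimal sketch. It is not taken from the talk: it uses Python's built-in sqlite3 module as a stand-in relational engine, and the accounts table, the account names, and the transfer and demo helpers are hypothetical names chosen for illustration. The application only states the rule (a balance may not go negative) and the database enforces it, rolling back the whole transfer when the rule is violated.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically (hypothetical helper)."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, so a failed debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    # The schema carries the consistency rule, not the application code.
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()

    ok = transfer(conn, "alice", "bob", 30)       # within alice's balance
    failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice

    # Declarative queries: say what is wanted, not how to compute it.
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
    conn.close()
    return ok, failed, balances, total


if __name__ == "__main__":
    print(demo())
```

The second transfer fails as a unit: the credit to bob is undone together with the rejected debit, so the total across accounts is unchanged.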