A Short Summary of the Last Decades of Data Management • Hannes Mühleisen • GOTO 2024

Explore the evolution of data management, from SQL's resilience to NoSQL's transformation, OLTP vs OLAP architectures, and emerging trends in database technology and AI integration.

Key takeaways
  • Relational databases and SQL have proven remarkably resilient and continue to dominate data management after 50+ years

  • Key-value stores, document databases, graph databases and other “NoSQL” alternatives are increasingly being absorbed back into relational systems

  • The fundamental split in database architectures is between transactional (OLTP) and analytical (OLAP) workloads, which have different optimization requirements

  • Distributed “big data” systems often introduce more complexity and cost than running on a single powerful modern node; scaling up can be better than scaling out

  • Tables as a data organization concept predate written text by ~1000 years and remain a fundamental way to structure information

  • MongoDB, Cassandra and other NoSQL databases have gradually re-added SQL-like query languages, schemas and transactional (ACID) guarantees as developers struggled without them

  • DuckDB represents a new generation of analytical databases optimized for modern hardware and in-process execution

  • Vector databases and AI embeddings are likely to be absorbed into relational systems rather than remaining separate specialized databases

  • Application developers should generally not have to deal directly with storage, schemas and consistency; that complexity belongs in the database

  • SQL and relational databases continue to evolve and add capabilities while maintaining their core strengths of declarative queries and ACID guarantees
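
Several of the takeaways above (schemas and ACID living in the database, and the OLTP vs OLAP split) can be illustrated with a small sketch. This is not from the talk: it uses Python's built-in sqlite3 module as a stand-in for any in-process relational engine, and the orders table, its columns and the sample rows are made up for illustration.

```python
import sqlite3

# In-process, in-memory database. sqlite3 is only a stand-in here;
# the table and data below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount INTEGER NOT NULL CHECK (amount > 0))"
)

# OLTP-style work: a few small writes committed as one transaction.
with conn:
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("alice", 30), ("bob", 20), ("alice", 50)],
    )

# Atomicity and schema enforcement: the second insert violates the CHECK
# constraint, so the whole transaction (including the first insert) is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO orders (customer, amount) VALUES ('carol', 10)")
        conn.execute("INSERT INTO orders (customer, amount) VALUES ('carol', -5)")
except sqlite3.IntegrityError:
    pass  # the database, not the application, rejected the bad data

# OLTP-style read: fetch a single row by key.
point_lookup = conn.execute(
    "SELECT customer, amount FROM orders WHERE id = ?", (2,)
).fetchone()

# OLAP-style read: scan and aggregate across the whole table.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(point_lookup)
print(totals)
```

Both reads are declarative: the query states what is wanted and the engine decides how to execute it. Analytical engines such as DuckDB are built for the second kind of query, while transactional systems are tuned for the first.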