"The Evolution of a Planetary-scale Distributed Database" by Kevin Scaldeferri (Strange Loop 2022)

Learn how a planetary-scale distributed database evolved over 9 years, leveraging Kubernetes, S3, Kafka, and cellular architecture to simplify complex query solutions and ensure scalability.

Key takeaways

Switched to Kubernetes deployment 9 years ago due to detail on one slide
Edge data pushed into storage layer, treating S3 as primary storage
Designed a system that reuses successful architectural patterns
Cells are like massively multi-tenant, scale independently, distribute workloads, carry their own failover and failback mechanisms
Stateful systems still hard, but significant progress made
Problem of one sort or another in a particular cell? Allow us to retire a cell and replace it
We want customers to be able to query all of their data, not just a subset of it
Wanted to treat data very differently, but knew how, wanted to use S3, wanted to go back to something like stock
Our Kafka installation is getting bigger and bigger, need to decouple ingest and query
Solution: use Kafka, use S3, build metadata store, build query router, build live query workers
Made a decision to embark on a pair of couple changes, but to simplify some of these solutions
Cellular architecture great approach for trying to cap size of any tier
Realized we could reuse some of these solutions to S3 gallery
Still have data that’s coming in through the same end points that customers have always used
Our system has evolved significantly over time, lessons learned along the way
Want to make sure we don’t put two giant customers into the same cell
Made a decision to move away from decentralizing storage and move towards centralized storage
Building next generation of metadata system to help with data distribution

"The Evolution of a Planetary-scale Distributed Database" by Kevin Scaldeferri (Strange Loop 2022)

More talks