Patterns of Distributed Systems • Unmesh Joshi & James Lewis • GOTO 2024

Distributed systems are tricky: leader elections can stall, and consistency has to be traded against availability. Learn from the experts about Raft, Paxos, and TrueTime, and discover how patterns and hands-on practice help you overcome common issues.

Key takeaways
  • Raft’s leader election can stall, for example when repeated split votes prevent any candidate from winning a majority.
  • Many distributed systems, such as Kafka and Amazon S3, can have similar issues with leader election.
  • The CAP theorem is often misunderstood, but it’s crucial to understand how distributed systems balance consistency, availability, and partition tolerance.
  • Paxos is a well-known consensus algorithm used in many systems, including Berkeley DB and Spanner. etcd, by contrast, is built on Raft.
  • Raft was designed as a more understandable alternative to Paxos, aiming to make consensus more practical to implement.
  • Paxos describes how nodes can reach consensus on a single value, while Raft specifies a complete replicated log, including leader election and how entries are committed.
  • Clocks play a vital role in distributed systems, where a deterministic order of events is crucial. Lamport clocks are logical counters that order events across nodes; they do not synchronize physical clocks.
  • Consistency and availability trade off against each other in distributed systems. A system can prioritize consistency, but at the cost of higher latency or of rejecting requests during a network partition.
  • TrueTime is the clock API used in Spanner. It reports time as an interval with bounded uncertainty, which Google keeps to within about 7 milliseconds.
  • Pseudocode and open-source frameworks can help learn about distributed systems. Code can be inspected and run to understand the inner workings of distributed systems.
  • The patterns approach is essential for learning about distributed systems. Implementing patterns from scratch can help solidify understanding.
  • Understanding distributed systems requires practice and hands-on experience. Workshops and code exercises can be used to teach and learn about distributed systems.
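
In the spirit of implementing patterns from scratch, the Lamport clock mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the class name `LamportClock` and the methods `tick` and `receive` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock: a counter that orders events, not wall-clock time.

    Hypothetical names for illustration; not taken from the talk.
    """
    time: int = 0

    def tick(self) -> int:
        # A local event or a message send advances the counter by one.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # On receipt, move past both the local and the sender's timestamp,
        # so the receive event is ordered after the send event.
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
sent = a.tick()              # node A stamps an outgoing message
b.tick()
b.tick()                     # node B records two local events
received = b.receive(sent)   # node B merges A's timestamp with its own
print(sent, received)
```

The receive timestamp is always greater than the send timestamp, which is the only guarantee the counter gives: it captures "happened before" across nodes without relying on synchronized physical clocks.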