CAPtivating architecture: Navigating Distributed Systems and Microservices by Alexandros Charos


Learn key patterns for building reliable distributed systems, from timeout configuration and transaction approaches to service boundaries and preventing cascading failures.

Key takeaways

Timeouts and retry policies are critical but often misconfigured: studies show that 31% of errors come from missing timeouts and 47% from incorrect timeout values
When implementing distributed transactions, consider two main approaches:
- Choreography: Services react to events independently
- Orchestration: Central service coordinates the workflow
Key factors for choosing between choreography and orchestration:
- Choreography works well with existing event-driven architectures
- Orchestration is better for complex flows and maintaining visibility
- Orchestration can lead to tighter coupling between services
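
To make the contrast concrete, here is a minimal in-process sketch of the same two-step order flow written both ways. The `EventBus` class, the event names, and the payment and shipping steps are hypothetical, invented only for illustration. A real system would use a message broker and separate services.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory stand-in for a message broker (illustrative only)."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)


def choreographed_order(order_id):
    """Choreography: each service reacts to an event and emits the next one."""
    bus = EventBus()
    log = []

    def payment_service(payload):
        log.append(f"payment charged for {payload}")
        bus.publish("payment_completed", payload)

    def shipping_service(payload):
        log.append(f"shipment scheduled for {payload}")

    # No component knows the whole flow; it emerges from the subscriptions.
    bus.subscribe("order_placed", payment_service)
    bus.subscribe("payment_completed", shipping_service)
    bus.publish("order_placed", order_id)
    return log


def charge_payment(order_id):
    return f"payment charged for {order_id}"


def schedule_shipment(order_id):
    return f"shipment scheduled for {order_id}"


def orchestrated_order(order_id):
    """Orchestration: one coordinator owns the sequence and calls each step."""
    log = []
    for step in (charge_payment, schedule_shipment):
        log.append(step(order_id))
    return log
```

Both functions produce the same result, but in the orchestrated version the whole workflow is visible in one place, at the cost of the coordinator knowing about every step.
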
Implement idempotency to handle retries safely:
- Request fingerprinting on server side
- Client request IDs
- Hash functions to detect duplicate requests
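
The idempotency techniques above can be sketched as follows. The sketch assumes an in-memory result cache and a JSON-serializable payload. The `IdempotentHandler` class and its method names are hypothetical, and a production version would need shared storage and expiry of old keys.

```python
import hashlib
import json


class IdempotentHandler:
    """Runs `process` at most once per logical request (illustrative sketch)."""

    def __init__(self, process):
        self._process = process
        self._results = {}  # idempotency key -> cached response

    @staticmethod
    def fingerprint(payload):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, payload, request_id=None):
        # Prefer the client-supplied request ID; otherwise fall back to a
        # server-side fingerprint of the request body.
        if request_id is not None:
            key = f"id:{request_id}"
        else:
            key = f"fp:{self.fingerprint(payload)}"
        if key in self._results:
            return self._results[key]  # duplicate: replay the stored response
        result = self._process(payload)
        self._results[key] = result
        return result
```

A retried request with the same ID, or the same body when no ID is sent, returns the stored response instead of repeating the side effect. Fingerprinting alone cannot distinguish a retry from an intentional identical request, which is one reason client request IDs are often preferred.
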
When calculating proper timeout values:
- Measure 99th percentile response times
- Consider both connection and read timeouts
- Add buffer time for critical services
- Include jitter in retry logic
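
One way to turn these steps into numbers is sketched below. The nearest-rank percentile, the 1.5 buffer factor, and the "full jitter" backoff strategy are assumptions chosen for illustration, not values from the talk.

```python
import math
import random


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def read_timeout(latencies, buffer_factor=1.5):
    """Read timeout = measured p99 latency plus a buffer (factor is assumed)."""
    return percentile(latencies, 99) * buffer_factor


def backoff_with_jitter(attempt, base_s=0.1, cap_s=5.0, rng=random):
    """Full jitter: a random delay between 0 and the capped exponential backoff."""
    ceiling = min(cap_s, base_s * 2 ** attempt)
    return rng.uniform(0, ceiling)
```

The connection timeout is usually set separately and much lower than the read timeout, since establishing a connection should be fast. Some HTTP clients accept both values. The `requests` library, for example, takes `timeout=(connect, read)`.
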
Key considerations for microservice boundaries:
- Code volatility (frequency of changes)
- Fault tolerance requirements
- Security boundaries
- Scalability needs
- Domain contexts
Circuit breaking and rate limiting are essential for preventing cascading failures
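
As a rough illustration of circuit breaking, the sketch below fails fast once a dependency has failed several times in a row, then lets a trial call through after a cool-down. The class name, thresholds, and injectable clock are hypothetical. Production libraries typically add richer half-open handling and metrics.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock
        self._failures = 0       # consecutive failures seen so far
        self._opened_at = None   # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, allow one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if (self._opened_at is not None
                    or self._failures >= self.failure_threshold):
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast keeps callers from piling up threads and connections behind a dependency that is already struggling, which is how a single slow service otherwise drags down its callers.
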
Document distributed workflows carefully to avoid creating unmaintainable “event-driven mud”
Total system availability decreases multiplicatively with each dependent service
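
A short worked example of the multiplicative effect, assuming every dependency is required for a request to succeed and that failures are independent (both are simplifications):

```python
import math


def serial_availability(availabilities):
    """Availability of a request that needs every listed service to be up."""
    return math.prod(availabilities)


# Five required services, each 99.9% available.
combined = serial_availability([0.999] * 5)
print(f"{combined:.4%}")
```

Five services at 99.9% each give a combined availability of roughly 99.5%, so the whole is noticeably less available than any single part.
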
Consider organizational readiness and operational capabilities before implementing distributed architectures
