Devoxx Greece 2024 - Kubernetes Resiliency by Chris Ayers
Learn how to design resilient and observable Kubernetes applications by setting baselines for resource requests, leveraging availability zones, and implementing monitoring, observability, and backup strategies.
Creating a Baseline for Kubernetes
- Set a baseline for resource requests and limits in Kubernetes applications
- Requests should be based on average usage, not minimum
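As a rough illustration of the baseline idea above, the following sketch derives requests from average observed usage. The function name `suggest_resources`, the `headroom` factor, and the choice to derive limits from the observed peak are assumptions made for this example, not something prescribed in the talk.

```python
from statistics import mean


def suggest_resources(cpu_millicores, memory_mib, headroom=1.2):
    """Suggest requests from average usage and limits from observed peak.

    cpu_millicores / memory_mib are usage samples collected over a
    representative period. headroom is a hypothetical safety factor
    applied to the peak when deriving limits.
    """
    if not cpu_millicores or not memory_mib:
        raise ValueError("need at least one usage sample per resource")
    return {
        # Requests follow the average, not the minimum.
        "requests": {
            "cpu": f"{round(mean(cpu_millicores))}m",
            "memory": f"{round(mean(memory_mib))}Mi",
        },
        # Assumption: limits are the observed peak plus headroom.
        "limits": {
            "cpu": f"{round(max(cpu_millicores) * headroom)}m",
            "memory": f"{round(max(memory_mib) * headroom)}Mi",
        },
    }


if __name__ == "__main__":
    cpu_samples = [120, 180, 150, 400, 150]
    memory_samples = [256, 300, 280, 310, 254]
    print(suggest_resources(cpu_samples, memory_samples))
```

The returned dictionary has the same shape as the `resources` field of a container spec, so it can be pasted into a manifest after review.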
Resiliency and Availability
- Availability zones are critical for resiliency in Kubernetes
- Spread workloads across multiple availability zones within a region
- Haikus can be used to monitor availability
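One way to spread pods across zones is a topology spread constraint keyed on the standard `topology.kubernetes.io/zone` node label. The sketch below builds such a Deployment manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. The helper name, the app name, and the image are placeholders chosen for this example.

```python
import json


def zone_spread_deployment(name, image, replicas=3):
    """Build a Deployment manifest whose pods are spread across zones."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "topologySpreadConstraints": [
                        {
                            # Allow at most one pod of difference between zones.
                            "maxSkew": 1,
                            "topologyKey": "topology.kubernetes.io/zone",
                            # Prefer spreading, but still schedule if a zone is full.
                            "whenUnsatisfiable": "ScheduleAnyway",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # "web" and the image reference are hypothetical placeholders.
    print(json.dumps(zone_spread_deployment("web", "example.com/web:1.0"), indent=2))
```

Switching `whenUnsatisfiable` to `DoNotSchedule` makes the spread a hard requirement, at the cost of pods staying pending when a zone has no capacity.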
Monitoring and Observability
- Use metrics like CPU usage, memory usage, and queue lengths to monitor applications
- Leverage tools like OpenTelemetry and distributed tracing for observability
Node and Resource Management
- Use node pools and resource requests to manage compute resources
- Resource limits are crucial for resource management
- Use feature flags to manage rollout of new features and versions
Scaling and Autoscaling
- Use horizontal pod autoscalers to scale applications based on demand
- Leverage pod disruption budgets to keep a minimum number of pods available during voluntary disruptions such as node drains and node scale-downs
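To make the autoscaling behavior concrete, here is a minimal sketch of the replica calculation documented for the horizontal pod autoscaler: the desired count is the current count scaled by the ratio of the observed metric to the target, rounded up. The function name and the default bounds are illustrative assumptions. The 10% tolerance mirrors the controller's documented default, and the real controller applies further rules (stabilization windows, unready pods) that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    print(desired_replicas(4, 90, 60))  # load above target: scales out to 6
    print(desired_replicas(4, 63, 60))  # within tolerance: stays at 4
    print(desired_replicas(4, 30, 60))  # load below target: scales in to 2
```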
Failure Domains and Rollback
- Identify failure domains in Kubernetes applications
- Use probes like liveness, readiness, and startup probes to detect failures
- Roll back deployments when failures occur
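The three probe types can be declared side by side on a container. The sketch below builds such a container spec as a Python dictionary. The endpoint paths `/healthz` and `/ready`, the port, and the timing values are assumptions for illustration and need to match what the application actually exposes.

```python
import json


def container_with_probes(name, image, port=8080):
    """Build a container spec with startup, liveness, and readiness probes."""

    def http_get(path):
        return {"httpGet": {"path": path, "port": port}}

    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Startup probe: gives a slow-starting app up to 30 * 5 s before
        # the other probes begin.
        "startupProbe": {**http_get("/healthz"), "periodSeconds": 5, "failureThreshold": 30},
        # Liveness probe: the container is restarted after repeated failures.
        "livenessProbe": {**http_get("/healthz"), "periodSeconds": 10, "failureThreshold": 3},
        # Readiness probe: the pod is removed from Service endpoints while failing.
        "readinessProbe": {**http_get("/ready"), "periodSeconds": 5, "failureThreshold": 3},
    }


if __name__ == "__main__":
    # "web" and the image reference are hypothetical placeholders.
    print(json.dumps(container_with_probes("web", "example.com/web:1.0"), indent=2))
```

If a new version fails its probes after a rollout, `kubectl rollout undo deployment/<name>` returns the Deployment to its previous revision.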
Backup and Disaster Recovery
- Plan for backup and disaster recovery in Kubernetes applications
- Use chaos-engineering tools like Chaos Mesh to test and validate recovery
Monitoring and Testing
- Monitor applications and nodes in Kubernetes
- Load test applications to ensure they can handle demand
- Validate test results using metrics and logging
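A load test is only useful if its results are checked against explicit targets. The following sketch validates a run using a nearest-rank 95th-percentile latency and an error rate. The function names, the 500 ms latency budget, and the 1% error budget are hypothetical values for this example, and it assumes one latency sample per request sent.

```python
import math


def percentile(values, pct):
    """Return the nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def validate_load_test(latencies_ms, errors, p95_budget_ms=500, max_error_rate=0.01):
    """Compare load test results against latency and error budgets."""
    p95 = percentile(latencies_ms, 95)
    # Assumption: latencies_ms holds one sample per request sent.
    error_rate = errors / len(latencies_ms)
    return {
        "p95_ms": p95,
        "error_rate": error_rate,
        "passed": p95 <= p95_budget_ms and error_rate <= max_error_rate,
    }


if __name__ == "__main__":
    # Synthetic samples: 100 requests with latencies 10, 20, ..., 1000 ms.
    samples = list(range(10, 1010, 10))
    print(validate_load_test(samples, errors=0))
```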