Actionable Observability - Lesley Cordero - NDC London 2024
Learn actionable observability strategies to empower technologists, prioritize issue resolution, and optimize monitoring and alerting for proactive incident management and data-informed decision-making.
- Actionable observability is about empowering technologists to do high-impact work with the right data and skills, shifting the focus from monitoring alone to debugging application issues.
- Observability is the ability to understand the internals of your software systems, providing insights into problem debugging and resolution.
- Incident management is the process of responding to incidents: prioritizing issues, identifying the root cause, and debugging and resolving problems.
- Monitoring and alerting are essential components of observability, focusing on proactive prevention, incident detection, and automated responses to avoid toil.
- Key considerations for monitoring and alerting: prioritization, automation level and scope, and notification strategies.
- Service level indicators (SLIs) measure service behavior such as latency, error rate, request distribution, and throughput; service level objectives (SLOs) set the targets those indicators are measured against.
- Monitoring should be data-informed, focused on application reliability, user experience, and system understanding, leveraging automation to manage complexity and noise.
- Alerting strategies should consider incident detection, notification thresholds, target detection, and alert escalation chains.
- Automation should focus on high-priority tasks, leave low-priority tasks to human technicians, and streamline workflows using monitoring and alerting tools.
- Data analysis should take the observability, monitoring, and alerting contexts into account, to avoid noisy data and provide actionable insights for improvement.
- Product-contextual monitoring is crucial to adapt monitoring and alerting to product-related incidents, prioritizing product-user relationships.
- Organizations should prioritize observability, monitoring, and alerting, invest in automated monitoring, and empower teams to make data-driven decisions.
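
To make the SLI, SLO, and notification-threshold points above more concrete, here is a minimal Python sketch. It is not taken from the talk: the `Request` type, the function names, and the 50% and 0% error-budget thresholds are all hypothetical choices made for illustration. It computes an availability SLI from a batch of requests, compares it with an SLO target, and maps the remaining error budget to a notification level.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Request:
    """One observed request (hypothetical shape, for illustration only)."""
    latency_ms: float
    ok: bool


def availability_sli(requests: Sequence[Request]) -> float:
    """Fraction of requests that succeeded; 1.0 when there is no traffic."""
    if not requests:
        return 1.0
    return sum(r.ok for r in requests) / len(requests)


def error_budget_remaining(sli: float, slo: float) -> float:
    """Share of the error budget still unspent.

    1.0 means untouched, 0.0 means exhausted, negative means overspent.
    """
    budget = 1.0 - slo
    if budget <= 0:
        raise ValueError("SLO must be below 1.0 to leave an error budget")
    return 1.0 - (1.0 - sli) / budget


def notification_level(budget_remaining: float) -> str:
    """Map remaining budget to an action (thresholds are assumptions)."""
    if budget_remaining <= 0.0:
        return "page"      # budget exhausted: wake someone up
    if budget_remaining <= 0.5:
        return "ticket"    # budget burning: handle during working hours
    return "none"          # healthy: no notification, avoid noise


if __name__ == "__main__":
    # 970 successes out of 1000 requests against a 95% availability SLO.
    batch = [Request(latency_ms=120.0, ok=i < 970) for i in range(1000)]
    sli = availability_sli(batch)
    remaining = error_budget_remaining(sli, slo=0.95)
    print(f"SLI={sli:.3f} remaining={remaining:.2f} "
          f"action={notification_level(remaining)}")
```

Tying the notification level to the error budget rather than to individual failures is one way to keep alerts tied to user impact and reduce noise; a real setup would typically also look at how fast the budget is being consumed over time.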