DjangoCon Europe 2023 | The Inevitable Tech Incident: The Lessons We Just Can't Seem to Learn

Understand human psychology and organizational learning to improve incident response, avoid blame, learn from inevitable tech incidents, and adapt to changing environments.

Key takeaways
  • Human psychology shapes how we respond to incidents: we tend to think in terms of individual blame rather than organizational learning.
  • Incidents are inevitable, and it is essential to establish a culture of transparency and learning to avoid blame and improve incident response.
  • It is crucial to have clear definitions and ownership of incidents, and to ensure that the right people have the right tasks assigned to them.
  • Monitoring and alerting tools should be designed with simplicity and clear thresholds to avoid confusing or overwhelming developers.
  • Incidents should be documented and shared openly to facilitate learning and improve incident response.
  • The concept of “level of analysis” is important, as it influences how we look at things and what we learn from incidents.
  • The leader’s attitude towards blame culture and learning is crucial in shaping the organization’s response to incidents.
  • Two things are fundamental: level of analysis and psychology. Level of analysis is about how we look at things; psychology is about what we do with what we see.
  • Incidents are a natural part of the process, and it is essential to learn from them.
  • The key to incident response is understanding, trust, and open communication.
  • Even with clear monitoring and alerting tools, there may be false positives, so it is essential to have a clear plan for dealing with them.
  • When something goes wrong, stop and look at what happened, and think about how to prevent it from happening again.
  • It is crucial to have clear responsibilities assigned to team members, and to ensure that they have the necessary skills and resources to complete their tasks.
  • The sooner you stop and start investigating, the better your chances are of fixing the problem quickly.
  • Don’t be afraid to ask questions and seek help when things go wrong.
  • Two things matter most: what we learn and how we learn it.
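
The takeaways on simple alerting thresholds and false positives can be illustrated with a small sketch. It is not taken from the talk: the class name `ThresholdAlert`, the `error_rate` metric, the 0.05 threshold, and the rule of three consecutive breaches are all hypothetical choices made for illustration. The idea is that a single noisy sample should not page anyone, while a sustained breach should.

```python
from dataclasses import dataclass


@dataclass
class ThresholdAlert:
    """Hypothetical alert rule: fire only after several consecutive breaches."""

    name: str
    threshold: float            # a sample above this value counts as a breach
    required_breaches: int = 3  # consecutive breaches needed before alerting
    _streak: int = 0            # current run of consecutive breaches

    def observe(self, value: float) -> bool:
        """Record one sample and return True when the alert should fire."""
        if value > self.threshold:
            self._streak += 1
        else:
            self._streak = 0  # a healthy sample resets the streak
        return self._streak >= self.required_breaches


# Illustrative samples: two isolated spikes, then a sustained breach.
alert = ThresholdAlert(name="error_rate", threshold=0.05)
samples = [0.01, 0.09, 0.02, 0.07, 0.08, 0.11]
fired = [alert.observe(s) for s in samples]
print(fired)
```

With these made-up samples, the isolated spike at 0.09 does not fire the alert; only the third consecutive breach does. A rule like this trades a short delay for fewer false positives, and it still needs an agreed plan for who responds when it fires.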