The Power of Observability: A Tale of Merging, Scaling & DevSecOps • George Aspirtakis • GOTO 2024

Learn how a company merged two observability tools, achieved a flat hierarchy, and empowered developers to reduce downtime, detect issues, and make data-driven decisions.

Key takeaways
  • Merging observability tools: After Dustin merged two companies, each with its own observability tooling, the teams consolidated onto a single platform.
  • Flat hierarchy: The company has a flat hierarchy, making it easier for developers to understand and contribute to the observability tool.
  • Importance of observability: Observability is crucial for understanding what’s happening in the system, detecting issues, and preventing downtime.
  • New Relic as observability tool: The company uses New Relic as its observability tool, giving teams a single pane of glass for monitoring and understanding the system.
  • Domain separation: The company separates domains into different teams, each responsible for their own code and observability.
  • MVP approach: The company takes an MVP approach to development, deploying code and collecting data in parallel to reduce risk.
  • Proactive monitoring: The company monitors proactively, aiming to detect issues early rather than reacting after they have caused downtime.
  • Empowering developers: The company empowers developers to take ownership of their code and domains, and trusts them to make decisions.
  • Peak periods: The company has a large online presence and experiences peak periods, such as Black Friday, which require scaling and proactive monitoring.
  • Alerting and notification: The company uses alerting and notification systems to inform teams of issues and prevent downtime.
  • Data-driven decision making: The company uses data to inform decision making, rather than relying on intuition or anecdotal evidence.
  • Separation of concerns: The company separates concerns into different domains, making it easier to understand and troubleshoot issues.
  • Harmonization: After the merger, the company harmonized its observability tools into a single platform for monitoring and understanding the system.