Build Trust with Your Users Workshop | Brian Love & Mike Ryan | ng-conf 2023

Build trust with your users by using Polaris AI to set realistic Service Level Objectives (SLOs) that balance innovation velocity and reliability.

Key takeaways
  • SLOs (Service Level Objectives) provide a common language for business and engineering teams to agree on the desired level of service reliability.
  • Aspirational SLOs are based on business needs, while achievable SLOs are based on past performance.
  • Incentivizing reliability can lead to reliability hell, where the cost of improving reliability is too high.
  • The goal of Polaris is to help build apps that can achieve high reliability without sacrificing innovation velocity.
  • Setting SLOs too high creates its own problems: teams end up conserving error budget without actually improving reliability.
  • Polaris AI is a tool that can help teams prioritize reliability and detect issues before they become critical.
  • Setting SLOs is a shared goal between business and engineering teams.
  • A Service Level Indicator (SLI) is a metric that measures one aspect of a service’s reliability.
  • SLIs can be used to set SLOs and to measure the performance of a service.
  • Quantifying happy users is a challenge, as it is difficult to measure happiness directly.
  • Reliability is a shared goal that many different parts of an organization must work towards.
  • The number one goal is reliability, and it is essential to have it in place before worrying about other aspects.
  • Over-engineering and trying to make sure everything is perfect can lead to complexity and loss of velocity.
  • The 99.6% figure for pacemakers is an aspirational target, not achievable in real life.
  • SLOs should be realistic and achievable, rather than aspirational.
  • There will always be some degree of variation, and the goal is to minimize it.
  • An error budget is the number of errors that can be tolerated before action needs to be taken.
  • The error budget can be calculated based on the SLO and the proportion of valid requests.
  • The goal is to strike a balance between shipping new features and keeping things stable.
  • Polaris AI is designed to help developers and operators find the right balance between new features and stability.
  • When setting SLOs, consider the impact on users: they may not care about the details, but they will feel the pain of low reliability.
  • Speaking the same language is crucial in setting SLOs, as it ensures everyone is on the same page.
  • There are many ways to measure reliability, and it’s essential to find the right metrics for each service.
  • The goal is to measure reliability in real time, rather than waiting for issues to arise.
  • SLOs can help teams prioritize reliability and detect issues before they become critical.
  • SLOs should be set based on business needs and past performance.
  • It’s essential to find a balance between shipping new features and keeping things stable.
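
The takeaways on SLIs and error budgets can be made concrete with a little arithmetic. The sketch below assumes a request-based SLI (good requests divided by valid requests) and the common formulation of an error budget as (1 - SLO target) × valid requests. The talk summary does not spell out a formula, so treat this as one plausible reading. All names (RequestCounts, sli, errorBudget, budgetRemaining) are hypothetical and are not part of Polaris AI or any library.

```typescript
// Hypothetical illustration; none of these names come from Polaris AI.
interface RequestCounts {
  valid: number; // requests that count toward the SLI
  good: number;  // valid requests that met the reliability criterion
}

// SLI: proportion of valid requests that were good, in the range 0..1.
function sli(counts: RequestCounts): number {
  if (counts.valid === 0) return 1; // assumption: no traffic counts as reliable
  return counts.good / counts.valid;
}

// Error budget: number of bad requests the SLO target tolerates.
function errorBudget(sloTarget: number, validRequests: number): number {
  return (1 - sloTarget) * validRequests;
}

// Fraction of the error budget still unspent (negative when overspent).
function budgetRemaining(sloTarget: number, counts: RequestCounts): number {
  const budget = errorBudget(sloTarget, counts.valid);
  const bad = counts.valid - counts.good;
  if (budget === 0) return bad === 0 ? 1 : -Infinity; // a 100% target leaves no budget
  return 1 - bad / budget;
}

// Example: a 99.9% SLO over one million valid requests, 600 of which failed.
const lastMonth: RequestCounts = { valid: 1000000, good: 999400 };
console.log("SLI:", sli(lastMonth));
console.log("Budget (requests):", errorBudget(0.999, lastMonth.valid));
console.log("Budget remaining:", budgetRemaining(0.999, lastMonth));
```

In this example the budget is roughly 1,000 failed requests, of which 600 are spent, so about 40% remains. A team in that position could keep shipping features. A team with a negative remainder would shift effort toward stability, which is the balance the takeaways describe.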