"Avoiding the Pitfalls of Autoscaling with Constant Work" by David Grizzanti (Strange Loop 2022)

This talk shows how a constant work model helps avoid the pitfalls of autoscaling, and explores reliability, scaling, and anti-fragility as ways to prevent cascading failures and improve system resilience.

Key takeaways
  • David Grizzanti explains how constant work can help avoid the pitfalls of autoscaling
  • Constant work means a system does the same amount of work regardless of changes in load or demand
  • Constant work helps avoid cascading failures by preventing sudden spikes in work or errors
  • Hardware, software, and human errors are types of faults that can lead to failures
  • Anti-fragile systems are those that can withstand shocks and stress and even become more resilient over time
  • Concepts of scaling, reliability, and constant work are key ideas in the talk
  • Cascading failures can be prevented by limiting the number of requests each instance can receive and scaling up instead of down
  • Avoiding autoscaling based on the assumption of failure can also help prevent cascading failures
  • Prometheus is given as an example of a system that follows a constant work model
  • Anti-fragility can be achieved by building in buffers, redundancy, and the ability to withstand stress and errors
  • Setting minimum and maximum sizes for groups of servers and instances when scaling can help prevent cascading failures
  • Push and pull models for scaling can be used in different systems to achieve better reliability and fault tolerance
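
The constant work idea from the takeaways above can be illustrated with a minimal Python sketch. It is not code from the talk: the class names, the `host-N` keys, and the fixed capacity of 8 are all hypothetical, chosen only to contrast a publisher whose payload grows with the number of changes against one that always sends the same amount of data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Entry = Tuple[Optional[str], Optional[str]]


@dataclass
class DeltaPublisher:
    """Variable work: each cycle sends only the keys that changed."""
    pending: Dict[str, str] = field(default_factory=dict)

    def update(self, key: str, value: str) -> None:
        self.pending[key] = value

    def publish(self) -> List[Entry]:
        payload = sorted(self.pending.items())
        self.pending.clear()
        return payload  # size grows with the number of changes


@dataclass
class ConstantWorkPublisher:
    """Constant work: each cycle sends the whole table, padded to a fixed size."""
    capacity: int = 8  # hypothetical fixed table size, an assumption of this sketch
    state: Dict[str, str] = field(default_factory=dict)

    def update(self, key: str, value: str) -> None:
        if key not in self.state and len(self.state) >= self.capacity:
            raise ValueError("table is full; capacity is fixed up front")
        self.state[key] = value

    def publish(self) -> List[Entry]:
        rows: List[Entry] = sorted(self.state.items())
        padding: List[Entry] = [(None, None)] * (self.capacity - len(rows))
        return rows + padding  # always exactly `capacity` entries


def payload_sizes(changes_per_cycle: List[int]) -> Tuple[List[int], List[int]]:
    """Return per-cycle payload sizes for the delta and constant-work publishers."""
    delta, constant = DeltaPublisher(), ConstantWorkPublisher()
    delta_sizes, constant_sizes = [], []
    for n in changes_per_cycle:
        for i in range(n):
            delta.update(f"host-{i}", "unhealthy")
            constant.update(f"host-{i}", "unhealthy")
        delta_sizes.append(len(delta.publish()))
        constant_sizes.append(len(constant.publish()))
    return delta_sizes, constant_sizes
```

Calling `payload_sizes([0, 1, 8])` returns `([0, 1, 8], [8, 8, 8])`: the delta publisher's work spikes when many hosts change at once, which is typically when the system is already under stress, while the constant-work publisher does the same amount of work in a quiet cycle as in a bad one. The trade-off is that capacity has to be chosen up front and quiet cycles do more work than strictly necessary.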