Observability and the Paradox of Alerts
SLOs and observability are the key to a safe, sane on-call rotation that is not severely life-impacting. This session will share how to get there.
What you'll learn
How many different paging alerts wake up your team, and from how many different systems? Many teams already have more paging alerts than they can usefully manage, and most are hurtling toward an unsustainable future. It is a paradox of scale: the bigger and more complicated your systems get, the fewer paging alerts you should have. The good news is that most teams who move from a monitoring model to observability are able to delete about 90% of their paging alerts while increasing reliability from the customer’s perspective. SLOs and observability are the key to a safe, sane on-call rotation that is not severely life-impacting: a rotation your senior engineers will be proud to join, not one staffed by everyone who isn't yet influential enough to get out of it. In this session, Charity Majors will tell you how to get there.