Culturing Resiliency with Data: A Taxonomy of Outages

This talk provides an overview of how outages at Uber over the past few years were categorized by root cause type.

by Gremlin

What you'll learn

This talk provides an overview of how outages at Uber over the past few years were categorized by root cause type. We'll start with background information, including definitions, the incident management framework, and existing preventive techniques (also known as best practices). Next, we'll cover the details and rationale behind the individual categories and sub-categories, along with their relative distribution. Then we'll take a deep dive into two of the biggest categories, deployment and capacity, with a focus on time-series-based data mining techniques that assist in detecting and simulating some of the common root causes. Finally, we'll discuss how the lessons learned are propagated as policy and process changes based on these insights.


About the author

Gremlin

Gremlin's enterprise Chaos Engineering platform makes it easy to build more reliable applications in order to prevent outages, innovate faster, and earn customer trust.
