Implementing Site Reliability Engineering (SRE) Reliability Best Practices
Site Reliability Engineering is the implementation of efficient DevOps. This course will teach you the theory and practice of SRE in the real world. It also explains the incident response and change management processes in detail.
What you'll learn
In this course, Implementing Site Reliability Engineering (SRE) Reliability Best Practices, you’ll learn to implement Site Reliability Engineering best practices. First, you’ll explore managing incident response, which is a vital part of service management. Next, you’ll discover the steps to set up an efficient change management process. Finally, you’ll learn how to identify the best solutions for common technical challenges in areas such as DNS, load balancing, health checks, and distributed consensus. When you’re finished with this course, you’ll have the Site Reliability Engineering skills and knowledge needed to effectively manage your application or service.
Table of contents
- Module Overview | 1m 36s
- SRE Overview | 2m 38s
- Designing an Effective On-Call System | 5m 17s
- Understanding Managed vs. Unmanaged Incidents | 3m 34s
- Building and Implementing an Effective Postmortem Process | 4m 44s
- Learning the Tools and Templates for Postmortems | 2m 9s
- Demo | 6m 59s
- Summary | 1m 44s
About the author
Passionate about IT Ops, Karun has 20+ years of hands-on experience with Linux, cloud technologies, monitoring, and log aggregation. He enjoys creating learning materials that are engaging and provide immediate practical value.