26 DevOps KPIs and metrics: Guide to DORA progress
Learn DevOps KPIs such as the DORA metrics (deployment frequency, lead time for changes, change failure rate, and mean time to recovery) for team success.
Mar 14, 2023 • 11 Minute Read
DevOps KPIs are the industry standard for evaluating the reliability and quality of software delivery within your organization, but they aren't the be-all and end-all. Tracking DevOps metrics helps you identify the bottlenecks that slow down your team's delivery or cause failures in deployed code, but you also need to know what steps to take next.
Metrics can provide insights that help you continuously improve and deliver better software and more value to your customers. We'll cover the four key DevOps KPIs (commonly known as DORA DevOps metrics) plus several other metrics to improve your team's performance. Plus, we'll show you how you can use Flow to turn metrics into actionable insights.
What are DevOps metrics?
Google’s DevOps Research and Assessment (DORA) team spent six years conducting surveys to study engineering teams and their DevOps processes. The group began publishing its findings in 2014 with the first State of DevOps report, and has continued to release yearly updates.
In the first report, the DORA team outlined four key metrics to track software development team performance:
Deployment frequency
Lead time for changes
Time to restore service
Change failure rate
These DORA metrics are widely adopted today, but at the time of their release, they were seen as a revolutionary set of new industry KPIs for building high-performing teams. Today, teams use them as a foundation for understanding their efficiency.
Why are DevOps KPIs and metrics important?
DevOps KPIs can be used as a starting point to track performance by measuring how your team performs compared to industry standards. By observing these metrics, you’ll begin to form a clearer picture of how your team develops over time and in what areas they may need improvement.
DORA metrics allow you to gain insight into the two key predictors of successful engineering teams: throughput and stability. Throughput is the speed at which software is delivered to the end user, and stability is the reliability of that software to perform as expected without failure.
Throughput is measured by deployment frequency and lead time for changes. Together, these two KPIs give you firm data on how quickly changes reach production, so you can make data-driven decisions about your delivery process.
Stability is measured by time to restore service and change failure rate. These KPIs will help provide a better understanding of your system’s stability and how it impacts your users and organization.
The benefits of DevOps include helping your teams focus on the customer experience, simplifying the goals of each release to help your team move more efficiently, and encouraging responsibility within your team.
Exploring DORA DevOps metrics
DevOps metrics are designed to work holistically. Focusing on just one could impact the performance of another. As a whole, they provide a clear picture of team health and how well internal DevOps processes are working. In turn, leaders can use these metrics as a starting point to advocate for their team and improve customer experience.
Here's a breakdown of the four primary DevOps metrics, along with examples of how they affect your team’s performance:
1. Deployment frequency
Deployment frequency is a measure of how often code changes are released to production. In general, smaller or more frequent deployments pose less risk and put you in a state of continuous delivery.
Elite teams are able to perform on-demand deployments because software is in a constantly releasable state—and, ideally, deployed daily. Low-performing teams tend to produce large deployments over the span of months, which can impact velocity and increase the risk and impact of deployment failure.
How to improve: To increase your deployment frequency, shrink your deployment size. Rather than a large number of features and changes, your releases should be a single feature or change, and each update should be as small as possible.
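As a rough illustration, deployment frequency can be computed from a list of production deployment dates. The sketch below is a minimal example under stated assumptions: the function name `deployments_per_day` and the sample dates are hypothetical, and your deployment tooling may expose this data differently.

```python
from datetime import date


def deployments_per_day(deploy_dates: list[date], start: date, end: date) -> float:
    """Average number of production deployments per calendar day in [start, end]."""
    days = (end - start).days + 1  # inclusive window
    if days <= 0:
        raise ValueError("end must not be before start")
    # Only count deployments that fall inside the reporting window.
    in_window = [d for d in deploy_dates if start <= d <= end]
    return len(in_window) / days


# Hypothetical sample data: five deployments in a ten-day window.
sample_deploys = [
    date(2023, 3, 1),
    date(2023, 3, 2),
    date(2023, 3, 2),
    date(2023, 3, 6),
    date(2023, 3, 9),
]
frequency = deployments_per_day(sample_deploys, date(2023, 3, 1), date(2023, 3, 10))
print(f"{frequency:.2f} deployments per day")
```

Watching this number over several windows, rather than as a single snapshot, makes it easier to see whether smaller releases are actually increasing your cadence.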
2. Lead time for changes
Lead time for changes is the time it takes for a developer’s committed code to reach production. This metric serves as an early KPI of process issues and helps you pinpoint bottlenecks that are slowing down your software delivery.
An elite team takes less than an hour from when code is checked in to when it’s deployed. A low-performing team can take more than six months to make and deploy changes.
How to improve: To reduce your lead time for changes, use software to help you identify whether commits are stuck in waiting states, like waiting for QA testing. Once you discover how your testing process is delaying deployments, you can automate parts of your testing or hire additional QA testers to address these bottlenecks.
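Lead time for changes can be sketched as the time between a commit and the deployment that ships it. The example below assumes you can pair each commit timestamp with its deployment timestamp. The function name and sample data are hypothetical, and the median is used here (rather than the mean) as one reasonable choice so that a single slow change doesn't dominate the result.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes: list[tuple[datetime, datetime]]) -> timedelta:
    """Median of (deployed_at - committed_at) across all changes."""
    if not changes:
        raise ValueError("at least one change is required")
    seconds = [
        (deployed - committed).total_seconds() for committed, deployed in changes
    ]
    return timedelta(seconds=median(seconds))


# Hypothetical (committed_at, deployed_at) pairs.
sample_changes = [
    (datetime(2023, 3, 1, 9, 0), datetime(2023, 3, 1, 9, 30)),   # 30 minutes
    (datetime(2023, 3, 1, 10, 0), datetime(2023, 3, 1, 12, 0)),  # 2 hours
    (datetime(2023, 3, 2, 8, 0), datetime(2023, 3, 3, 10, 0)),   # 26 hours
]
lead_time = median_lead_time(sample_changes)
print(f"Median lead time for changes: {lead_time}")
```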
3. Mean time to restore service (MTTR)
Time to restore service, or mean time to recovery (MTTR), is a measure of how long it takes your team to recover from a failure in production. This KPI is huge from an operational perspective, as the quicker you can respond, the better the customer experience will be.
To measure the time to restore service, you need to know the timestamp of when the incident occurred and when it was resolved. You also need to know what deployment resolved the incident.
An elite team typically takes less than an hour to get services up and running again. A low-performing team tends to take more than six months to restore services.
How to improve: Time to restore service works hand in hand with deployment frequency. By reducing the size of your deployments, you can reduce the impact radius if something does go wrong.
A smaller deployment that fails is going to be easier to troubleshoot and restore compared to a larger deployment. If your time to restore service is long, this may also indicate you need better rollback systems to help you reverse a flawed deployment and quickly get back on your feet.
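Given the two timestamps described above (when an incident occurred and when it was resolved), time to restore service can be averaged across incidents. This is a minimal sketch with a hypothetical function name and made-up incident data, not a definitive implementation.

```python
from datetime import datetime, timedelta


def mean_time_to_restore(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved_at - occurred_at) across incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (resolved - occurred for occurred, resolved in incidents), timedelta()
    )
    return total / len(incidents)


# Hypothetical (occurred_at, resolved_at) pairs.
sample_incidents = [
    (datetime(2023, 3, 1, 10, 0), datetime(2023, 3, 1, 10, 20)),  # 20 minutes
    (datetime(2023, 3, 4, 14, 0), datetime(2023, 3, 4, 14, 40)),  # 40 minutes
    (datetime(2023, 3, 9, 8, 0), datetime(2023, 3, 9, 10, 0)),    # 2 hours
]
mttr = mean_time_to_restore(sample_incidents)
print(f"Mean time to restore service: {mttr}")
```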
4. Change failure rate
Change failure rate is the percentage of deployments that result in a failure. This metric measures the stability and quality of the code your team is shipping. It’s calculated as a percentage of deployments that result in a severe service failure and require immediate remediation, such as a rollback or patch.
Oftentimes teams struggle initially with defining what a failure is. This definition may vary from company to company and even team to team. Failure in DevOps can be about code and it can also be about outcomes. Ideally, it should measure when a deployment results in degraded performance, but this is something that should be defined as a team.
How to improve: Deployments often fail because of deployment error, poor testing, or poor code quality. Human error is a leading cause of deployment errors, so implementing deployment automation can remove much of the human element from the release itself. Separately, make sure your code review process and requirements result in thorough, meaningful, and helpful code reviews.
If poor testing is the cause, consider automating testing. To improve code quality, revisit your code review process to ensure junior developers are learning from senior team members.
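Once your team has agreed on what counts as a failure, change failure rate is a simple percentage. The sketch below assumes you already have counts of total and failed deployments for a period. The function name and numbers are hypothetical.

```python
def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Percentage of deployments that resulted in a failure needing remediation."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return 100 * failed_deployments / total_deployments


# Hypothetical month: 40 deployments, 3 of which needed a rollback or patch.
rate = change_failure_rate(total_deployments=40, failed_deployments=3)
print(f"Change failure rate: {rate:.1f}%")
```

How a deployment gets marked as failed (a rollback, a hotfix, an incident ticket) is the part your team has to define. The arithmetic stays the same.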
Additional DevOps metrics to track
While DORA metrics are powerful KPIs for engineering teams, we’ve highlighted a few more metrics that can help you build upon your DevOps foundations. Each metric provides an opportunity for your team to analyze its performance and make changes for improved growth.
Build and test metrics
Build metrics are focused on establishing more efficient software development and test processes. Focusing on these KPIs can help your team deliver products to market with greater speed and quality, creating a more positive customer and developer experience.
- Build failure rate: The percentage of builds that fail within a given timeframe, whether from compilation errors, failing tests, or other pipeline issues
- Code coverage: The amount of your codebase that has been executed by your test suite
- Defect escape rate: A measure of the number of bugs that escaped detection and are released into production
- Flaky test rate: How often a test fails intermittently, creating both passing and failing results for the same code
- Application usage and traffic: A measure of how often users interact with and use a software application or service
- Error rates: A percentage noting how often errors or failures occur within a system or software process
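As one example from this list, flaky tests can be spotted by looking for tests that both passed and failed against the same code. The sketch below assumes your CI system can export test runs as (test name, commit, passed) records. That record shape, the function name, and the choice to report the rate as the share of tests flagged as flaky are all assumptions for illustration.

```python
from collections import defaultdict


def find_flaky_tests(runs: list[tuple[str, str, bool]]) -> set[str]:
    """Return tests that produced both a pass and a failure for the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for test_name, commit, passed in runs:
        outcomes[(test_name, commit)].add(passed)
    # Two distinct outcomes for one (test, commit) pair means the result was intermittent.
    return {name for (name, _commit), seen in outcomes.items() if len(seen) == 2}


# Hypothetical CI export: (test name, commit, passed).
sample_runs = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", False),   # same commit, different result: flaky
    ("test_cart", "abc123", True),
    ("test_cart", "abc123", True),
    ("test_search", "abc123", False),
    ("test_search", "def456", True),   # different commit, so not counted as flaky
]
flaky = find_flaky_tests(sample_runs)
all_tests = {name for name, _commit, _passed in sample_runs}
flaky_rate = 100 * len(flaky) / len(all_tests)
print(f"Flaky tests: {sorted(flaky)} ({flaky_rate:.1f}% of tests)")
```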
Customer experience metrics
By analyzing customer experience metrics, you can get a better idea of customer satisfaction with your product or service offering. Happier customers are more likely to return or recommend your company’s offerings to others.
- Customer churn rate: The rate at which customers choose to stop doing business with your company
- Customer satisfaction: How happy customers are with your company’s products, services, and experience
- Customer ticket volume: The number of support tickets or inquiries over a chosen period of time
- Net Promoter Score: A measure of customers’ overall loyalty and willingness to recommend your company
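Two of these metrics reduce to short formulas. Net Promoter Score is conventionally the percentage of promoters (scores of 9 or 10) minus the percentage of detractors (scores of 0 through 6) on a 0 to 10 survey, and churn rate is the share of customers lost over a period. The function names and sample values below are hypothetical.

```python
def net_promoter_score(scores: list[int]) -> float:
    """NPS: percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not scores:
        raise ValueError("at least one survey score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Percentage of customers lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return 100 * customers_lost / customers_at_start


# Hypothetical survey responses and customer counts.
survey_scores = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
nps = net_promoter_score(survey_scores)
churn = churn_rate(customers_at_start=200, customers_lost=10)
print(f"NPS: {nps:.0f}, churn: {churn:.1f}%")
```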
Deployment metrics
Deployment metrics show how well the deployment stage of your development process is working. By better understanding these KPIs, your team can create a more efficient deployment process with successful pushes, resulting in less disruption and customer dissatisfaction.
- Change volume: The amount of code that's changed within a software deployment
- Cycle time: A measure of how long it takes your team to deliver once they start working on a task
- Deployment lead time: A measure of how long it takes to deploy a release into a testing, development, or production environment
- Deployment success rate: The percentage of deployments that are successful, not causing failures or disruptions
Operations metrics
These metrics highlight the performance and overall reliability of your software in production. Focusing on operations metrics and KPIs can create a better user experience and reduce the cost of development by diminishing the need to dedicate resources to fix issues.
- Incident resolution time: The average time it takes to resolve a reported issue
- Mean time between failures: A measure of the average time between repairable failures of a system or product
- Mean time to detection: The amount of time it takes on average to detect an issue or disruption
- System uptime: The average amount of time a system is able to run before it breaks; also known as mean time to failure (MTTF)
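If you record when each incident started, was detected, and was resolved, two of these metrics fall out of the same data. The sketch below is one possible reading: the `Incident` record is hypothetical, and mean time between failures is taken here as the average gap from one incident's resolution to the next incident's start, which is an assumption because teams define this interval differently.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Incident(NamedTuple):
    started: datetime
    detected: datetime
    resolved: datetime


def mean_time_to_detection(incidents: list[Incident]) -> timedelta:
    """Average delay between an incident starting and being detected."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((i.detected - i.started for i in incidents), timedelta())
    return total / len(incidents)


def mean_time_between_failures(incidents: list[Incident]) -> timedelta:
    """Average gap from one incident's resolution to the next incident's start."""
    ordered = sorted(incidents, key=lambda i: i.started)
    gaps = [nxt.started - prev.resolved for prev, nxt in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("at least two incidents are required")
    return sum(gaps, timedelta()) / len(gaps)


# Hypothetical incident log.
incident_log = [
    Incident(datetime(2023, 3, 1, 10, 0), datetime(2023, 3, 1, 10, 5), datetime(2023, 3, 1, 11, 0)),
    Incident(datetime(2023, 3, 3, 11, 0), datetime(2023, 3, 3, 11, 15), datetime(2023, 3, 3, 12, 0)),
    Incident(datetime(2023, 3, 6, 12, 0), datetime(2023, 3, 6, 12, 10), datetime(2023, 3, 6, 12, 30)),
]
mttd = mean_time_to_detection(incident_log)
mtbf = mean_time_between_failures(incident_log)
print(f"MTTD: {mttd}, MTBF: {mtbf}")
```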
Security metrics
Monitoring security metrics is a vital requirement for any development team, as they help assess the security posture of your software and overall infrastructure. Poor security can significantly impact customer satisfaction and trust in your team’s products.
- Patch compliance rate: The percentage of a system that has been updated with the latest patches
- Security incident volume: The number of security incidents your team faces, such as hacking, malware, and breaches
- Vulnerability detection rate: Percentage of vulnerabilities identified via vulnerability scanning and penetration testing
- Vulnerability remediation time: The average time it takes to fix a vulnerability once identified
How to prioritize the right metrics for your team
The goal of tracking any of the metrics listed above is to maximize efficiency and improve delivery. While there’s no clear guideline for which particular KPIs you should track, the four DORA metrics are a great place to start as you begin scaling DevOps.
These metrics provide insight into the stability and throughput of your DevOps practices and can serve as a compass of sorts, pointing you in the direction of what can be improved. But remember, metrics are only a starting point, an indicator; it is up to you to turn them into actions that can transform your team.
It’s important to define a clear goal for why you’re tracking certain performance indicators. Rather than tracking every metric, consider the ones most likely to increase your business value. There’s a fine line between tracking metrics to make actionable improvements and tracking metrics for vanity’s sake.
DevOps KPIs: Challenges and solutions
DevOps metrics are powerful tools that can help your team move toward success and put your KPIs on a positive upward slope. You'll likely face some challenges in the process, but you can handle them with a bit of focus and proper knowledge. Here are the most common DevOps implementation challenges and how your team can leave them in the dust:
- Cultural resistance: For DevOps to succeed, your entire team needs to be on board. Communicate through education and training while involving them in the process.
- Data quality and consistency: Poor data can lead to inaccurate metrics, so take the time to standardize your data collection and implement routine data quality checks.
- Finding the right metrics: Tracking metrics without knowing why you chose them is like spinning your wheels in the mud: you won’t be going anywhere. Check that your metrics are linked directly to your goals and start small, growing as you progress.
- Lack of historical data: A lack of initial data can make it challenging to establish baselines. Focus on collecting data immediately and using industry benchmarks at the beginning of your journey.
- Tooling complexity: Using multiple DevOps monitoring tools can be complex, so choose compatible tools that integrate seamlessly within your existing process and offer automation features.
FAQ
You may have questions about DevOps KPIs, such as DORA metrics. Here are answers to the most frequently asked questions we receive.
What is the difference between a KPI and a metric?
Metrics and KPIs are similar in that they are used to measure performance, but they aren't quite the same thing.
Metrics use quantitative measurements (think numbers) and typically focus on a specific activity or process. On the other hand, KPIs are used to measure progress toward a particular goal; they are generally considered higher-level and interpret available metrics to help inform team decisions.
How do you measure the success of DevOps implementation?
Measure the success of DevOps implementation by analyzing DevOps metrics and KPIs. We recommend starting with DORA metrics; these four metrics—deployment frequency, lead time for changes, change failure rate, and mean time to recovery—can help you better understand your team's overall efficiency under a DevOps software development model.
Remember to set clear goals rather than mindlessly monitoring endless metrics; otherwise, you'll never know when your team has reached a milestone.
What are the six pillars of DevOps?
DevOps pillars provide a base framework for your software development lifecycle. They include:
- Collective responsibility: Everyone's shared responsibility in the software development process
- Collaboration and integration: A focus on fostering strong collaboration and communication between your teams
- Pragmatic implementation: An emphasis on setting a practical, realistic approach to DevOps
- Bridging compliance and development: Ensuring security compliance requirements are met while also focusing on the speed of the DevOps process
- Automation: Using automation to streamline delivery processes and reduce human error
- Measure, monitor, report, and action: The need to measure and monitor metrics to continually track progress and identify areas for improvement
Evolve your DevOps team with Pluralsight Flow
Pluralsight Flow allows you to track DORA metrics alongside Flow's own actionable metrics to help you reduce developer friction and accelerate delivery. Flow Retrospective reports include all four key DORA metrics to help you uncover patterns about your deployments and incidents and facilitate data-driven decision-making.
Learn more about how Pluralsight Flow can help you optimize the way you gather and use DevOps metrics by scheduling a demo.