Every engineering leadership blog, conference talk, and "State of DevOps" report will tell you that DORA metrics matter. Deployment frequency, lead time for changes, mean time to recovery, change failure rate — these four metrics are the gold standard for measuring software delivery performance.
And they're right. DORA metrics matter.
But here's what nobody talks about: most engineering managers don't have a platform team to instrument these metrics for them. Most of us are running teams of 5-12 engineers, shipping features to production, and our "platform team" is one person who also does backend work and reluctantly maintains the CI pipeline.
So when someone says "you should be tracking DORA metrics," the practical question is: how?
I've been tracking these metrics manually for years, across teams that definitely did not have observability platforms or DevOps dashboards. Here's how to actually do it.
What Are DORA Metrics (Quick Refresher)
DORA (DevOps Research and Assessment) identified four key metrics that predict software delivery performance and organizational outcomes:
- Deployment Frequency (DF): How often you deploy to production. Elite teams deploy on demand (multiple times per day). Low performers deploy less than once per month.
- Lead Time for Changes (LT): Time from code commit to code running in production. Elite teams measure this in less than a day. Low performers take more than six months.
- Mean Time to Recovery (MTTR): When something breaks in production, how long does it take to restore service? Elite teams recover in less than an hour. Low performers take more than a week.
- Change Failure Rate (CFR): What percentage of deployments cause a failure in production? Elite teams are under 5%. Low performers are above 46%.
These four metrics together give you a remarkably complete picture of your team's delivery health. High deployment frequency with low change failure rate means you're shipping fast and safely. Long lead times with low deployment frequency usually means your process has bottlenecks — big PRs, slow reviews, manual QA gates.
Why EMs Should Care (Not Just Platform Teams)
Here's the thing that bugs me about how DORA metrics are discussed: they're almost always framed as an organization-level or platform-level concern. "Install this observability tool." "Build a deployment dashboard." "Get your DORA metrics into your CI/CD pipeline."
That's great for companies with dedicated platform teams. For the rest of us, it's useless advice.
But DORA metrics are more valuable at the team level, not less. Here's why:
They make invisible problems visible. Your team's deployment frequency dropped from daily to twice a week? That's worth investigating. Maybe PRs are getting bigger. Maybe the test suite is slow. Maybe someone's blocking on reviews. Without the metric, you might not notice for months.
They give you ammunition for leadership conversations. "I need to invest a sprint in CI/CD improvements" is a hard sell. "Our lead time has increased 3x in two months, and I have a plan to cut it in half" is a much easier conversation.
They connect process to outcomes. If you change your review process or branching strategy, DORA metrics tell you whether it actually helped. Without them, you're guessing.
They help you spot burnout early. A team that's deploying 3x per day with a 30% change failure rate is probably firefighting constantly. That's not sustainable, and the metrics will show it before the resignations do.
How to Track DORA Metrics Manually
You don't need fancy tooling. You need a spreadsheet (or a simple database) and about 15 minutes per week. Here's my approach:
Deployment Frequency
What to track: Count of production deployments per week.
Where to find it: Your CI/CD tool (GitHub Actions, GitLab CI, Jenkins) has deployment logs. Count the number of successful production deployments per week. If you use feature flags, count flag enables that expose new functionality to users.
Manual approach: Every Friday, I check our deployment logs and record the count. Takes two minutes.
What to watch for: Sudden drops in frequency (process bottleneck), or very high frequency with rising change failure rate (shipping too fast without safety nets).
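The Friday count is simple enough to script. Here's a sketch of tallying deployments per ISO week from a list of successful production deploy timestamps (the timestamps are hypothetical — in practice you'd export them from your CI tool's deployment log):

```python
from collections import Counter
from datetime import datetime

# Hypothetical export of successful production deploy timestamps,
# e.g. copied from your CI tool's deployment history.
deploys = [
    "2024-03-04T10:12:00", "2024-03-05T16:40:00", "2024-03-06T09:05:00",
    "2024-03-12T11:30:00", "2024-03-14T15:22:00",
]

def deploys_per_week(timestamps):
    """Count production deployments per ISO week, keyed by (year, week)."""
    weeks = Counter()
    for ts in timestamps:
        year, week, _ = datetime.fromisoformat(ts).isocalendar()
        weeks[(year, week)] += 1
    return dict(sorted(weeks.items()))

print(deploys_per_week(deploys))
# → {(2024, 10): 3, (2024, 11): 2}
```

ISO weeks keep the buckets consistent across month boundaries, which matters once you start comparing trends month over month.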
Lead Time for Changes
What to track: Median time from first commit on a branch to that code running in production.
Where to find it: Nowhere convenient — this is the trickiest one to track manually. My approach: sample 5-10 merged PRs per week. For each one, note the first commit timestamp and the deployment timestamp. Take the median.
Shortcut: If your PRs are typically merged and deployed same-day, just track "time from PR opened to PR merged" as a proxy. It won't capture the deploy step, but if you deploy on merge (which you should), it's close enough.
What to watch for: Lead time creeping up usually means PRs are getting bigger, reviews are taking longer, or there's a bottleneck in your pipeline.
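The weekly sampling step boils down to a median over timestamp differences. A minimal sketch, using made-up (first commit, deployed) pairs for five sampled PRs:

```python
from datetime import datetime
from statistics import median

# Hypothetical weekly sample: (first commit, deployed to production)
# timestamp pairs for 5 merged PRs.
samples = [
    ("2024-03-04T09:00", "2024-03-04T15:30"),
    ("2024-03-04T11:00", "2024-03-06T10:00"),
    ("2024-03-05T14:00", "2024-03-05T17:45"),
    ("2024-03-06T08:30", "2024-03-07T09:00"),
    ("2024-03-07T10:00", "2024-03-07T16:20"),
]

def median_lead_time_hours(pairs):
    """Median hours from first commit to code running in production."""
    hours = [
        (datetime.fromisoformat(done) - datetime.fromisoformat(start)).total_seconds() / 3600
        for start, done in pairs
    ]
    return round(median(hours), 1)

print(median_lead_time_hours(samples))
# → 6.5
```

The median is the right summary here: one PR that sat in review for a week would drag an average way up without telling you anything about typical flow.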
Mean Time to Recovery (MTTR)
What to track: For each production incident, time from "we know it's broken" to "service is restored."
Where to find it: Your incident log. You are keeping an incident log, right? If not, start one. Every time something breaks in production, note: when was it detected, when was service restored, what was the root cause, what was the fix.
Manual approach: After every incident, I add a row to our incident tracker with timestamps. At the end of each month, I calculate the average and median recovery time.
What to watch for: Rising MTTR usually means increasing system complexity without corresponding investment in observability and runbooks.
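The monthly roll-up is the same pattern: timestamp differences, then mean and median. A sketch over a hypothetical incident log:

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical incident log rows: (detected, service restored).
incidents = [
    ("2024-03-02T14:05", "2024-03-02T14:50"),
    ("2024-03-11T09:30", "2024-03-11T12:30"),
    ("2024-03-20T22:10", "2024-03-21T00:40"),
]

def recovery_minutes(log):
    """Minutes from detection to restored service for each incident."""
    return [
        (datetime.fromisoformat(restored) - datetime.fromisoformat(detected)).total_seconds() / 60
        for detected, restored in log
    ]

mins = recovery_minutes(incidents)
print(f"mean: {mean(mins):.0f} min, median: {median(mins):.0f} min")
# → mean: 125 min, median: 150 min
```

Keeping both numbers is worth the extra cell: a single marathon incident can double your mean while your median (the typical recovery) stays flat.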
Change Failure Rate
What to track: Percentage of deployments that cause an incident, rollback, or hotfix.
Where to find it: Cross-reference your deployment count with your incident log. If you deployed 20 times this month and 3 deployments caused incidents, your CFR is 15%.
Manual approach: Tag each incident in your log with "was this caused by a deployment?" Then divide the number of deployments that caused incidents by your total deployment count.
What to watch for: CFR above 15% consistently means you need better testing, smaller deployments, or both.
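One subtlety worth encoding: CFR counts failing deployments, not incidents, so two incidents traced to the same deploy count once. A sketch with hypothetical tags:

```python
# Hypothetical month: 20 production deploys, and each incident tagged
# with the deploy that caused it (None = not deployment-caused).
total_deploys = 20
incident_causes = ["deploy-07", "deploy-07", None, "deploy-14", "deploy-19"]

def change_failure_rate(total, causes):
    """Share of deployments that caused at least one incident."""
    failing_deploys = {c for c in causes if c is not None}  # dedupe by deploy
    return len(failing_deploys) / total

print(f"CFR: {change_failure_rate(total_deploys, incident_causes):.0%}")
# → CFR: 15%
```

That matches the worked example above: 3 failing deploys out of 20 is 15%.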
The 15-Minute Weekly Routine
Here's my actual weekly routine for tracking these metrics:
1. Friday afternoon, 15 minutes:
- Count this week's production deployments (2 min)
- Sample 5 PRs, note lead times (5 min)
- Review any incidents from the week, note timestamps (5 min)
- Update the spreadsheet, check trends (3 min)
2. Monthly (30 minutes):
- Calculate monthly averages
- Compare to previous month
- Write a one-paragraph summary: "What changed and why?"
- Share with the team (transparency builds trust)
That's it. No dashboards. No observability platforms. Just a spreadsheet and a weekly habit.
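If plain files suit you better than a spreadsheet app, the "update the spreadsheet" step can be a tiny CSV appender. The column names here are my own choice, not a standard:

```python
import csv
from pathlib import Path

LOG = Path("dora_weekly.csv")
COLUMNS = ["week", "deploys", "median_lead_time_hours",
           "incidents", "mttr_minutes", "cfr_pct"]

def record_week(row):
    """Append one week's numbers, writing the header on first use."""
    new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if new_file:
            writer.writeheader()
        writer.writerow(row)

# Hypothetical Friday entry.
record_week({
    "week": "2024-W10", "deploys": 3, "median_lead_time_hours": 6.5,
    "incidents": 1, "mttr_minutes": 45, "cfr_pct": 33,
})
```

One row per week, one call per Friday — and the file stays trivially greppable and diffable, which a spreadsheet app isn't.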
What "Good" Looks Like (Be Honest With Yourself)
The DORA research defines four performance tiers: Elite, High, Medium, and Low. Most teams I've worked with fall somewhere between Medium and High. And that's fine.
The goal isn't to hit Elite on day one. The goal is to know where you are, pick one metric to improve, and make progress. If your deployment frequency is twice a month and you get it to weekly, that's a massive improvement — even if "Elite" teams deploy multiple times per day.
Be honest about where you are. The metrics are only useful if they reflect reality.
Common Patterns I've Seen
High DF, High CFR: You're shipping fast but breaking things. Invest in testing and smaller batch sizes. This is usually a team that adopted CI/CD without adopting the supporting practices (feature flags, canary deploys, automated tests).
Low DF, Low CFR: You're shipping safely but slowly. Your process probably has too many gates. Look at review bottlenecks, manual QA stages, and PR size. The goal is to ship just as safely but more frequently — smaller batches are actually safer, not riskier.
Rising Lead Time: Usually means PRs are getting bigger, or there's a review bottleneck. Try setting a soft limit on PR size (< 400 lines of meaningful changes) and ensuring reviews happen within 24 hours.
MTTR Spikes: After a complex incident, MTTR will spike. That's expected. If it stays high, you need better runbooks, better observability, or both.
How emkit Will Automate This
Everything I described above is manual, and it works. But it's also the kind of thing that falls off when you get busy — which is exactly when you need the metrics most.
emkit will connect to your existing tools (GitHub, GitLab, Jira, PagerDuty) and calculate DORA metrics automatically. No platform team required. You'll see trends over time, correlated with team health and 1:1 themes, so you can connect "deployment frequency dropped" with "three team members mentioned being blocked on reviews this month."
That's the power of treating engineering management as a connected system instead of isolated spreadsheets.
But don't wait for emkit to start tracking. Open a spreadsheet today. Spend 15 minutes this Friday. You'll learn something about your team that you didn't know.