
Should you track DORA metrics for mobile apps?

Within the last few years, using the four key DORA metrics has become one of the most popular ways of tracking software delivery performance. As always, trends and popularity don’t come without their downsides — cargo cults and people looking for silver bullets.

We often meet mobile developers whose boss has tasked them with tracking DORA metrics. They’ve looked into it, but tracking DORA metrics for a mobile app doesn’t seem to make sense. If this is you, you’re not wrong.

What quick iteration means in the context of mobile apps

The basic goal of DORA metrics is simple: make it possible to iterate as quickly as possible without breaking things. What does this mean in the context of world-class mobile apps?

Aiming for a steady release cadence. Releasing a new mobile app version to users requires time and effort. App Store and Google Play reviews take time. End users don't install updates right away. Fixing bugs that have already reached users is slow (and costly), which means doing manual QA before your releases makes sense. And so on.

This cost of release means that it's not feasible to release to the App Store or Google Play after every change. In practice, most high-performing mobile app development teams have a one or two-week release cadence.

Shipping a prerelease version for internal/beta users. As releasing a new version to production is costly, high-performing mobile teams use a prerelease version that is built for every change (e.g. pull request) and shipped to internal/beta users automatically. This allows the team to get real user feedback and maximize their chances of finding quality issues before they reach production.

Using feature flags. For high-performing mobile app teams, future features are already in the production app today — just behind a feature flag. It can take a long time for users to install the latest mobile app version. For some features, rolling them out only makes sense once enough people have installed the version that shipped the feature.

Feature flags allow features to be released independently of the app release cycle. They also allow quick rollbacks when a piece of code you shipped has a bug.
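
To make the mechanics concrete, here's a minimal sketch of a version-gated feature flag check. The `FlagProvider` interface, `VersionGatedFeature` shape, and the adoption threshold are illustrative assumptions, not the API of any particular feature flag product:

```kotlin
// Minimal sketch of a version-gated feature flag check.
// FlagProvider and VersionGatedFeature are illustrative assumptions,
// not a real feature flag product's API.

interface FlagProvider {
    // Returns the remotely configured state of a flag.
    fun isEnabled(flagKey: String): Boolean
}

data class VersionGatedFeature(
    val flagKey: String,
    // Oldest app version that contains the feature's code.
    val minVersionCode: Int,
    // Share of users that must already be on minVersionCode or newer
    // before the feature is rolled out (e.g., 0.8 for 80%).
    val minAdoption: Double,
)

fun shouldShowFeature(
    feature: VersionGatedFeature,
    flags: FlagProvider,
    currentVersionCode: Int,
    adoption: Double, // share of users on minVersionCode+, from analytics
): Boolean =
    // The code may have shipped weeks ago, but the feature stays dark
    // until the remote flag is on, this device has the code, and enough
    // of the user base can get the same experience.
    flags.isEnabled(feature.flagKey) &&
        currentVersionCode >= feature.minVersionCode &&
        adoption >= feature.minAdoption
```

Because the flag is evaluated at runtime, turning a broken feature off again is a remote configuration change rather than an emergency app release.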

When we look at these best practices of world-class mobile development teams, we realize they don’t fit in with how the key DORA metrics are defined and measured. A weekly release cadence will prevent you from reaching “elite” deployment frequency and change lead time. Measuring a high-level aggregate like change failure rate and mean time to recovery in a complex environment where only a portion of your users ever install a bad version and where some features are shipped using feature flags is a lot of work — for very little gain.

Instead of trying to adopt measures that were developed as part of a DevOps study, you should focus on practices designed for high-performing mobile app development teams:

  • Invest in your ability to keep a steady release cadence. Track (1) planned releases that you missed because the app wasn't releasable and (2) the number of hotfixes you had to release outside your release cycle because you broke something and noticed it too late. Then (3) invest in automated testing to help with both (1) and (2). (A rough sketch of tracking these signals follows this list.)
  • Make sure you can create a prerelease version as effortlessly as possible. This makes it possible to work in smaller increments (as in smaller PRs if you’re using GitHub), which speeds up internal processes like code review. Having an active internal/beta release channel also ensures you get feedback from new features earlier and helps you catch hard-to-test bugs.
  • Track the adoption of app versions and features. Version adoption is important for deciding when to release a new feature. Feature adoption is crucial for your most important goal: shipping value to your customers and the business.
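
As a rough illustration, both cadence signals from the first bullet can be computed from a simple log of planned release slots and hotfixes. The data shapes below are hypothetical; in practice, this data would come from your release tracker or CI system:

```kotlin
import java.time.LocalDate

// Hypothetical data shapes for tracking release-cadence health.
// In practice these would be populated from your release tracker or CI.

// A planned release slot in your cadence (e.g., every other Tuesday).
data class ReleaseSlot(
    val plannedDate: LocalDate,
    // false when the slot was skipped because the app wasn't releasable
    val shipped: Boolean,
)

// An unplanned release shipped outside the normal cycle to fix a bug.
data class Hotfix(val version: String, val shippedOn: LocalDate)

data class CadenceHealth(val missedSlots: Int, val hotfixCount: Int)

// Both numbers should trend toward zero as test automation improves.
fun cadenceHealth(slots: List<ReleaseSlot>, hotfixes: List<Hotfix>) =
    CadenceHealth(
        missedSlots = slots.count { !it.shipped },
        hotfixCount = hotfixes.size,
    )
```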

What if my boss really, really wants me to calculate DORA metrics?

Oh well. Maybe you can still ask them which mobile app development team they consider "elite" and then use this simple calculator to show them that, according to DORA metrics, even they're not elite?

If that doesn’t work either, here’s how you calculate DORA metrics for your mobile app development process in a way that provides you with useful insights:

  1. Use an internal/beta release channel for tracking DORA metrics. You should create an internal release for every change (e.g., a pull request) that has passed CI and QA checks. If you have multiple internal release channels, use the one that's deployed automatically to the largest possible audience without requiring app review (e.g., internal/beta users). Ideally, the automation and QA of the internal release should be at a level that allows you to turn it into an App Store/Google Play release with one click.
  2. Measure your DORA metrics against the internal/beta release channel. You should calculate deployment frequency and change lead time based on the changes released to the internal/beta release channel. Also track change failure rate and time to restore service for failures that show up in these releases the same way you'd track production failures, since they escaped your normal review/QA process.
  3. Track your full (Google Play/App Store) change lead time with a separate environment. Keeping track of a separate “app-store” environment ensures you don’t focus only on the internal process but maintain your public release cadence. You should pay special attention to weeks when you had to cancel a production release because the app wasn’t in a releasable state.
  4. Make sure you can drill down on your change lead time. You should split your change lead time metric into at least "in progress," "waiting for review/QA," "waiting for merge," and "waiting for deployment" parts. This lets you easily see which part of your development process is the bottleneck. In the beginning, it may be "waiting for deployment," but once you have a great internal/beta environment and continuous deployment tooling in place, the bottleneck shifts to "review/QA" or "in progress." (A sketch of this breakdown follows the list.)
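
Here's one way the drill-down from step 4 could look. This sketch assumes you can pull a handful of timestamps per change from your Git hosting and deployment tooling; the field names are made up for illustration:

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical per-change timestamps, e.g., from the GitHub API
// combined with your deployment tooling.
data class Change(
    val firstCommitAt: Instant,
    val prOpenedAt: Instant,
    val approvedAt: Instant,
    val mergedAt: Instant,
    val deployedAt: Instant, // deployed to the internal/beta channel
)

// Change lead time split into the stages named above.
data class LeadTimeBreakdown(
    val inProgress: Duration,       // first commit -> PR opened
    val waitingForReview: Duration, // PR opened -> approved
    val waitingForMerge: Duration,  // approved -> merged
    val waitingForDeploy: Duration, // merged -> internal deploy
)

fun breakdown(change: Change) = LeadTimeBreakdown(
    inProgress = Duration.between(change.firstCommitAt, change.prOpenedAt),
    waitingForReview = Duration.between(change.prOpenedAt, change.approvedAt),
    waitingForMerge = Duration.between(change.approvedAt, change.mergedAt),
    waitingForDeploy = Duration.between(change.mergedAt, change.deployedAt),
)

// Deployment frequency for the internal channel (step 2) is then just
// the number of changes deployed within a time window.
fun deploymentFrequency(changes: List<Change>, from: Instant, to: Instant) =
    changes.count { it.deployedAt >= from && it.deployedAt < to }
```

Comparing the stage durations across a few weeks of changes shows where the bottleneck actually is, rather than relying on gut feeling.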

Tracking your DORA metrics this way will make them more in line with the intent of the DORA research (and make them more comparable with web/cloud teams, if that’s a part of your goals). Most importantly, it allows your mobile development team to improve their development process and develop their continuous delivery capabilities (including the level of test automation).

Conclusion

If you really, really, really want to track DORA metrics for mobile apps, you can do it. For most mobile teams, though, it doesn't make sense. If you decide to do it anyway, make sure you can track the metrics at the level of individual changes (e.g., pull requests) and not just for production releases, which happen less frequently.

Your mobile engineering teams are better off tracking metrics designed for mobile app development than the four key DORA metrics. Investing in your ability to follow a regular release cadence, shipping features to your users reliably with feature flags, and working in reasonably sized increments will do more for your teams' performance than chasing "elite" DORA status.

Huge thanks to Antti Kosonen at Wolt and Jaakko Ylinen at Yle for taking the time to review an early draft of this post and providing invaluable feedback that helped shape the final version.

Ari-Pekka Koponen
Ari-Pekka Koponen is the Head of Platform at Swarmia. In the past, he led the Frontend Chapter at Supermetrics and the CTO team at a public company with 350+ engineers.
