Sensible benchmarks for evaluating the effectiveness of your engineering organization

If you’re digging into engineering metrics for the first time, it can be hard to know what “good” looks like. It’s tempting to compare yourself to other companies using benchmarks, to know exactly where your teams sit on the spectrum from excellent to terrible.

In reality, your best benchmark is your own baseline. Productivity, experience, and business metrics can vary greatly depending on a company’s size, age, and culture. Your goal is continuous improvement, not matching companies you know little about.

Still, it’s useful to have a sense of what good might look like, so you can understand what’s possible and what’s concerning. To help you out, we’ve published engineering benchmarks that allow you to form realistic ambitions for a set of core engineering effectiveness metrics:

  • Engineering investment. Where are teams across the organization spending their time?
  • Flow efficiency. How long do issues sit idle once they’re started?
  • Batch size. How large is each change?
  • Lead time to change. How long does it take for a task to go from start to production?
  • Time to deploy. How long does it take for an approved change to reach production?
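
To make these definitions concrete, here is a minimal Python sketch of how three of the metrics could be computed from timestamps. The `Change` record, its field names, and the sample values are hypothetical and not Swarmia’s data model. Flow efficiency is taken here as the share of elapsed time during which the work was actively progressing, which is one common definition and an assumption on our part.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    """Hypothetical record of one change; field names are illustrative."""
    started_at: datetime     # work on the task began
    approved_at: datetime    # change was approved for release
    deployed_at: datetime    # change reached production
    active_time: timedelta   # time the work was actively progressing
    lines_changed: int       # rough proxy for batch size


def lead_time_to_change(change: Change) -> timedelta:
    """Elapsed time from starting the task to running in production."""
    return change.deployed_at - change.started_at


def time_to_deploy(change: Change) -> timedelta:
    """Elapsed time from approval to running in production."""
    return change.deployed_at - change.approved_at


def flow_efficiency(change: Change) -> float:
    """Share of the elapsed time that was active work (assumed definition)."""
    total = lead_time_to_change(change)
    if total <= timedelta(0):
        raise ValueError("deployed_at must be later than started_at")
    return change.active_time / total


example = Change(
    started_at=datetime(2024, 3, 4, 9, 0),
    approved_at=datetime(2024, 3, 6, 8, 30),
    deployed_at=datetime(2024, 3, 6, 9, 0),
    active_time=timedelta(hours=12),
    lines_changed=180,
)

print(lead_time_to_change(example))  # 2 days, 0:00:00
print(time_to_deploy(example))       # 0:30:00
print(flow_efficiency(example))      # 0.25
```

In this made-up example, the change spent three quarters of its lead time sitting idle, which is the kind of signal the flow efficiency metric is meant to surface.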

Why these metrics?

We’ve seen other approaches to engineering benchmarks fall short for two reasons: the list of metrics is too long, making it unclear where to begin; and the benchmarks encourage improvement far past the point of diminishing returns. (For instance, let’s say the typical coding time on a task is two hours. You could seek to reduce this in order to meet someone’s idea of “elite,” but it’s only worth it if everything else in the path to production consistently takes less than two hours.)
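
The two-hour example can be sketched as a quick check of where the time actually goes. The stage names and durations below are invented for illustration. The point is only that shaving a short stage is worth little while a much longer stage dominates the path to production.

```python
# Hypothetical hours spent per stage for a typical change (illustrative values).
stage_hours = {
    "coding": 2.0,
    "waiting for review": 20.0,
    "review": 1.0,
    "waiting for deploy": 6.0,
}


def longest_stage(stages: dict) -> tuple:
    """Return the stage that takes the most time and its share of the total."""
    total = sum(stages.values())
    name = max(stages, key=stages.get)
    return name, stages[name] / total


def saving_from_halving(stages: dict, stage: str) -> float:
    """Share of the total time saved if one stage took half as long."""
    return (stages[stage] / 2) / sum(stages.values())


name, share = longest_stage(stage_hours)
print(f"{name}: {share:.0%} of total time")  # waiting for review: 69% of total time
print(f"halving coding saves {saving_from_halving(stage_hours, 'coding'):.0%}")  # 3%
```

With these assumed numbers, halving coding time would shorten the whole path by only about 3%, while the wait for review accounts for more than two thirds of it.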

When we set out to identify “benchmarks,” we didn’t want to cast a ridiculously wide net or suggest unreasonable targets. Instead, we wanted to focus specifically on metrics that are:

  • Fundamental. These metrics encompass and inform other measures of effectiveness.
  • Table stakes. A good standing in these metrics is crucial before venturing into more nuanced areas, such as time spent waiting on tools.
  • Machine measurable. Thanks to modern tools, these metrics can be continuously monitored without manual intervention.
  • High-leverage. Excelling in these areas can unlock significant opportunities for improvement across the board.

These core metrics tend to generate a substantial backlog for most productivity enhancement efforts. You’ll identify more specific areas for attention and measurement over time.

Where are these numbers coming from?

DORA has come up with their own categorization, which is based on a survey question about the code base the respondents are mostly working on. In our experience, that gives a biased view.

Others claim a scientific methodology, but end up sorting a dataset and concluding that the optimal coding time is less than 15 minutes. That might be the case for single-commit changes, but it’s generally not a helpful recommendation. Additionally, the type of work (embedded software vs. cloud backend) and various other factors affect the usefulness of the recommendation.

For our benchmarks, we’ve hand-picked the numbers based on conversations and metrics from thousands of companies, ranging from small startups to enterprises with tens of thousands of employees. They are rule-of-thumb estimates, and in our experience, surprisingly accurate. Still, all engineering metrics are highly context-dependent, and not every benchmark may be relevant in your situation.

Getting better

To provide clear-cut improvement guidance, we defined three levels of attainment for each of the benchmark metrics. These levels clarify where to concentrate your efforts for maximum impact and what can be set aside for the time being. Instead of adhering to strict percentile ranges, the focus is on identifying opportunities for driving step-change improvements.

Here’s what each level means in a bit more detail:

  • Great. You’re doing amazing, and you’re well-equipped to make deeper process, tooling, and experience improvements while carefully monitoring and sustaining your ongoing greatness.
  • Good. You’re doing just fine. Leave these alone until nothing “needs attention.” When that’s done, moving any one of these to “great” can materially change how engineers do their work.
  • Needs attention. You need a focused effort to move anything in this category toward “great.” Anything that falls into this category is holding you back from making more impactful improvements down the line.
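
The way these levels guide prioritization can be sketched in a few lines of Python. The threshold values, metric names, and measurements below are placeholders made up for illustration. They are not the published benchmark numbers.

```python
from enum import Enum


class Level(Enum):
    GREAT = "great"
    GOOD = "good"
    NEEDS_ATTENTION = "needs attention"


# Placeholder thresholds, NOT the published benchmark values.
# Each entry: (limit for "great", limit for "good", whether higher is better).
THRESHOLDS = {
    "flow_efficiency": (0.9, 0.7, True),
    "time_to_deploy_minutes": (10, 30, False),
}


def classify(metric: str, value: float) -> Level:
    """Map a measured value to one of the three levels."""
    great, good, higher_is_better = THRESHOLDS[metric]
    if higher_is_better:
        if value >= great:
            return Level.GREAT
        if value >= good:
            return Level.GOOD
    else:
        if value <= great:
            return Level.GREAT
        if value <= good:
            return Level.GOOD
    return Level.NEEDS_ATTENTION


def next_focus(levels: dict) -> list:
    """Work on "needs attention" first; only then move "good" toward "great"."""
    needs = [m for m, lvl in levels.items() if lvl is Level.NEEDS_ATTENTION]
    if needs:
        return needs
    return [m for m, lvl in levels.items() if lvl is Level.GOOD]


# Hypothetical measurements for one team.
measured = {"flow_efficiency": 0.75, "time_to_deploy_minutes": 45}
levels = {metric: classify(metric, value) for metric, value in measured.items()}
print(next_focus(levels))  # ['time_to_deploy_minutes']
```

Here flow efficiency lands in “good” and is left alone, while time to deploy falls into “needs attention” and becomes the only item to focus on.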

The exact path to improvement will always depend on your organization’s particulars, but there are some common culprits if you have metrics that are on the “needs attention” end of things. Here are some ideas for improving each metric:

  • Engineering investment. Regularly schedule tech debt days. Use “cost of delay” analysis to prioritize work based on the impact of not addressing it, balancing immediate feature work with long-term necessities.
  • Flow efficiency. Focus on identifying and removing bottlenecks within the development process. Implement work-in-progress (WIP) limits to prevent team overload and ensure that tasks already in progress are finished before new ones are started. Streamline communication channels. Visualize workflows using Kanban boards or similar tools to provide visibility into task progress and pinpoint where bottlenecks occur. (See our additional guidance on improving flow efficiency.)
  • Batch size. Adopt continuous integration/continuous deployment (CI/CD) practices, allowing for smaller, more frequent releases. Implement feature flags to decouple deployment from release, so that smaller updates can go live without being immediately visible to users.
  • Lead time to change. Ensure teams can access tools that streamline communication and task management. Refine prioritization processes to promote agility and ensure work is properly prioritized and resources are efficiently allocated.
  • Time to deploy. Eliminate unnecessary CI/CD steps or bottlenecks. Invest in infrastructure as code to quickly provision and manage build and deployment infrastructure.

Implementing these strategies requires focusing on the technical and cultural aspects of the engineering team and its processes. The tools and insights in Swarmia can support teams in identifying areas for improvement and tracking progress over time, aligning with the goal of continuous improvement.

Now what?

Some of the benchmarks might seem absolutely unattainable to you today, and that’s OK. Swarmia gives you the tools to discover opportunities, implement changes, and see the results as they happen.

As a first step, gathering metrics for your team will help you identify the most critical bottlenecks in your process. Once you have your baseline metrics, Swarmia helps you set achievable goals for improvement based on your unique situation.

Rebecca Murphey
Rebecca Murphey helps Swarmia customers navigate people, process, and technology challenges in pursuit of building an effective engineering organization.
