Once you have more than a couple of teams, you already have an internal developer platform, whether you know it or not. It exists as bash scripts, troubleshooting docs, the three slightly different ways to set up your dev environment depending on which team you’re on, and the two different versions of Node that you somehow ended up running. Evidence of both the platform’s existence and its fragility shows up as the dreaded “Carla built that, you’ll have to ask her” Slack message, except Carla is on vacation for another week.
No team is on the hook when those widely used bash scripts break. Worse, no one is thinking about how all the pieces work together, or whether they do at all. This is a proven recipe for a fractured, frustrating experience for your internal customers; you’d never leave your actual product’s users to navigate this sort of chaos.
In 2015, Peter Seibel wrote Let 1,000 Flowers Bloom, and to this day I think it’s one of the best summaries I’ve ever read of how, at a certain scale, delivery comes to a near-halt in the face of decisions that made a lot of sense back when they were made. It’s a rollicking tale of engineering adventure and misadventure at Twitter, long before <redacted> took the helm.
While the whole post is a great read, the thing that’s stuck with me the most is this:
Seibel proposes this as a useful model for thinking about the benefits a platform team can bring. The conclusion is that, using plausible values, a platform team focused on the non-platform engineers will start to meaningfully improve their effectiveness once the org reaches around 100 engineers. As an eng org grows, it makes sense to invest more engineers in platform.
Through the exponent s, this model accounts for the fact that each platform hire will have slightly less impact than the last. Here, ee is the size of the platform team, s is the relative effectiveness of each subsequent platform hire, and b is the impact a platform team can have on non-platform software engineers.
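To make the variables concrete, here is a minimal Python sketch. The formula itself is not reproduced in this text, so the sketch assumes, as best I can reconstruct it from Seibel’s post, that the model has the form E = (eng - ee) * (1 + b * ee^s), where eng is the total number of engineers and E is the org’s total effectiveness. The values b = 0.02 and s = 0.7 are illustrative assumptions, not measurements, and the function names are made up for this example.

```python
def total_effectiveness(eng: int, ee: int, b: float = 0.02, s: float = 0.7) -> float:
    """Assumed model: E = (eng - ee) * (1 + b * ee**s).

    eng: total engineers, ee: platform engineers,
    b: boost from the first platform engineer (assumed value),
    s: diminishing-returns exponent for later platform hires (assumed value).
    """
    if not 0 <= ee <= eng:
        raise ValueError("ee must be between 0 and eng")
    return (eng - ee) * (1 + b * ee ** s)


def best_platform_size(eng: int, b: float = 0.02, s: float = 0.7) -> int:
    """Brute-force the platform team size that maximizes E for a given org size."""
    return max(range(eng + 1), key=lambda ee: total_effectiveness(eng, ee, b, s))


if __name__ == "__main__":
    for eng in (10, 50, 100, 500, 1000):
        ee = best_platform_size(eng)
        e = total_effectiveness(eng, ee)
        print(f"{eng:>5} engineers: best platform size {ee}, E = {e:.1f}")
```

With these assumed values, a 10-person org comes out ahead with no dedicated platform engineers, and the best platform team size grows as the org does. Change b and s and the crossover point moves, which is the point of treating it as a model and not a rule.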
I’d suggest that the diminishing effectiveness exponent s should apply to the general population, too: the diminishing effectiveness of an engineer as an organization grows isn’t a problem unique to platform teams.
The fundamental challenge of a growing engineering org is to sustain or increase E, the org’s total effectiveness, relative to the total number of engineers in the org. You can play with the variables, but for all realistic scenarios, you eventually end up in a situation where, if you want each new engineer to have some meaningful marginal utility, you must have a platform team to offset the cost of growth itself.
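One way to play with the variables is sketched below. It is an illustration under assumptions, not Seibel’s own model: it assumes the model has the form E = (eng - ee) * (1 + b * ee^s), then raises the (eng - ee) term to a hypothetical exponent g so that the general engineering population also sees diminishing returns. The name g, the function names, and every parameter value are placeholders chosen for the example.

```python
def effectiveness(eng: int, ee: int, b: float = 0.02, s: float = 0.7, g: float = 0.9) -> float:
    """Assumed variant: E = (eng - ee)**g * (1 + b * ee**s).

    g < 1 is a hypothetical exponent that makes each additional
    non-platform engineer slightly less effective than the last.
    """
    return (eng - ee) ** g * (1 + b * ee ** s)


def best_effectiveness(eng: int, **params: float) -> float:
    """Highest E reachable for this org size, trying every platform team size."""
    return max(effectiveness(eng, ee, **params) for ee in range(eng + 1))


def marginal_utility(eng: int, with_platform: bool, **params: float) -> float:
    """Extra effectiveness gained by growing the org from eng - 1 to eng engineers."""
    if with_platform:
        return best_effectiveness(eng, **params) - best_effectiveness(eng - 1, **params)
    # Without a platform team, ee is fixed at zero.
    return effectiveness(eng, 0, **params) - effectiveness(eng - 1, 0, **params)


if __name__ == "__main__":
    for eng in (10, 100, 1000):
        without = marginal_utility(eng, with_platform=False)
        with_team = marginal_utility(eng, with_platform=True)
        print(f"engineer #{eng}: +{without:.2f} without a platform team, "
              f"+{with_team:.2f} with one")
```

With these placeholder values, the thousandth engineer adds noticeably more when a well-sized platform team is in place, while at ten engineers the two cases are identical because the best platform size is still zero.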
So if you’re reading this article, and you’re coming up on four or more teams, I’d suggest the time for a platform team just might be now. A team will need time to learn about the problem space and identify opportunities, to figure out how to prioritize their work and represent it to leadership — and that learning can be hard to do when your hair is on fire.
The minimum viable platform team
Starting a team doesn’t have to be a big production: your team lead already works at the company and is looking for a new opportunity. They’re the self-directed, consistently high-impact person who’s been poking at flaky tests lately, and who exceeded expectations last half for automating the entire build and deploy. They’re a favorite collaborator among technical and non-technical folks alike, and they live for a good session of code archaeology.
The second engineer also works there, and they were clearly exceeding expectations within their first months. They’re a smart execution machine, in need of a good mentor. They’re interested in both humans and computers, hungry for hard problems, and they don’t mind if people out in the real world don’t see their work.
The engineering manager probably isn’t managing this full-time at first; ideally, their role will be to guide the discovery of potential problems, assist in prioritization, and represent the results to leadership. If they’re dealing with delivery or collaboration issues, you probably picked the wrong engineers.
You’re going to need a product manager sometime (I promise), but for now, if you hold on to just two pieces of product mindset, let them be these:
- Internal developer platforms are products, and products have users: you need to talk to them and listen to them (especially when they’re frustrated). Your ultimate goal is to help them produce more value for the same amount of effort.
- Internal developer platforms are products, and products have metrics: you need to know how your product is working (and failing). Your ultimate goal is to understand and make more efficient the workflows of your software engineer users.
What if it’s too late?
Shannon, a product leader I used to work with, would often say, “You know, trees.” It was her shorthand for the axiom that the best time to plant a tree is 20 years ago, and the second-best time is today.
No matter where you are on a platform journey, the first steps will tend to look the same: get a couple-few people focused on the problem, give them good guidance on what a “good” opportunity looks like, and give them a bit of time to pick the first thing they’ll do.
The problem becomes one of prioritization: among a whole traffic jam of opportunities, you’ll need to identify and act on the handful of changes most likely to get work moving again. A platform-savvy PM may be more handy now than if you were kicking this off back when you had just a few dozen engineers to support.
If you’ve reached the 100-engineer milestone without a platform team, you may want to set two or more pairs of engineers out on the path explained above. If you do that, you’ll almost certainly want a product-minded eng manager dedicated to the effort.
Time passes, scale happens: It’s a long game
The fundamental mission of a platform team is straightforward: ensure that the value provided by the platform substantially exceeds the costs of implementing it.
As long as you are still building features or adding customers, a default side effect is that microservices will sprawl, monoliths will become increasingly entangled, and the cost of gaining the context needed to work on any particular feature will exponentially increase relative to the complexity of the task. Debt, as it has a tendency to do, will accumulate non-linearly over time.
A good platform team solves this by bringing vision, alignment, and data where they didn’t exist before.
This works on the other side, too: Many improvements to developer experience continue to pay dividends over the course of years, long after they transitioned from a powerful new tool to the default way of doing things. How do we account for that?
A smart platform team will find ways to claim the impact over the likely survival time of the underlying system.
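As a rough sketch of that accounting, the arithmetic might look like the following. Every field name and number here is hypothetical, including the assumption of 46 working weeks a year; the point is only that the same improvement looks very different over one year than over the expected lifetime of the system it lives in.

```python
from dataclasses import dataclass
from typing import Optional

WORK_WEEKS_PER_YEAR = 46  # assumption: allows for holidays and vacation


@dataclass
class Improvement:
    hours_saved_per_engineer_per_week: float
    engineers_affected: int
    build_cost_hours: float
    upkeep_hours_per_year: float
    expected_lifetime_years: float


def net_hours(imp: Improvement, years: Optional[float] = None) -> float:
    """Engineer-hours saved minus hours spent, over the given number of years.

    Defaults to the expected lifetime of the underlying system.
    """
    if years is None:
        years = imp.expected_lifetime_years
    saved = (imp.hours_saved_per_engineer_per_week
             * imp.engineers_affected
             * WORK_WEEKS_PER_YEAR
             * years)
    spent = imp.build_cost_hours + imp.upkeep_hours_per_year * years
    return saved - spent


if __name__ == "__main__":
    faster_builds = Improvement(
        hours_saved_per_engineer_per_week=0.5,
        engineers_affected=100,
        build_cost_hours=800,
        upkeep_hours_per_year=100,
        expected_lifetime_years=3,
    )
    print(f"first year: {net_hours(faster_builds, years=1):.0f} hours")
    print(f"lifetime:   {net_hours(faster_builds):.0f} hours")
```

In this made-up case, a one-year view credits the work with 1,400 engineer-hours, while a three-year view credits it with 5,800.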
If you keep at it long enough, the actual needs of the business will change. In years of growth, successful onboarding will be a focus; in lean years, the focus will be on eliminating waste — waste of money, but also waste of time.
A successful platform team has a long-term strategy, but keeps its finger in the wind to connect business priorities with its fundamental mission.