How to scale teams without losing stability and speed?

One challenge I’ve seen repeatedly is trying to scale engineering teams without a solid base for growth. By solid base, I mean everything related to the typical SDLC, from processes to technology. And that’s before considering the challenges of onboarding new people and the impact on the organisation’s culture.

The typical example we’ve all been through is the web application that was started by one team and, months later, is a monolith with five full teams working on it. It gets to a point where, instead of great speed and agility, you have slower-paced teams battling to minimise the number of things they break with each new release. This typically happens because of growing complexity and because teams become more and more dependent on each other. They’re working on the same code repository and, on top of that repository, there are CI/CD processes and tests (unit, integration, e2e, …). The larger the codebase, the greater the complexity and the harder it is to keep teams aligned, which means a slower pace and a higher cost of change.

Some companies try to tackle the instability with more process, but that slows delivery even further and, very importantly, increases frustration within the teams, which ultimately erodes motivation and trust. Engineering teams, at their heart, like to be fast, to make things happen, to deliver value and to see quick results. Once you get to this point it’s a vicious cycle: demotivated teams lead to an even slower pace, and nothing gets done on time.

So, what’s the solution?

One of the most important things is to ensure that, each time you add teams to the organisation, they are mostly independent. Every team has its own pace and you don’t want one team impacting another. To enable almost linear capacity growth (more teams, more throughput) you need to promote some level of independence between them, at least in their processes, such as CI/CD. If Team A makes daily or weekly releases on the same piece of software as Team B, chances are they will impact one another, and that starts with sharing the same code repository and its attached CI/CD processes. This independence can start right at the code repository level: each team should “own” its piece of code, meaning that, while still accepting internal “open-source” (or even public open-source) contributions, the team should be mostly independent, from adding new code all the way to the release process.
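To make the idea of code ownership concrete, here is a minimal sketch of an ownership map that tells you which teams a given change set touches. The directory layout, the team names and the helper functions (`owner_of`, `teams_touched`) are all hypothetical, chosen only for illustration; real setups usually rely on whatever ownership mechanism their code hosting platform provides.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical ownership map: path prefix -> owning team.
OWNERS = {
    "services/checkout": "team-checkout",
    "services/catalog": "team-catalog",
    "services/user-preferences": "team-accounts",
}


def owner_of(path: str) -> Optional[str]:
    """Return the team owning `path`, using the longest matching prefix."""
    parts = PurePosixPath(path).parts
    best = None  # (prefix length, team) of the best match so far
    for prefix, team in OWNERS.items():
        prefix_parts = PurePosixPath(prefix).parts
        # Compare whole path components so "checkout-v2" does not match "checkout".
        if parts[:len(prefix_parts)] == prefix_parts:
            if best is None or len(prefix_parts) > best[0]:
                best = (len(prefix_parts), team)
    return best[1] if best else None


def teams_touched(changed_paths: list) -> set:
    """Teams whose code a change set touches; unowned paths count as 'unowned'."""
    return {owner_of(p) or "unowned" for p in changed_paths}


if __name__ == "__main__":
    change = ["services/checkout/cart.py", "services/catalog/search.py"]
    print(sorted(teams_touched(change)))
```

A change set that regularly touches more than one team’s code is a useful signal that those teams are still coupled, and that their release processes probably are too.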

But designing a product or solution with this in mind from the ground up can be a lot of over-engineering, especially for a start-up with limited resources, where the monolith typically delivers the initial speed required. This should probably be addressed once you have proved market fit and, most likely, by that time you’ll have a monolith of considerable size.

How to tackle the monolith and enable the conditions to scale teams?

It’s easy to find well-known examples (SoundCloud, Amazon) and, if you have a few years of experience as a software developer, chances are you’ve faced the typical monolith challenge: an early startup team ships an MVP successfully, then starts to iterate on it (nothing wrong with this approach, on the contrary!), and eventually the MVP turns into a monolith. For the sake of the example, let’s assume it’s an e-commerce backend application feeding both web and mobile apps through APIs. There are multiple solutions, depending on the end goal, but a typical one is to follow Domain-Driven Design and start breaking the monolith into independent domains, each with a specific vertical business capability.

Picking the first capabilities to decouple is very important and should be aligned with your strategy and goals. You can start with less impactful features or with core services, but ultimately the goal should be to decouple what is important to the business and what changes frequently. For example, splitting the Product Catalog or the Checkout out of the e-commerce monolith into separate services might be a long and challenging task, and it might be wiser to warm up with a simple and fairly decoupled capability like User Preferences or Authentication, especially if the teams are inexperienced with microservices. Ideally, each migration should be an atomic evolutionary step that takes the architecture closer to its target state.
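One possible way to keep each migration step small is to put the capability behind an interface first and then swap the implementation, a technique often called branch by abstraction. It is not something this article prescribes, just one option. The sketch below uses the User Preferences example; every name in it (`UserPreferences`, `InMonolithPreferences`, `RemotePreferences`, `build_preferences`) is hypothetical, and the remote call is represented by an injected function rather than a real HTTP client.

```python
from typing import Callable, Protocol


class UserPreferences(Protocol):
    """The capability as the rest of the monolith sees it."""

    def get(self, user_id: str) -> dict: ...


class InMonolithPreferences:
    """Existing behaviour: reads the monolith's own data (a dict stands in for its tables)."""

    def __init__(self, table: dict):
        self._table = table

    def get(self, user_id: str) -> dict:
        return self._table.get(user_id, {})


class RemotePreferences:
    """New behaviour: delegates to the extracted service through an injected fetch function."""

    def __init__(self, fetch: Callable[[str], dict]):
        self._fetch = fetch

    def get(self, user_id: str) -> dict:
        return self._fetch(user_id)


def build_preferences(use_remote: bool, table: dict,
                      fetch: Callable[[str], dict]) -> UserPreferences:
    """Pick the implementation from a flag, so switching over is a one-line change."""
    return RemotePreferences(fetch) if use_remote else InMonolithPreferences(table)


if __name__ == "__main__":
    table = {"u1": {"lang": "pt"}}
    prefs = build_preferences(False, table, lambda uid: {})
    print(prefs.get("u1"))
```

Callers depend only on `UserPreferences`, so the team that owns the new service can flip the flag, and flip it back if needed, without touching the rest of the monolith.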

Finally, it is very important that the organisation provides a structure of autonomous, long-standing teams and treats DevOps as a way of working. That way each team owns the process end-to-end (from building the service to running it in production) and avoids getting stuck in the middle of a transition.

But does it come with a cost?

There’s no such thing as a silver bullet. Enabling the conditions to scale teams brings benefits, but you still have to ensure overall alignment between all teams, even though they’re loosely coupled. If you don’t, you might end up with teams heading in different directions in terms of product vision.

Now, do you consider speed a competitive advantage for your business? If so, you should decide which of these you’ll pick: large codebases or scaling teams.

Curious about many subjects. You can also find me at https://www.linkedin.com/in/marcioazevedo/.