You’ve probably heard of You Build It You Run It. It’s an operating model that empowers product teams to own every aspect of digital service management. When done well, it accelerates your time to market, increases your service reliability, and grows a learning culture.
However, there are some pitfalls. These can drain the confidence of your senior leadership, and ultimately put the success of You Build It You Run It at risk.
In our recent You Build It You Run It playbook, my co-author Steve Smith and I take a deeper look at these pitfalls, and recommend how to avoid them. We believe you can proactively guard against them, and quickly escape if they do occur.
In no particular order, the pitfalls are:
- Linear run cost
- Responsible but unaccountable
- Excessive BAU
- Change management treacle
- No major incident management
- Embedded specialists in product teams
- Limited on-call schedule
Linear run cost
If you propose a move from a central operations team to You Build It You Run It, your senior leadership will no doubt worry about a linear run cost at scale. It’s an easy trap to fall into. With central operations, you pay a daily rate for one analyst to be on-call for five digital services. So five on-call product team members for five digital services doesn’t seem to make financial sense, does it? And what if you have 10 teams and 20 digital services, or more? You can see the concern.
To avoid this trap you need to:
- Stop comparing run costs in isolation. You Build It You Run It is a multi-cost insurance policy, well suited to digital business outcomes. It has much lower opportunity costs than a central operations team.
- Mitigate the risk of a high run cost. A linear run cost can be avoided by selecting different out-of-hours policies for digital services, based on their different financial exposures.
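To make the arithmetic concrete, here is a minimal sketch of how an exposure-based out-of-hours policy can break the link between the number of services and the run cost. The standby rate, the exposure figures, the threshold, and the rule of one standby payment per team are all illustrative assumptions, not figures or policies from the playbook.

```python
from dataclasses import dataclass

# Hypothetical daily standby rate, for illustration only.
STANDBY_RATE_PER_DAY = 100


@dataclass
class Service:
    name: str
    team: str
    exposure_per_hour: int  # assumed estimate of revenue at risk per hour of downtime


def needs_out_of_hours_cover(service, threshold=10_000):
    """Assumed policy: only services at or above the threshold get out-of-hours standby."""
    return service.exposure_per_hour >= threshold


def daily_run_cost(services, threshold=10_000):
    """Assume one standby payment per team that owns at least one covered service."""
    teams = {s.team for s in services if needs_out_of_hours_cover(s, threshold)}
    return len(teams) * STANDBY_RATE_PER_DAY


# Hypothetical service portfolio.
services = [
    Service("checkout", "payments", 50_000),
    Service("refunds", "payments", 12_000),
    Service("search", "discovery", 8_000),
    Service("reviews", "discovery", 500),
    Service("store-locator", "stores", 200),
]

# The feared case: one person on call for every service.
linear_cost = len(services) * STANDBY_RATE_PER_DAY
# The policy-based case: cover follows financial exposure, not service count.
policy_cost = daily_run_cost(services)
```

In this made-up portfolio, only the payments team needs out-of-hours standby, so the run cost follows financial exposure rather than the number of services. Lowering the threshold brings more teams into cover, which is the trade-off your budget holders would need to weigh.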
Responsible but unaccountable
With You Build It You Run It, it’s possible to make your product teams responsible for the reliability of their own digital services, but leave your Head of Operations accountable for overall reliability. This is a serious trap. Product teams that aren’t held to account for failures have little incentive to prioritise operational features, and every incentive to cut corners.
To avoid this trap you need to:
- Make your product team budget holders accountable for business outcomes, and fund on-call costs from product team budgets. That way, your product teams will have the right operability incentives, and they’ll constantly balance operational concerns with product demand.
Excessive business as usual (BAU)
Managing a live service can involve a lot of daily BAU maintenance tasks. It can include checking dashboards, running daily jobs, and fixing intermittent alerts. One or more of your product teams could struggle to deliver planned features on time, because most of their time is spent on BAU work. And this is often an invisible trap, because unplanned work isn’t usually tracked closely.
To avoid this trap you need to:
- Eliminate as many sources of maintenance work as possible. This includes re-architecting digital services for adaptability, creating a fully automated deployment pipeline, and establishing an automated telemetry toolchain.
- Track any unplanned tasks that’ll take longer than a day in your ticketing system. Then you can manage and prioritise BAU work effectively.
Change management treacle
You Build It You Run It requires you to overhaul your change approval process. You can’t accelerate your time to market and meet customer demand if each change request takes hours, or even days, to be approved. The good news is that You Build It You Run It is 100% compatible with IT service management (ITSM) frameworks such as ITIL.
To avoid this trap you need to:
- Pre-approve change requests for low risk, repeatable changes.
- Automate change auditing with your change management team, to ensure their compliance needs are met.
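As a rough sketch of how the two steps above could fit together, the snippet below classifies a change request as pre-approved and emits an audit record for it. The field names, the risk labels, and the list of pre-approved change types are hypothetical; in practice they would be agreed with your change management team.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ChangeRequest:
    service: str
    change_type: str  # hypothetical labels, e.g. "deployment" or "schema-migration"
    risk: str         # "low", "medium", or "high"
    automated: bool   # True if the change is applied by the deployment pipeline


# Hypothetical set of repeatable change types agreed in advance.
PRE_APPROVED_TYPES = {"deployment", "config-toggle"}


def is_pre_approved(change):
    """Low risk, repeatable (automated), and of an agreed type."""
    return (
        change.risk == "low"
        and change.automated
        and change.change_type in PRE_APPROVED_TYPES
    )


def audit_record(change):
    """Serialise the change and its approval route for the audit trail."""
    record = asdict(change)
    record["approval"] = "pre-approved" if is_pre_approved(change) else "needs-review"
    return json.dumps(record, sort_keys=True)


routine = ChangeRequest("checkout", "deployment", "low", True)
risky = ChangeRequest("checkout", "schema-migration", "high", False)
```

Here `routine` would be recorded as pre-approved without waiting on a person, while `risky` would still go through review. Both leave an audit record either way.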
No major incident management
Moving incident response from one central operations team to many product teams can dilute the role of major incident management. That’s a mistake because bridging between teams and managing senior leadership are vital skills in multi-service major incidents.
To avoid this trap you need to:
- Integrate product teams into your incident management process as is.
- Automate manual tasks where necessary.
- Ensure your incident managers are trained to manage incidents for your digital services, as well as your foundational systems.
Embedded specialists in product teams
If you’ve got a small, central team of specialists struggling with demand – DBAs perhaps, or operability engineers – it’s a mistake to move them into embedded roles within your product teams. Your specialists will miss working together, their fluctuating workloads will cause burnout or boredom, and recruitment will be tough as well.
To avoid this trap you need to:
- Push repeatable, low value specialist work onto a cloud provider.
- Automate repeatable, high value work as self-service pipelines, which product teams can use themselves.
- Turn your small, central team into specialists as a service that offer ad hoc expertise where they’re needed most.
Limited on-call schedule
Your product teams need to establish their own on-call schedules out of hours. Some product team members will opt out of on-call, and you may struggle to form a sustainable rota as a result. A team might then rely on one or two people to do more on-call out of hours than is healthy for them.
To avoid this trap you need to:
- Tackle the reasons why product team members don’t want to do on-call.
- Prepare people from day one of building a digital service.
- Strive for nobody to do more than one week on-call per month.
- Respect the different personal circumstances of different team members.
- And most importantly of all, pay people to do on-call standby out of hours. Ensure people feel compensated for the disruption to their personal lives.
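The target of no more than one week on-call per month is easy to check automatically. The sketch below assumes a rota stored as a list of week start dates and names, which is a made-up format for illustration, and flags anyone who exceeds the limit in a calendar month.

```python
from collections import Counter
from datetime import date


def overloaded(rota, max_weeks_per_month=1):
    """Return the people with more on-call weeks in a calendar month than the limit.

    `rota` is a list of (week_start, person) pairs. A week is counted
    against the month in which it starts.
    """
    counts = Counter(
        (person, week_start.year, week_start.month) for week_start, person in rota
    )
    return sorted(
        {person for (person, _, _), weeks in counts.items() if weeks > max_weeks_per_month}
    )


# Hypothetical rota for one month.
rota = [
    (date(2024, 3, 4), "asha"),
    (date(2024, 3, 11), "ben"),
    (date(2024, 3, 18), "asha"),
    (date(2024, 3, 25), "chen"),
]

over_limit = overloaded(rota)
```

With this rota, `over_limit` contains only "asha", who has two on-call weeks in March. A check like this could run whenever the rota changes, so an unhealthy schedule is spotted before it takes effect.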
To find out more, you can continue our You Build It You Run It pitfalls series:
- 7 pitfalls to avoid with You Build It You Run It – you are here!
- 5 ways to minimise your run costs with You Build It You Run It
- Why your head of operations shouldn’t be accountable for digital reliability
- How to manage BAU in product teams
- 4 ways to remove the treacle in change management
- Why product teams still need major incident management
- Stop trying to embed specialists in every product team
- How to avoid developer burnout on call
Our You Build It You Run It page has loads of resources on on-call product teams – case studies, conference talks, in-depth articles, and more. Plus our You Build It You Run It playbook gives you a deep dive into how to make it happen! Get in touch, and let us know what you think.