Lyndsay Prewer

Agile Delivery Consultant
Tech Focus

September 6, 2022

Four important reasons to hold a Chaos Day in 2022

For technology professionals, stability is the gold standard. But once in a while, it is important to create chaos.

A Chaos Day is an opportunity to carry out carefully planned experiments that introduce errors and turbulent conditions into IT systems, such as terminating a compute instance or filling up a storage device. It’s a useful exercise for any organisation that wants to understand the impact of such failures and how its systems respond, and then use that understanding to improve reliability and resilience.
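To make the idea of a planned experiment concrete, here is a minimal sketch of the "terminate a compute instance" scenario. It runs entirely in memory: InstancePool, terminate, handle_request and run_experiment are hypothetical names invented for this illustration and do not correspond to any real cloud provider API or chaos tooling.

```python
class InstancePool:
    """Toy stand-in for a load-balanced service (hypothetical, not a real cloud API)."""

    def __init__(self, names):
        self.healthy = set(names)

    def terminate(self, name):
        # Simulate the injected failure: the instance disappears from the pool.
        self.healthy.discard(name)

    def handle_request(self):
        if not self.healthy:
            raise RuntimeError("no healthy instances")
        # Route to any remaining healthy instance (alphabetical order for determinism).
        return f"served by {sorted(self.healthy)[0]}"


def run_experiment(pool, victim, requests=20):
    """Terminate one instance, then return the fraction of requests that still succeed."""
    pool.terminate(victim)
    succeeded = 0
    for _ in range(requests):
        try:
            pool.handle_request()
            succeeded += 1
        except RuntimeError:
            pass
    return succeeded / requests


if __name__ == "__main__":
    pool = InstancePool(["app-1", "app-2"])
    rate = run_experiment(pool, victim="app-1")
    print(f"success rate after terminating app-1: {rate:.0%}")
```

In a real Chaos Day the failure would be injected into a live or pre-production environment and the observation would come from monitoring and the team's response, not from a loop like this. The sketch only shows the shape of an experiment: state what you expect, inject one failure, and measure what happens.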

Wondering whether you could benefit from a Chaos Day? Here are four reasons why your organisation should run a Chaos Day this year:

1: To prepare for the unexpected

The main benefit of a Chaos Day is that it helps the organisation prepare for inevitable failures and unexpected events before they occur. In today’s complex IT environments, turbulence will happen due to single-point failures or multiple, unrelated failures, often combined with sudden changes in external pressure, such as traffic spikes or security threats.

2: To analyse how your team prepares for and responds to problems

Carrying out a Chaos Day allows you to view, analyse and improve how your team responds to unexpected, turbulent conditions. It can provide a safe way to identify gaps in your team’s skills around collaborating, communicating and thinking during high-stress periods.

3: To improve skills and knowledge across your IT team

During a Chaos Day, you can expect your team to gain:

  • New knowledge about system behaviour
  • Expertise in diagnosing and resolving incidents
  • Better skills around collaboration and communication
  • Greater understanding of system failures and recovery

Teams will also share knowledge while working through problems on a Chaos Day, meaning each team member gains a better understanding of their colleagues’ knowledge and skills. Chaos Days can also improve technical knowledge, which can be used to make changes that boost resilience. For example, a Chaos Day can illustrate the usefulness of new features such as retry mechanisms and circuit breakers.
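For readers less familiar with retry mechanisms and circuit breakers, the following is a minimal sketch of both patterns. It is an illustration under stated assumptions, not a production implementation: CircuitBreaker, CircuitOpenError and call_with_retry are hypothetical names, and the thresholds and delays are arbitrary.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, allows a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call rejected")
            # Cool-down has elapsed: fall through and let one trial call run.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                # Open (or re-open) the circuit and restart the cool-down.
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def call_with_retry(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry func with exponential backoff; never retry while the circuit is open."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller would typically combine the two, for example call_with_retry(lambda: breaker.call(fetch_data)), where fetch_data is whatever function talks to the dependency. Retries smooth over brief glitches, while the breaker stops a struggling dependency from being hammered with further requests. A Chaos Day experiment that makes a dependency slow or unavailable is one way to see whether such protections behave as intended.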

4: To build resilience

The ultimate goal of a Chaos Day is to build resilience. It does this by giving teams a greater understanding of system behaviour and failure scenarios, which they can draw on when tackling production incidents or developing system enhancements.

Chaos Days improve system resilience by improving:

  • The skills, knowledge and understanding of your team
  • Processes, by guiding improvements in incident management, analysis and engineering
  • Products, by initiating changes that make services more resilient, and by improving documentation such as error messages and runbooks

For more insight into the benefits of running a Chaos Day, along with expert guidance on how and when to organise Chaos Days for the maximum benefit, check out the playbook online here.
