Last month, I had the great honour of being invited to speak at Austrade’s Data Champions conference, a quarterly convention for the Federal Government’s Australian Trade and Investment Commission.
Serving a wide range of stakeholders—both at home and internationally—the Australian Trade and Investment Commission (also known as Austrade) delivers services to grow Australia’s economic prosperity. It’s all about helping businesses go further, faster. Obviously, data plays a huge role in their approach to high-quality service provision.
To support the team in their excellent work, I was happy to deliver a brief presentation on the importance and value of data pipelines.
This included some of the new ways we think about, and work with, data at Equal Experts. Plus, given Austrade’s focus on ‘connecting Australian businesses with the world’, some of the lessons we’ve learned embedding leading data practice through our ongoing collaboration with Her Majesty’s Revenue & Customs (HMRC), the United Kingdom’s equivalent of the Australian Taxation Office (ATO). Over a working partnership of many years, we’re proud to note that Equal Experts is one of HMRC’s top five resource providers. We’ve worked together on everything from cutting-edge fraud detection mechanisms to the transition from physical infrastructure to the cloud.
This event, in contrast, had a single focus: data.
Here are some of the key points that seemed to resonate throughout the presentation.
1. Data can—and should—be agile.
Historically, and in many large-scale organisations around the world today, there’s a tendency to conceptualise and treat data in a particular way: without any sense of fluidity or immediacy.
In the old world, data isn’t something to be accessed dynamically. It’s built up over time and then used to create reports retrospectively, based on insights gleaned from the accumulated material. There’s also a prevailing assumption that data is slow to establish: that you need to amass expanses of information before you can do anything meaningful with it.
In fact, the opposite is true. With the right approach, you can act far more fluidly and create real business value in real time.
For example, we design and implement data pipelines with highly reusable patterns; this ensures organisations can rapidly create new data pipelines as use-cases or business requirements evolve. And they do, and should, evolve.
If something changes in your organisation—the source of the data, the use case associated with it, or how and why that data matters—then you need to evolve your pipeline(s) to reflect those developments. This is where the concept of agile data practice comes to the fore.
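To make that a little more concrete, here’s a minimal sketch in Python. It isn’t our actual tooling, and the data and function names are purely illustrative, but it shows the shape of a reusable pipeline pattern: a pipeline is simply a source, a chain of transforms, and a sink, so an evolving use case means composing new parts rather than building a new end-to-end system.

```python
# A minimal sketch of a reusable pipeline pattern (illustrative only).
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable

Record = dict[str, Any]

@dataclass
class Pipeline:
    """A pipeline is just a source, a chain of transforms, and a sink."""
    source: Callable[[], Iterable[Record]]                        # where records come from
    transforms: list[Callable[[Record], Record]] = field(default_factory=list)
    sink: Callable[[Record], None] = print                        # where records end up

    def run(self) -> None:
        for record in self.source():
            for transform in self.transforms:
                record = transform(record)
            self.sink(record)

def enquiries_source() -> Iterable[Record]:
    # Hypothetical source: trade enquiries arriving as raw dictionaries.
    yield {"company": "acme pty ltd", "market": "UK", "value_aud": "125000"}
    yield {"company": "widgets co", "market": "Japan", "value_aud": "98000"}

def normalise_company(record: Record) -> Record:
    return {**record, "company": record["company"].title()}

def parse_value(record: Record) -> Record:
    return {**record, "value_aud": int(record["value_aud"])}

if __name__ == "__main__":
    # When a use case changes, swap or add transforms instead of rebuilding end to end.
    Pipeline(source=enquiries_source,
             transforms=[normalise_company, parse_value]).run()
```

Because the pattern separates where the data comes from and what each use case does with it, adding or retiring a use case becomes a small, local change rather than a rebuild.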
Continually reviewing data pipelines is a valuable practice that many organisations fail to adopt. Data collection and collation isn’t a set-and-forget proposition, unless your organisation itself is comfortable with stasis.
We typically approach data practice through the lens of agile delivery, with practices and rituals such as:
- Discovery and stakeholder engagement: Collect the necessary context for any data set by incorporating business drivers, a range of data sources, your current or desired capabilities, and the reality of your IT systems. This de-risks delivery by ensuring you have everything in place to hit the ground running: clear scope, visible dependencies, defined ways of working, and a delivery plan.
- Iterative delivery: Short sprints with continuous feedback help deliver value rapidly and frequently throughout the process. Competitors talk about data in the context of delivering value by the 6-month mark; we prefer to deliver cyclical value every 3-4 weeks.
- Continuous review: Regular review intervals enable stakeholders to continuously validate progress and decide when, and how, to release to end users.
- Launch and refine: Collect feedback and refine the solution using a data-driven approach.
2. Keep unstructured raw inputs separate from any processed data streams.
To maintain flexibility, it’s in your best interest to keep unstructured raw inputs separate from any processed data streams. This minimises the need to develop new end-to-end pipelines for new use cases: you simply draw on the existing raw data as specific requirements become apparent.
By following this practice, you keep your data flexible, agile, and easy to update, which in turn lets you extract far more value, often in real time, from the information you cultivate.
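As a rough illustration (again in Python, with a file layout and field names I’ve invented for the example), the sketch below lands raw payloads untouched in one zone and treats each use case as a derivation over that raw data, rather than a new end-to-end pipeline.

```python
# Illustrative sketch: raw inputs stay separate from processed, use-case-specific views.
import json
from datetime import datetime, timezone
from pathlib import Path

RAW_ZONE = Path("data/raw/enquiries")      # immutable landing area for raw payloads
PROCESSED_ZONE = Path("data/processed")    # derived, use-case-specific views

def land_raw(payload: str) -> Path:
    """Store each payload exactly as received; the raw zone is never modified."""
    RAW_ZONE.mkdir(parents=True, exist_ok=True)
    path = RAW_ZONE / f"{datetime.now(timezone.utc):%Y%m%dT%H%M%S%f}.json"
    path.write_text(payload)
    return path

def build_market_summary() -> Path:
    """Derive one processed view (total enquiry value per market) from the raw zone."""
    totals: dict[str, int] = {}
    for raw_file in sorted(RAW_ZONE.glob("*.json")):
        record = json.loads(raw_file.read_text())
        totals[record["market"]] = totals.get(record["market"], 0) + int(record["value_aud"])
    PROCESSED_ZONE.mkdir(parents=True, exist_ok=True)
    summary = PROCESSED_ZONE / "market_summary.json"
    summary.write_text(json.dumps(totals, indent=2))
    return summary

if __name__ == "__main__":
    land_raw(json.dumps({"company": "Acme Pty Ltd", "market": "UK", "value_aud": 125000}))
    land_raw(json.dumps({"company": "Widgets Co", "market": "Japan", "value_aud": 98000}))
    # A new use case is just another derivation over the same raw data,
    # not a new end-to-end pipeline.
    print(build_market_summary().read_text())
```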
3. Build your technical infrastructure around your business infrastructure: start with the use-case.
This relates to another crucial practice outlined in our Data Pipelines Playbook. It’s essential that you think of data pipelines as products, not projects. This means your data pipeline should have a product owner: someone who can prioritise deliverables and assist in defining use cases.
These use cases are critical. Effective data practice always starts with the use case, rather than the technical implementation. A technical architecture must be driven by a business architecture, which should reflect the actual environment of the organisation in question.
You simply cannot define a high-performance, highly effective technical architecture without that fundamental context of business or organisational requirements. And business requirements are often defined by the use cases of the end users of the system, and the data it generates.
Without detailed understanding of those use-cases, how do you calibrate and measure the efficacy of your solution?
If you’re ready to embed leading data practice at the core of your organisation, let’s tee up a conversation.
Alternatively, take a look through some of our other pieces on data pipelines: