What Is a Data Product Owner? Responsibilities, Challenges and Best Practices

Thorben Louw

Data/Machine Learning Engineer
Data

January 21, 2025

How might a Product Owner who suddenly finds themselves owning a “data product” translate their existing product skills to data?

 

Many organisations now prefer product-based development (as opposed to project-based development) as a way to structure and deliver initiatives within their organisations. This usually means:

  • Having a long-term vision and an ongoing lifecycle for a product, rather than short-lived, tightly-scoped projects
  • A strong sense of ownership (i.e. a “Data Product Owner”) and ongoing business value that the product delivers
  • A user-centric focus that informs the prioritised iterative delivery of features, as opposed to having a sense of everything that must happen for a project to be “done”
  • Constant measurement of the product’s value, and
  • Organising people as a cross-functional team focused on the product, which maximises communication and breaks down the traditional delivery barriers that arise when teams are structured around job roles (e.g. “the IT data team”) rather than the product.

While the product approach is now common for building services and apps, the data world has been slower to adopt “data products”. Usually, there’s even debate about what exactly the data products are in a client’s organisation.

Designating people as “Data Product Owners” without defining quite what that means can lead to frustration, a lack of real change, and not making good on the promises of product-based delivery for data.

Thinking about data (e.g. datasets or APIs) and insights (reports, dashboards, ML models) as end-user focused products, rather than a by-product of business operations or technical assets, has really taken off in the last 5-8 years. Reframing data assets as these “data products” means:

  • Having clear ownership (traditionally a huge headache for data in enterprises), often vested in a data product owner
  • Being clear about who the users of data are and what their frustrations and feature requests might be
  • Structuring teams and funding so that data products are iteratively and continuously developed, rather than in uncoordinated fits and starts across many small projects
  • Being able to quantify, communicate and influence the quality of data for consumers
Data product thinking is especially associated with the Data Mesh paradigm, but you don’t need to be “doing data mesh” to benefit from treating data as a product.

❗Warning: We’ve seen some organisations try to extend the definition of ‘data products’ to the platform services that support building good data products (e.g. catalogs, lineage) or the infrastructure used to “host” data products (databases, data lakes, reporting tools), but this just leads to confusion. We recommend you establish some organisational clarity about what a data product is. Snowflake, for example, isn’t a data product (and neither are TensorFlow, Airflow, Power BI or Azure Data Factory). Those are all examples of things we use to build and serve an organisation’s data products. Your data lake, data catalog, data integration tool? Those aren’t data products either!

Types of data products

Source data products

These encapsulate the integration of a source system’s data into the analytics platform and make the data available for other downstream products to use to build “derived data products” (like reports or ML models). Source data products typically take responsibility for owning the data quality, freshness, resilience of pipelines and cleansing/deduplication of data from the source system. They expose metrics to help their consumers understand data quality. Source data products must be receptive to how data is used, and roadmap requests might involve: increasing the scope of data (more tables from the source system); supporting a variety of velocity needs (streaming vs batch); and allowing data to be consumed in different ways (maybe via APIs or governed tables in a lakehouse).
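To make the "expose metrics" idea concrete, here's a minimal sketch of the sort of quality metrics a source data product could publish alongside its data. The metric names, toy dataset and field names are purely illustrative, not a standard:

```python
from collections import Counter

def quality_metrics(rows, key_field):
    """Compute basic quality metrics a source data product could expose."""
    total = len(rows)
    # Share of rows with any missing (None) value
    incomplete = sum(1 for r in rows if any(v is None for v in r.values()))
    # Duplicate detection on the declared business key
    key_counts = Counter(r[key_field] for r in rows)
    duplicates = sum(c - 1 for c in key_counts.values() if c > 1)
    return {
        "row_count": total,
        "null_rate": incomplete / total if total else 0.0,
        "duplicate_keys": duplicates,
    }

# Toy extract standing in for data landed from a source system
customers = [
    {"id": 1, "name": "Asha"},
    {"id": 2, "name": None},   # missing value
    {"id": 2, "name": "Ben"},  # duplicate business key
]
print(quality_metrics(customers, "id"))
```

In practice these numbers would be computed by the pipeline on each load and surfaced in a catalog entry or dashboard, so consumers can judge fitness for use without asking the team.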

Reports and interactive data dashboards

These are business tools to visualise and understand business metrics and deliver summarised insights. These products might include the data pipelines that build the data models the reports depend on, the reports themselves in tools such as Power BI or Tableau, or visualisations in “data science apps” built with frameworks like Streamlit.

Data APIs

Data APIs provide real-time access to datasets or processed information. They might expose data in the form of RESTful, GraphQL or stream APIs.
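As a sketch of what a data API product boils down to: a governed dataset behind an endpoint that consumers can filter and consume as JSON. The dataset, route shape and field names below are invented for illustration; in a real product the handler would sit behind a framework such as FastAPI or Flask rather than being a plain function:

```python
import json

# Toy dataset standing in for a governed table (illustrative)
ORDERS = [
    {"order_id": "A1", "region": "EMEA", "total": 120.0},
    {"order_id": "A2", "region": "APAC", "total": 80.0},
]

def get_orders(region=None):
    """Handler for a hypothetical GET /orders?region=... endpoint.

    Returns a JSON string so the response shape is explicit."""
    rows = [o for o in ORDERS if region is None or o["region"] == region]
    return json.dumps({"count": len(rows), "items": rows})

print(get_orders(region="EMEA"))
```

The product-owner concerns are the same as for any API: a stable, versioned response schema, documented filters, and service levels on latency and availability.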

Feature Datasets

These are curated datasets for model training, usually feeding a variety of downstream ML Model products.

Semantic Models

These are curated datasets and measures for generalised reporting purposes, providing a consistent definition of important business measures and metrics that can be used by a variety of reports.

ML Models

ML models (e.g. fraud detection models, personalisation models for marketing, recommenders) are for prediction or classification. It’s particularly important to treat these as never-finished products with an ongoing lifecycle, because their usefulness and impact are so vulnerable to data drift and the need for periodic retraining.
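The "ongoing lifecycle" point usually means monitoring live inputs against the training data and triggering retraining when they diverge. Here's a deliberately crude sketch of that idea, using a standardised mean shift on one feature; real products would use proper drift metrics (PSI, KS tests) and the threshold here is illustrative, not a recommendation:

```python
from statistics import mean, stdev

def drift_score(train_values, live_values):
    """Standardised shift of the live mean relative to the training distribution.

    A stand-in for proper drift metrics; a large score suggests the
    live data no longer looks like what the model was trained on."""
    mu, sigma = mean(train_values), stdev(train_values)
    return abs(mean(live_values) - mu) / sigma if sigma else 0.0

# Illustrative feature values at training time vs in production
train = [10, 12, 11, 13, 12, 11, 10, 12]
live = [18, 19, 17, 20, 18, 19]

score = drift_score(train, live)
print(f"drift score: {score:.2f}, retrain: {score > 2.0}")
```

Surfacing a metric like this on the product's dashboard turns "the model got worse" from a surprise into a planned roadmap conversation about retraining.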

The burden of being able to articulate data concepts

In traditional product development, we often start with a relatable “product” (for example, a website or app) where non-technical users can reason about the product and its value without needing to know too much about how it was built. It’s easier to get people excited about things they can see and use.

Unfortunately, for data products, this is rarely the case. You’ll need a certain amount of familiarity with data engineering, data science, and analytics concepts (e.g., ETL processes, machine learning, data modelling, statistics concepts), and you’ll probably face a bigger communication burden in translating these (somewhat dry) details into concepts that are relatable for business stakeholders.

Those building data products (and especially ML products) in organisations where these concepts are new must acknowledge and accept that educating stakeholders about the challenges and differences of these products is part of the deal. Stakeholders often ask “why does it take so long to build a simple report?!” That’s because they can see the report, and so care about it, but don’t see the work that goes into making the reliable data pipelines which feed the report. Often nobody has explained to them the scale of the work behind the scenes.

For AI and ML products this is even harder. Business-level stakeholders see impressive demos but struggle to understand the processes involved in running ML products and why they cost so much, or why products lose their effectiveness over time without constant attention. The difficulty of consistently achieving good results can look like product-team failure, rather than a limitation of complex models or of the available training data.

This can be a real challenge as a product owner if you can’t clearly articulate data concepts yourself. I recommend working really closely with your product team’s engineers and data scientists, and being genuinely curious about the details of how things are built. Ask your team to explain technical concepts simply or point you to resources that do this without getting bogged down in maths or minutiae.

Build the sort of team relationships where everyone agrees that getting clear messages across to stakeholders is a vital outcome and a shared responsibility – that’s just as important as building the product.

The role of a Data Product Owner

As a Data Product Owner (sometimes called a Data Product Manager), you are ultimately accountable for maximising the value of a data product in an organisation.

The core things to give careful attention to are:

  • Communication – “Selling” your product to potential new consumers in the organisation, championing it and promoting adoption. Help stakeholders understand progress and roadmap priorities, and manage expectations. Arrange showcases tailored to a variety of audiences (data products can be a little dry, so you need to add the excitement).
  • Helping define requirements – You’re probably working with a BA, so this is a shared responsibility. Write clear, simple specifications based on business needs, and work with the team to translate them into actionable user stories/specs for data pipelines, dashboards, ML models etc.
  • Maintaining the roadmap and prioritising features – Communicating what is being built next, or in the future, understanding the needs of your consumers and prioritising them. Regularly review as expectations and business priorities shift.
  • Measuring your product’s impact – It’s the product owner’s job to define good success metrics (KPIs), make sure you’re measuring them, and report on them. Measuring engagement and usage of your data product falls into this category too. It can be really difficult to quantify a monetary benefit for data products, since they tend to be upstream of other kinds of products which claim the benefit. Often data products don’t have a quantifiable impact but are still critical for the business (e.g. reports).
  • Cost control – Understand and report on what your data product costs to run on your data platform, and prioritise appropriate cost optimisations and trade-offs. Customers are often horrified at what data pipelines cost (and the impact of insisting on “real-time” reports that nobody looks at). Ensure that your product delivers more value than it costs to run, and work with your data engineers to build cost measures into the product.
  • Data product quality – Data products are usually at the mercy of upstream source systems that we don’t own, but we can still measure and influence data quality, and manage those relationships with source system owners. Importantly, we need to communicate data quality in our product to our users, especially if there are issues. Make sure data products include visualisations of quality metrics, and that these stay within SLAs (data freshness, timeliness, data quality tests, data contracts).
  • Stakeholder management – Understanding stakeholders’ needs, managing expectations, and communicating (releases, bugs, operational incidents, roadmap sessions)
  • Defining the vision and strategy for your product – Making sure this aligns with the organisation’s broader objectives
  • Compliance with your organisation’s data standards and policies. These might include:
    • Retention policies
    • GDPR and other data privacy and legal compliance
    • Data ethics
    • Integration with catalog
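The freshness SLA mentioned above can be made concrete: a product can check its last successful load against an agreed SLA and surface the result to consumers. A minimal sketch, where the 24-hour SLA and the field names are invented for illustration:

```python
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(hours=24)  # illustrative SLA, agreed with consumers

def freshness_status(last_loaded_at, now=None):
    """Report whether the product's data is within its freshness SLA."""
    now = now or datetime.now(timezone.utc)
    age = now - last_loaded_at
    return {
        "age_hours": round(age.total_seconds() / 3600, 1),
        "within_sla": age <= FRESHNESS_SLA,
    }

# Fixed "now" so the example is reproducible
now = datetime(2025, 1, 21, 12, 0, tzinfo=timezone.utc)
print(freshness_status(datetime(2025, 1, 21, 2, 0, tzinfo=timezone.utc), now=now))
print(freshness_status(datetime(2025, 1, 19, 12, 0, tzinfo=timezone.utc), now=now))
```

Publishing this status alongside the data (in a dashboard or catalog entry) is what turns an internal pipeline detail into something consumers can actually rely on.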

Being a data product owner is both a challenge and an opportunity to bridge the gap between technical complexity and business value. By understanding the nuances of data products, from source systems to ML models, and building strong communication skills, you can ensure your data products deliver meaningful outcomes. Embracing collaboration, prioritising stakeholder education, and maintaining a user-centric mindset are key to thriving in this role. As organisations increasingly focus on data-driven decision-making, the data product owner is positioned to become a vital driver of innovation and business success.
