Tech Focus Tue 10th May, 2022
5 Major MLOps Challenges – and how to solve them
If you’re struggling to realise the full potential of machine learning in your organisation, the good news is that you’re not alone. According to industry analysts VentureBeat, 87% of AI projects will never make it into production.
MLOps emerged to address this widespread challenge. By blending AI and DevOps practices, MLOps promised smooth, scalable development of ML applications.
The bad news is that MLOps isn’t an immediate fix for all AI projects. Operationalising any AI or machine learning solution will present its own challenges, which must be addressed to realise the potential these technologies offer. Below we’ve outlined five of the biggest MLOps challenges in 2022, and some guidance on solving these issues in your organisation.
You can read about these ideas in more detail in our new MLOps playbook, “Operationalising Machine Learning”, which provides comprehensive guidance for operations and AI teams in adopting best practice around MLOps.
Challenge 1: Lack of user engagement
Failing to help end users understand how a machine learning model works or what algorithm is providing an insight is a common pitfall. After all, this is a complex subject, requiring time and expertise to understand. If users don’t understand a model, they are less likely to trust it, and to engage with the insights it provides.
Organisations can avoid this problem by engaging with users early in the process and asking what problem they need the model to solve. Demonstrate and explain model results to users regularly, and allow users to provide feedback as the model is iterated. Later in the process, it may be helpful to let end users view monitoring and performance data, so that you can build trust in new models. If end users trust ML models, they are more likely to engage with them, and to feel a sense of ownership and involvement in that process.
Challenge 2: Relying on notebooks
Like many people we have a love/hate relationship with notebooks such as Jupyter. Notebooks can be invaluable when you are creating visualisations and pivoting between modelling approaches.
However, notebooks contain both code and outputs, along with important business and personal data, meaning it’s easy to inadvertently pass data to where it shouldn’t be. Notebooks don’t lend themselves easily to testing, and because cells can be run out of order, the same notebook can produce different results depending on the order in which its cells are run.
In most cases, we recommend moving to standard modular code after creating an initial prototype, rather than using notebooks. This results in a model that is more testable and easier to move into production, with the added benefit of speeding up algorithm development.
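As a rough illustration of what moving from a notebook cell to modular code can look like, the sketch below pulls one data-cleaning step into a plain Python function with a test beside it. The function name `fill_missing` and its behaviour are hypothetical, chosen only for this example; they are not taken from the playbook.

```python
from statistics import mean


def fill_missing(values, default=None):
    """Replace None entries with the mean of the known values.

    Hypothetical cleaning step: in a notebook this logic would typically
    live in an untested cell that depends on earlier cells having run.
    """
    known = [v for v in values if v is not None]
    if not known:
        # Nothing to average, so fall back to an explicit default.
        if default is None:
            raise ValueError("no known values and no default supplied")
        return [default for _ in values]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def test_fill_missing_uses_mean_of_known_values():
    # A test runner such as pytest would collect this function automatically.
    assert fill_missing([1.0, None, 3.0]) == [1.0, 2.0, 3.0]
```

Because the function takes its inputs as arguments rather than reading notebook state, it gives the same result no matter what ran before it.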
Challenge 3: Poor security practice
There are a number of common security pitfalls in MLOps that should be avoided, and it’s important that organisations have appropriate practices in place to ensure secure development protocols.
For example, it’s surprisingly common for model endpoints and data pipelines to be publicly accessible, potentially exposing sensitive metadata to third parties. Endpoints must be secured to the same standard as any other software development, to avoid the cost management and security problems caused by uncontrolled access.
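One simple way to picture the difference between an open and a secured endpoint is a handler that refuses to call the model unless the caller presents a valid key. The sketch below is framework-agnostic and makes several assumptions: the `X-API-Key` header name, the `handle_predict` function and the shared-secret scheme are all illustrative. A real deployment would more likely rely on the authentication provided by its serving platform or API gateway.

```python
import hmac


def is_authorised(presented_key: str, expected_key: str) -> bool:
    """Compare keys in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(presented_key.encode(), expected_key.encode())


def handle_predict(headers: dict, features: list, expected_key: str, predict):
    """Hypothetical request handler: returns (status_code, response_body).

    The model (`predict`) is only called once the caller has been authenticated.
    """
    key = headers.get("X-API-Key", "")  # header name is an assumption
    if not key or not is_authorised(key, expected_key):
        return 401, {"error": "unauthorised"}
    return 200, {"prediction": predict(features)}
```

Rejecting unauthenticated requests before the model runs also limits the compute cost that uncontrolled access can generate.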
Challenge 4: Using Machine Learning inappropriately
Despite the hype, ML shouldn’t always be the default way to solve a problem. AI and ML are essentially tools that help to understand complex problems like natural language processing and machine vision.
Applying AI to real-world problems that aren’t like this is unnecessary, and leads to too much complexity, unpredictability and increased costs. You could build an AI model to predict whether a number is even or odd – but you shouldn’t.
When addressing a new problem, we advise businesses to try a non-ML solution first. In many cases, a simple, rule-based system will be sufficient.
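To make the even/odd example concrete, here is a minimal sketch of a rule-based solution, plus a small helper for scoring any rule against labelled examples. The helper `rule_accuracy` and the sample data are hypothetical; the idea is that a rule which already scores well on representative data is evidence that a model may not be needed.

```python
def is_even(n: int) -> bool:
    """Rule-based answer to the even/odd question: no training data required."""
    return n % 2 == 0


def rule_accuracy(rule, labelled_examples) -> float:
    """Fraction of (input, expected) pairs that the rule gets right."""
    examples = list(labelled_examples)
    if not examples:
        raise ValueError("need at least one labelled example")
    correct = sum(1 for x, expected in examples if rule(x) == expected)
    return correct / len(examples)


# Illustrative labelled data: (number, is it even?)
examples = [(2, True), (7, False), (10, True), (-3, False)]
baseline = rule_accuracy(is_even, examples)
```

The same scoring helper can later be pointed at an ML model's predictions, giving a like-for-like comparison with the rule-based baseline.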
Challenge 5: Forgetting the downstream application of a new model
Achieving ROI from machine learning requires the ML model to be integrated into business systems, with due attention to usability, security and performance.
Integration takes time, and the process becomes even longer if models are not technically compatible with business systems, or do not deliver the expected level of accuracy. These issues must be considered at the start of the ML process, to avoid delays and disappointment.
A common ML model might be used to predict ‘propensity to buy’ – identifying internet users who are likely to buy a product. If this downstream application isn’t considered when the model is built, there is no guarantee that the data output will be in a form that can be used by the business API. A great way to avoid this is by creating a walking skeleton or steel thread (see our Playbook for advice on how to do this).
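As one possible reading of the walking-skeleton idea, the sketch below wires a placeholder model through to an output payload and checks that payload against the shape a downstream API might expect, before any real modelling work is done. The field names `user_id` and `propensity_to_buy`, the JSON format and the constant score are all assumptions made for illustration; the actual contract would come from the business system.

```python
import json

# Assumed output contract for the downstream business API (hypothetical).
REQUIRED_FIELDS = {"user_id": str, "propensity_to_buy": float}


def stub_model(user_features: dict) -> float:
    """Placeholder model: returns a constant score until a real model exists."""
    return 0.5


def to_api_payload(user_id: str, score: float) -> str:
    """Serialise a prediction in the form the downstream API is assumed to accept."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("propensity must be between 0 and 1")
    return json.dumps({"user_id": user_id, "propensity_to_buy": float(score)})


def validate_payload(payload: str) -> bool:
    """Check that the payload has exactly the expected fields and types."""
    record = json.loads(payload)
    return set(record) == set(REQUIRED_FIELDS) and all(
        isinstance(record[name], kind) for name, kind in REQUIRED_FIELDS.items()
    )


payload = to_api_payload("user-123", stub_model({"visits": 4}))
```

Once this end-to-end path works with the stub, the real model can replace `stub_model` without changing what the business API receives.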
Find out more about these and other challenges in our new Operationalising Machine Learning Playbook, which is available to read here.