Subscribe to the Supply Chain Planning Blog

Keep up with the latest trends, research, and insights about supply chain planning, demand forecasting and inventory optimization.


Expensive Data Science Mistakes: When Your KPIs Lie

By Fabrizio Fantini • 8 Sep 2020

Business Science must be Autonomous, not just science

When KPIs lie

Was there something we could do together to improve sales? That is what the beauty company’s Russian country manager asked my business partner Darya.

We quickly launched a demo pilot, changing the supply chain delivery algorithms in two locations, and the results were excellent: in just a few days we improved sales by more than 10%, which was unprecedented in my experience.

Now, I am a firm believer in management by numbers, but this is a cautionary tale of the potentially insidious and costly pitfalls of data-driven processes.

In this particular example, the company's KPIs were being measured ineffectively, which led to an interesting exchange that sheds some light on potentially expensive data science mistakes.

In fact, rather than being happy to celebrate such outstanding impact, the first reaction of the supply chain manager from this company was:

These results are not possible! My KPIs did not improve, so the results must be wrong.


At which point I asked what the KPIs were. An example:

Now, the manager was a smart person acting in good faith.

But I started suspecting that such spectacular and quick results might be partly enabled by a fundamental blind spot in the company management practice.

Was this company really 97% available on-shelf?

So I went to check out a few locations and took pictures of shelves first-hand. Some examples:

These were in stark contrast with the numbers as painted by the KPI dashboard, and therefore as viewed by management.

Management truly believed in their reported 97% availability, did not see any problem with this specific KPI, and therefore ultimately did not believe (any) sales improvement was possible through supply chain demand forecasting.

Since the KPI as previously measured did not actually improve, how could an apparently related sales figure improve? This is a common fallacy of data-driven management: using the wrong data to answer the right question (or vice versa, in other cases).

To understand how to avoid such common issues, we first need to ask why they occur at all.

The problem: new leadership skills to effectively manage by numbers

It is actually not easy to manage by numbers, despite all the data-driven frenzy around machine learning and artificial intelligence. In fact, the more data, the harder it becomes.

How, When and Why do you measure KPIs?

  • How to measure KPIs: it is not easy to measure shelf availability systematically. Even a clear picture of how much inventory is in a certain location is often only x% accurate; and even then, is the product actually displayed on the shelf, or sitting in the backroom storage?
  • When to measure KPIs: how frequently should a measurement be taken? Is it better to have frequent (weekly, in this example) reporting that is clearly inaccurate, or less frequent (monthly or quarterly) reporting from a more thorough and potentially more expensive, but more accurate, process?
  • Why to measure KPIs: this is the king of all questions, since any managerial decision should stem from the impact a KPI is supposed to have. For example, it is very different to ask whether a product should be available in a store, whether it is actually physically present there, and whether it is displayed or not. These are all legitimate questions, but why exactly should a manager ask them? In other words, what impact should the KPI lead to?
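To make the "how" and "why" concrete, here is a minimal Python sketch of how the same stores can report very different availability depending on which question the KPI answers. Everything in it is an assumption for illustration: the `ShelfRecord` fields, the `availability` helper and the four sample records are hypothetical, not the company's data or its corporate standard.

```python
from dataclasses import dataclass


@dataclass
class ShelfRecord:
    store: str
    product: str
    listed: bool       # the product should be available in this store
    system_stock: int  # units the inventory system believes are in the store
    on_shelf: bool     # the product was seen on the shelf during a physical check


def availability(records, is_available):
    """Share of listed store/product pairs for which is_available(record) holds."""
    listed = [r for r in records if r.listed]
    if not listed:
        return 0.0
    return sum(1 for r in listed if is_available(r)) / len(listed)


# Hypothetical records, invented purely for illustration.
records = [
    ShelfRecord("A", "lipstick", True, 5, True),
    ShelfRecord("A", "mascara", True, 3, False),   # in stock, but in the backroom
    ShelfRecord("B", "lipstick", True, 2, False),  # in stock, but not displayed
    ShelfRecord("B", "mascara", True, 0, False),   # genuinely out of stock
]

# Same stores, two different questions, two different KPIs.
system_kpi = availability(records, lambda r: r.system_stock > 0)
shelf_kpi = availability(records, lambda r: r.on_shelf)

print(f"availability according to the system: {system_kpi:.0%}")
print(f"availability seen on the shelf:       {shelf_kpi:.0%}")
```

With these made-up records the system-based KPI reads 75% while the shelf-based KPI reads 25%. Neither number is wrong as arithmetic; they simply answer different questions.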

So I asked management why they were measuring the KPI, and why in this particular way, and the answer was…

Our measurement is done according to the corporate standard

(supply chain manager)

A perfectly legitimate answer, almost by the book of box-ticking, which is the reality of large companies forcing otherwise smart and very talented managers to spend their days dealing with non-value-adding activities.

The solution: driving value with Business Science

I have argued before that Data Science is dead.

 Data Science is Dead. Long Live Business Science!


This tale is one example of why!

Measuring, analysing and even reporting numbers may not be good enough, unless starting from the reason why.

I am proud of having demonstrated the value of data in this experiment, but I fundamentally believe that the vast majority of it derived from shifting the management perspective.

← From the question: how to improve forecast accuracy

→ To the question: how to improve sales even when numbers are inaccurate

Shifting to a value-first question is the key to Business Science

It actually does not matter much what the forecast accuracy is, if the objective function is not directly value creation. For example, inaccurate KPI measurement may lead to a distortion in the perception of impact.

Addressing the 3 sources of management error

To fill demand with the best possible predictive accuracy and business impact, an autonomous data science model must address all three sources of error simultaneously:

1. Usage error, caused by using the right model for the wrong purpose, i.e. even when assuming an otherwise perfect model. For example, measuring the wrong KPI.

2. Input error, caused by feeding the right model the wrong input, i.e. even when assuming zero residual error. For example, a wrong KPI measurement.

3. Residual error, caused by using the wrong model, one with systematic weaknesses, i.e. even when assuming perfect model input. For example, failing to improve the right KPI.

Examples of each type of error:

  • Usage error: for example, using the company budget to drive supply chain decisions, or asking the salesforce for their expected projections and then using these data points to forecast sourcing and inventory purchasing; these are common use cases, where the right model is used for the wrong purpose, driving systematically inaccurate supply chain decisions
  • Input error: for example, running a promotion which was not planned; or experiencing a macro-economic shock like a crisis or pandemic, which was not expected; even a perfect model with zero model error generates incorrect predictions given the wrong input
  • Residual error: for example, predicting X and observing Y ≠ X; typically considered the ‘main’ type of error, even if typically accounting for a small fraction of the overall deviation from the optimal solution!
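One way to picture the three sources is as a telescoping split of the gap between the number that actually drove the decision and what really happened. The Python sketch below is only an illustration under my own assumptions: the function `decompose_deviation`, its parameter names and the sample figures are hypothetical, and a simple additive split is a simplification rather than an established method.

```python
def decompose_deviation(number_used, forecast_planned_inputs,
                        forecast_actual_inputs, actual):
    """Split the gap between the number that drove the decision and the outcome.

    number_used:             what was actually used to decide (e.g. the budget)
    forecast_planned_inputs: what the model said, given the inputs known at the time
    forecast_actual_inputs:  what the model says once the real inputs are plugged in
    actual:                  what really happened
    """
    usage = number_used - forecast_planned_inputs
    input_ = forecast_planned_inputs - forecast_actual_inputs
    residual = forecast_actual_inputs - actual
    return {
        "usage": usage,
        "input": input_,
        "residual": residual,
        "total": usage + input_ + residual,
    }


# Hypothetical figures, in units: the budget drove the purchase, then an
# unexpected shock hit demand.
parts = decompose_deviation(
    number_used=1500,
    forecast_planned_inputs=1200,
    forecast_actual_inputs=900,
    actual=870,
)
print(parts)
```

In this invented case the total gap is 630 units, of which only 30 are residual error. The three parts always sum to the total gap by construction, so the split says where the gap came from, not how large it is.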

Overcoming the limitations of data: managing autonomous systems

Each source of error must, crucially, be addressed at the same time:

  • Usage error: tailoring a different forecast for each specific use case
  • Input error: autonomous test-and-learn to adjust for the unforeseen
  • Residual error: meta-model to select the best combination of models.
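As a rough illustration of the last bullet, a meta-model can be as simple as a search over weighted blends of candidate forecasts, scored on past data. The sketch below assumes a coarse weight grid and a pluggable `score` function; the names `candidate_weights`, `blend`, `select_blend` and `negative_mae`, and the sample numbers, are all hypothetical. The score used here is accuracy only to keep the example short; any other scoring function can be passed in its place.

```python
from itertools import product


def candidate_weights(n_models, steps=4):
    """Yield every weight vector on a grid of 1/steps whose weights sum to 1."""
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) == steps:
            yield tuple(k / steps for k in combo)


def blend(forecasts, weights):
    """Weighted average of several forecasts, period by period."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(horizon)]


def select_blend(forecasts, actuals, score):
    """Return the weight vector whose blended forecast scores highest."""
    return max(candidate_weights(len(forecasts)),
               key=lambda w: score(blend(forecasts, w), actuals))


def negative_mae(predicted, actuals):
    """Higher is better: minus the mean absolute error."""
    return -sum(abs(p - a) for p, a in zip(predicted, actuals)) / len(actuals)


# Hypothetical history: one model runs low, the other runs high.
model_low = [10, 12, 14]
model_high = [14, 16, 18]
actuals = [12, 14, 16]

best_weights = select_blend([model_low, model_high], actuals, negative_mae)
print(best_weights)
```

Here neither model is right on its own, but an equal blend of the two matches the history exactly, which is the kind of combination a meta-model is meant to find.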

 To be effective, Business Science must be designed to be Autonomous, not just science.

The most effective mitigation against input error is self-learning; since sometimes prediction errors can be profitable, the learning objective must be profit, not accuracy!

For example, poor performance that was predicted exactly would register zero forecast error and yet still have a negative EBITDA impact; clearly, a different definition of optimal is needed.
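A small worked example shows how the two definitions of optimal can diverge. This is a sketch under assumptions of my own: the demand scenarios, the unit margin and the cost of an unsold unit are invented, and `expected_profit` and `mean_abs_error` are hypothetical helper names.

```python
def expected_profit(quantity, demand_scenarios, margin, unsold_cost):
    """Average profit over equally likely demand scenarios."""
    total = 0.0
    for demand in demand_scenarios:
        sold = min(quantity, demand)
        total += margin * sold - unsold_cost * (quantity - sold)
    return total / len(demand_scenarios)


def mean_abs_error(quantity, demand_scenarios):
    """Average absolute gap between the quantity and each demand scenario."""
    return sum(abs(quantity - d) for d in demand_scenarios) / len(demand_scenarios)


# Hypothetical, equally likely demand outcomes for one product in one store.
demand_scenarios = [8, 10, 12, 14, 16]
candidates = range(8, 17)

most_accurate = min(candidates, key=lambda q: mean_abs_error(q, demand_scenarios))
most_profitable = max(
    candidates,
    key=lambda q: expected_profit(q, demand_scenarios, margin=9, unsold_cost=1),
)

print("quantity minimising forecast error:", most_accurate)
print("quantity maximising expected profit:", most_profitable)
```

With a high margin and a low cost of unsold goods, the accuracy-optimal quantity is 12 while the profit-optimal one is 16: a deliberately "less accurate" number earns more, because a missed sale costs far more than a leftover unit.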

Autonomous systems mitigate usage and input errors, typically the most significant sources of under-performance.

Effective management by numbers requires targeting profit improvement directly

Demand modelling requires complex approaches, like meta-modelling, which can be partially black-box: the nature of the problem is intrinsically challenging, as purchasing decisions are neither entirely rational nor entirely explainable.

However, the supply model is always entirely explainable.

Optimal supply requires defining the strategic objective, for example optimal economic profit net of the cost of unsold goods, as well as the set of constraints and rules that must be respected to fit within the desired strategy.
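The supply side can be sketched in a few fully readable lines: an objective, a set of rules, and a plan that reports which rule it is sitting on. The function `plan_supply`, its parameters and the numbers are hypothetical, and the min/max stock rule stands in for whatever constraints managers actually set.

```python
def plan_supply(demand_scenarios, unit_margin, unsold_unit_cost,
                min_stock, max_stock):
    """Pick the stock level within [min_stock, max_stock] with the best
    expected profit net of the cost of unsold goods."""

    def profit(quantity):
        return sum(
            unit_margin * min(quantity, d) - unsold_unit_cost * max(quantity - d, 0)
            for d in demand_scenarios
        ) / len(demand_scenarios)

    best = max(range(min_stock, max_stock + 1), key=profit)
    if best == max_stock:
        rule = "max_stock"
    elif best == min_stock:
        rule = "min_stock"
    else:
        rule = None
    return {"quantity": best, "expected_profit": profit(best), "rule_reached": rule}


# Hypothetical demand scenarios (as they might come from the demand model)
# and manager-set stock limits.
plan = plan_supply(
    demand_scenarios=[8, 10, 12, 14, 16],
    unit_margin=9,
    unsold_unit_cost=1,
    min_stock=5,
    max_stock=13,
)
print(plan)
```

In this invented case the plan stops at 13 units because the manager's maximum says so, and the output states that explicitly. The demand scenarios may come from a black box, but every step from scenarios to order quantity can be read and challenged.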

In short, a new human-machine alliance

On the one hand, advanced demand modelling on auto-pilot; on the other hand, classic and fully-explainable supply-side models to guarantee optimal outcomes at all times.

This works because it addresses at the same time a number of concerns:

  • Senior managers achieve sales and profit improvements, which are easier to read and understand than technical forecasting considerations
  • Hands-on managers remain in control of strategy, objectives, constraints and therefore continue to set the rules of the game as they always did; for example, min/max inventory per product
  • Customer purchase decisions, rather than potentially wrong KPI data or management belief, become the driving force of self-learning, therefore any type of error (even dirty data errors!) will automatically self-correct.

Management by numbers, redefined: the autonomous revolution.

Happy KPIs!

PS more Business Science:

 How Business Science Revealed Hidden Pricing Opportunities During Covid-19

 The machine vs. human debate


To get Business Science software, University-level learning (launching October 2020), and a monthly summary of insights: 

Free registration

Any questions? Please connect on LinkedIn
