5 challenges with data-driven decisions we don’t talk about much

Apr 6, 2022

We develop sophisticated mechanisms to observe and capture reality, process that data in a variety of ways, and use it to improve decision making to help us achieve our goals.

Data-driven decisions can be highly beneficial in many cases, but should not replace the effort to gain a deep understanding of the market, customers, and business environment within the organization.

The following five reasons explain why data-driven decisions should be handled with care.

Incorrectly set focus

Either of two situations can cause us to miss the relevant data entirely:

The wrong metrics are used to make the decision. Because success is rarely represented directly by a measurable product metric, it has to be derived from other metrics, and that derivation can be wrong or incomplete. In the B2B segment, for example, the complex interactions and decision-making processes of clients, users, and clients’ customers can hardly be mapped, and therefore neither can the correlations between product and performance metrics.
Simpson’s paradox occurs. This happens when events are split into several categories whose weights differ widely. For example, two target groups with different values may change their relative share over time. If they are not considered separately, the shift in the mix alone will change the overall metric, even though nothing has changed for either target group. If a segment with a weak margin grows strongly while one with a good margin remains stable, the average margin will decrease.
Both cases illustrate that it is essential to understand the full context behind the collected metrics in order to draw the right conclusion. Only then should decisions be made based on such data.
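The effect of a shifting mix is easier to see with numbers. The following is a minimal sketch with made-up figures: the segment names, revenues, and margins are assumptions chosen purely for illustration. Each segment's margin improves slightly from one quarter to the next, yet the blended margin falls because the low-margin segment triples in size.

```python
# Hypothetical figures for two customer segments over two quarters.
# Both segment margins improve in Q2; only the revenue mix shifts.
periods = {
    "Q1": {
        "premium": {"revenue": 100_000, "margin": 0.40},
        "budget": {"revenue": 100_000, "margin": 0.10},
    },
    "Q2": {
        "premium": {"revenue": 100_000, "margin": 0.42},
        "budget": {"revenue": 300_000, "margin": 0.12},
    },
}


def blended_margin(segments):
    """Revenue-weighted average margin across all segments."""
    total_revenue = sum(s["revenue"] for s in segments.values())
    total_profit = sum(s["revenue"] * s["margin"] for s in segments.values())
    return total_profit / total_revenue


for name, segments in periods.items():
    print(f"{name}: blended margin {blended_margin(segments):.1%}")
    for segment, values in segments.items():
        print(f"  {segment}: margin {values['margin']:.1%}")
```

With these numbers the blended margin drops from 25.0% to 19.5%, although neither segment got worse. Looking only at the aggregate would suggest a problem that does not exist at the segment level.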

Incomplete or distorted baseline data

Correct logic alone does not guarantee correct data. Possible reasons:

Unintentionally incorrect technical implementations, so that what is actually measured is not what should be measured.
Variation: measurements that were originally correct but have since been distorted, unintentionally and unnoticed, by changes in “the system”.
Missing implementations, e.g., when certain sources such as native applications cannot be evaluated.
Regulatory or systemic limitations such as privacy regulations, cookie consents, technical barriers, etc.

The higher the logical and technical complexity of the measurement, the higher the probability of errors in the system. Decision-makers therefore need to be able to understand at any time what exactly is being measured and how reliable the underlying data is.

Volatility, periodization and single events

Periodic or one-off exceptional situations can distort the interpretation of the data:

Volatility: certain metrics fluctuate inherently because of the business model and are therefore hard to compare. For example, in the case of auctions or ticket sales, the availability of supply depends on market participants and does not have to be directly related to changes in the sales platform.
Periodization: customer behavior or general conditions change at certain times. Seasonal business, public holidays and vacation periods, Black Friday, etc. cause metrics to shift for a while without any intervention before they return to normal values. In this case, a comparison with the previous occurrence of the same period (for example, the same month of the previous year) is more appropriate, provided the general conditions have not otherwise changed much.
Single events: like periodization, they lead to changed conditions, but without regularity and predictability. The COVID-19 pandemic, for example, caused significant disruptions in some business models. These disruptions are likely to blend with the company’s own response and thus allow few isolated conclusions about the effectiveness of the product changes themselves.

These types of challenges can be mitigated to some extent by A/B tests that run the variants at the same time with sufficiently large test groups.
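The periodization point can be illustrated with a small calculation. This is a sketch with invented monthly order counts, and it assumes that "previous period" means the same month one year earlier. In a seasonal business, comparing January with December suggests a collapse, while comparing it with the previous January shows growth.

```python
# Hypothetical monthly order counts, keyed by (year, month).
orders = {
    (2020, 12): 1_500,
    (2021, 1): 900,
    (2021, 12): 1_650,
    (2022, 1): 990,
}


def relative_change(current, baseline):
    """Change of `current` relative to `baseline`, as a fraction."""
    return (current - baseline) / baseline


january = orders[(2022, 1)]

# Naive comparison with the directly preceding month (holiday peak).
vs_previous_month = relative_change(january, orders[(2021, 12)])

# Comparison with the same month of the previous year.
vs_same_month_last_year = relative_change(january, orders[(2021, 1)])

print(f"vs. December 2021: {vs_previous_month:+.0%}")
print(f"vs. January 2021:  {vs_same_month_last_year:+.0%}")
```

With these figures, the month-over-month view shows a 40% drop and the year-over-year view a 10% increase. The second comparison is only meaningful if the general conditions in both Januaries were broadly similar.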

Incorrect interpretation of data

Reading statistics is not easy for everyone. The following interpretation mistakes occur regularly:

Causality or cause-effect relationships are misinterpreted. A known or obvious cause is assumed, while in reality another cause is responsible for the measured effect.
Halo effect: A correct cause overshadows the fact that several or even many other causes contribute just as much but are overlooked.
Misreading the data. Statistical tools are easy to misuse. A single forgotten or incorrectly set filter is enough to produce a wrong interpretation.
Significance is not correctly assessed, leading to either incorrectly assumed or overlooked changes.

It is advisable to perform the analysis and interpretation in a diverse team with different areas of expertise, and to apply the four-eyes principle so that a second person reviews every result.
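To make the point about significance concrete, here is a minimal sketch of a two-sided two-proportion z-test using only the Python standard library. The function name and the conversion figures are assumptions for illustration, and the normal approximation it relies on is only reasonable for fairly large samples. It is one common way to assess an A/B result, not the only one.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    Returns the z statistic and the p-value under the normal approximation.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / standard_error
    # For a standard normal variable, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical data: the same observed uplift (10% vs. 12% conversion)
# measured with small and with large test groups.
z_small, p_small = two_proportion_z_test(100, 1_000, 120, 1_000)
z_large, p_large = two_proportion_z_test(1_000, 10_000, 1_200, 10_000)

print(f"1,000 users per group:  z = {z_small:.2f}, p = {p_small:.3f}")
print(f"10,000 users per group: z = {z_large:.2f}, p = {p_large:.6f}")
```

The same two-percentage-point uplift yields a p-value of about 0.15 with 1,000 users per group, which is not significant at the usual 5% level, and a p-value far below 0.05 with 10,000 users per group. Reading the uplift without the sample size invites exactly the two errors described above: assuming a change that may be noise, or dismissing one that is real.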

Accumulation of mistakes

Although a single error pattern is enough to cause wrong conclusions, the challenges mentioned above can also occur together. The more of them are in play, the less likely it is that the path from conception to interpretation stays free of mistakes.
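A rough calculation shows how quickly this adds up. The sketch below assumes, purely for illustration, that each stage is error-free with a probability of 95% and that the stages fail independently of one another. Neither the figures nor the independence assumption come from real measurements.

```python
from math import prod

# Hypothetical probability that each stage is free of errors.
stage_ok = {
    "metric selection": 0.95,
    "data collection": 0.95,
    "handling of periodic and single events": 0.95,
    "interpretation": 0.95,
}

# Under the independence assumption, the chance that every stage is
# error-free is the product of the individual probabilities.
p_all_ok = prod(stage_ok.values())

print(f"Probability that all stages are error-free: {p_all_ok:.1%}")
print(f"Probability of at least one error: {1 - p_all_ok:.1%}")
```

Even with four stages that are each quite reliable, the chance of getting through without any error falls to roughly 81% under these assumptions.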

We should therefore have a comprehensive picture of the context and the available data, including how it is collected. We should check for hidden variables and biases, and make sure that interpretations are not made carelessly. Otherwise, we risk drawing a conclusion that misleads us.

Results that do not fit our own perception of reality should therefore be inspected to verify that they were derived correctly. Such detective work sometimes leads to quite interesting findings.

This article was first published by Traian Kaiser on Medium