February 15, 2017 at 3:17 PM
“Lies, damned lies and statistics” is the frequently quoted adage attributed to former British Prime Minister Benjamin Disraeli. The manipulation of data to fit a narrative is a very common occurrence, from politics and economics to business and beyond.
In this blog post, we'll touch on the more common logical fallacies that can be encountered and should be avoided in data analysis.
Logical fallacies in data interpretation
Statistics are simply numbers – how we (choose to) interpret them is up to us mere mortals and the key mental tool of critical thinking. It’s time to think about thinking.
A core competency of a data scientist is to be able to translate effects and patterns of data into real-life context. For business managers who need to periodically analyse data, an essential skill is ensuring that an interpretation is indeed correct and not tarnished by one of the many logical fallacies or mental mistakes that we all make in everyday life.
Philosophers have long understood the mistakes we make in reasoning. Aristotle is the first known philosopher to have established a list of logical fallacies. These fallacies are common mistakes made in arguing and thinking, and an awareness of them is extremely helpful in sharpening one’s analytical ability. Developing awareness of one's own thinking is known as metacognition, and it is a key component of critical thinking.
When it comes to analysing data or assessing a conclusion from the data or models presented, it is worthwhile to be aware of the reliability of what is presented. The reliability may vary due to:
- the quality of the data,
- problems encountered in developing models (you’ll see some examples in a previous blog), and
- the often ignored problems with interpreting the data or results.
Below is a list of the more common logical data fallacies to avoid in analytics - I'll be covering some of these in more detail in future blogs:
- The Cherry-picking fallacy of selectively choosing your data or statistics to prove your argument, or using confirmation bias and motivated reasoning instead of deductive reasoning in your analysis;
- The Texas Sharpshooter fallacy of looking for patterns, but ignoring contradictions;
- Correlation does not imply causation: similarities between two statistics or trends do not imply that one caused the other. Here are some funny examples of this;
- The Gambler's fallacy of looking at an unlikely string of independent events and assuming that it is due to break;
- The Hot-hand fallacy is the opposite of the Gambler's fallacy: thinking an unlikely string of "luck" will continue to hold;
- The False Dichotomy or assuming/establishing a binary state when there is none;
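- The Lottery fallacy of questioning a result because it is highly improbable;
- Simpson's Paradox, where a trend that appears within each of several groups weakens or reverses when the groups are combined, illustrating how easy it is to misinterpret data by jumping to conclusions driven by motivated reasoning and not by objectively assessing the evidence;
- P-hacking or reporting trends as statistically significant when they are not, typically by testing many slices of the data and keeping only the ones that clear the significance threshold.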
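To make one of these concrete, here is a minimal sketch of Simpson's Paradox in Python. The counts are made up purely for illustration (they come from no real dataset), and the names `data`, `rate` and `overall_rate` are hypothetical helpers, not part of any library. Option A has the better success rate in every segment, yet option B looks better once the segments are pooled, because the two options were tried on very different mixes of segments.

```python
# Hypothetical (successes, trials) counts per segment for two options.
# These numbers are invented for illustration only.
data = {
    "A": {"small": (90, 100), "large": (300, 1000)},
    "B": {"small": (800, 1000), "large": (20, 100)},
}


def rate(successes, trials):
    """Success rate for a single segment."""
    return successes / trials


def overall_rate(segments):
    """Success rate after pooling all segments together."""
    total_successes = sum(s for s, _ in segments.values())
    total_trials = sum(t for _, t in segments.values())
    return total_successes / total_trials


# Compare the options within each segment.
for segment in ("small", "large"):
    a = rate(*data["A"][segment])
    b = rate(*data["B"][segment])
    print(f"{segment:>7}: A = {a:.1%}, B = {b:.1%}")

# Compare the options on the pooled data.
print(f"overall: A = {overall_rate(data['A']):.1%}, "
      f"B = {overall_rate(data['B']):.1%}")
```

Reading only the pooled line would suggest B is the better option, while the per-segment lines say the opposite. Which reading is appropriate depends on how the segments came about, which is exactly the kind of question worth asking before drawing a conclusion.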
Subscribe to our blog to be notified of my future posts covering the above in more detail.