The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


The 10 Logical Data Fallacies To Avoid In Data Analysis

February 15, 2017 at 3:17 PM

“Lies, damned lies and statistics” is the frequently quoted adage attributed to former British Prime Minister Benjamin Disraeli. The manipulation of data to fit a narrative is a very common occurrence, from politics and economics to business and beyond.

In this blog post, we'll touch on the more common logical fallacies that can be encountered and should be avoided in data analysis.

Logical fallacies in data interpretation

Statistics are simply numbers – how we (choose to) interpret them is up to us mere mortals and the key mental tool of critical thinking. It’s time to think about thinking.

A core competency of a data scientist is to be able to translate effects and patterns of data into real-life context. For business managers who need to periodically analyse data, an essential skill is ensuring that an interpretation is indeed correct and not tarnished by one of the many logical fallacies or mental mistakes that we all make in everyday life.

Philosophers have for some time understood the mistakes we make in reasoning. Aristotle is the first known philosopher to have established a list of logical fallacies. These fallacies are common mistakes made in arguing and thinking. An awareness of these is extremely helpful in sharpening one’s analytical ability. Developing awareness of one's own thinking is known as metacognition and it is a key component of critical thinking.

When it comes to analysing data or assessing a conclusion from the data or models presented, it is worthwhile to be aware of the reliability of what is presented. The reliability may vary due to the

Below is a list of the more common logical data fallacies to avoid in analytics - I'll be covering some of these in more detail in future blogs:

  1. The Cherry-picking fallacy of selectively choosing your data or statistics to prove your argument, or using confirmation bias and motivated reasoning instead of deductive reasoning in your analysis;
  2. The Texas Sharpshooter fallacy of looking for patterns, but ignoring contradictions;
  3. Correlation does not imply causation, or similarities between two statistics or trends do not imply that the one caused the other;
  4. The Gambler's fallacy of looking at an unlikely string of events and implying that it will break; 
  5. The Hot-hand fallacy is the opposite of the Gambler's fallacy: thinking an unlikely string of "luck" will continue to hold;
  6. The False Dichotomy or assuming/establishing a binary state when there is none;
  7. Personal Incredulity, not trusting results you don't understand;
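  8. Simpson's Paradox, where a trend that appears within each subgroup of the data reverses or disappears when the subgroups are combined, illustrating how easy it is to misinterpret data by jumping to conclusions driven by motivated reasoning and not by objectively assessing the evidence;
  9. P-hacking, or running many tests or slicing the data repeatedly until a trend appears statistically significant when it is not;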
  10. The Cobra Effect, when an action leads to the opposite of the intended consequences.  
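
Of the fallacies above, Simpson's Paradox (number 8) is perhaps the easiest to see with actual numbers. The short Python sketch below compares two collections strategies across low-risk and high-risk accounts. The strategy names, segment names and counts are all hypothetical, invented purely for illustration, and the helper functions `segment_rates` and `overall_rate` are my own rather than part of any library.

```python
# Hypothetical counts, invented purely for illustration.
# Each tuple is (accounts cured, accounts treated) for one risk segment.
outcomes = {
    "A": {"low_risk": (81, 87), "high_risk": (192, 263)},
    "B": {"low_risk": (234, 270), "high_risk": (55, 80)},
}


def segment_rates(strategy):
    """Cure rate for each risk segment under the given strategy."""
    return {
        segment: cured / treated
        for segment, (cured, treated) in outcomes[strategy].items()
    }


def overall_rate(strategy):
    """Cure rate across all segments combined for the given strategy."""
    cured = sum(c for c, _ in outcomes[strategy].values())
    treated = sum(t for _, t in outcomes[strategy].values())
    return cured / treated


for strategy in outcomes:
    # Print the per-segment rates first, then the pooled rate.
    for segment, rate in segment_rates(strategy).items():
        print(f"Strategy {strategy}, {segment}: {rate:.0%}")
    print(f"Strategy {strategy}, overall: {overall_rate(strategy):.0%}")
```

With these made-up counts, Strategy A has the better cure rate in both segments (roughly 93% vs 87% for low risk and 73% vs 69% for high risk), yet Strategy B looks better overall (roughly 83% vs 78%). The reversal comes from the mix: Strategy A was applied mostly to the harder high-risk accounts. Anyone looking only at the overall figures would pick the wrong strategy.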

Subscribe to our blog to be notified of my future posts covering the above in more detail.


Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 17 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.
