The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.

How To Dodge The Simpson's Paradox In Descriptive Analytics

May 30, 2018 at 8:56 AM

Simpson's paradox is a phenomenon in statistics that illustrates how easy it is to misinterpret data: a trend that appears in aggregated data can disappear, or even reverse, once the data is split into groups. It occurs mainly in descriptive and diagnostic analytics (see our blog on the different types of analytics), where an analyst may jump to a conclusion driven by motivated reasoning rather than by an objective assessment of the evidence.

This post is part of a series on how to avoid logical fallacies and cognitive biases in data science.

Today we look at a famous example of Simpson's paradox: a study from the University of California, Berkeley, where admission records appeared to show that male applicants were favoured over female applicants. Broken down by department, however, the same records showed no noticeable advantage for men.

[Table: Success Rates]

Let's have a look at the numbers. The table shows that male applicants had a 47% success rate compared with 36% for female applicants. A rash conclusion would be that male applicants were being favoured over female applicants. It is rash because admissions were managed at the departmental level, so that is the level at which any bias would have to show up.

[Table: Success Rates by department]

When one assesses the statistics at department level, a different picture emerges. Not only did women enjoy a higher success rate in four of the six departments, but the biggest differences between the genders favour women (Departments I and VI).

So what's going on here?

[Table: Success Rates]

The first thing one needs to do is to step away from the percentages and look at the underlying numbers. In Department III, a much higher proportion of the women than of the men applied, and that department had a relatively low success rate. Conversely, Department I attracted a high proportion of the men, who had a relatively good success rate there, but very few women applied despite women having a very high success rate in that department.

The overall conclusion was that women applied in larger proportions to the departments that were difficult to get into and in smaller proportions to the departments that were easy to get into. There was, it seems, no departmental bias, just a difference in where men and women chose to apply.
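To make the mechanism concrete, here is a minimal Python sketch. The counts are hypothetical and invented purely for illustration; they are not the Berkeley figures. The department names and the helper functions (`rate`, `overall_rate`, `application_share`) are likewise assumptions of this example.

```python
# Hypothetical counts, invented for illustration only (NOT the Berkeley data).
# Each entry is (applicants, admitted).
applications = {
    "Easy department": {"men": (800, 480), "women": (100, 70)},
    "Hard department": {"men": (200, 40), "women": (900, 225)},
}


def rate(applicants, admitted):
    """Success rate as a fraction of applicants."""
    return admitted / applicants


def overall_rate(data, group):
    """Pool every department for one group and compute a single success rate."""
    applicants = sum(dept[group][0] for dept in data.values())
    admitted = sum(dept[group][1] for dept in data.values())
    return rate(applicants, admitted)


def application_share(data, group, department):
    """Fraction of a group's applications that went to one department."""
    total = sum(dept[group][0] for dept in data.values())
    return data[department][group][0] / total


# Department-level view: success rate and where each group applied.
for name, dept in applications.items():
    for group in ("men", "women"):
        share = application_share(applications, group, name)
        print(f"{name:16} {group:6} rate={rate(*dept[group]):.0%} "
              f"share of applications={share:.0%}")

# Aggregated view: the same data with the departments pooled.
for group in ("men", "women"):
    print(f"{'Overall':16} {group:6} rate={overall_rate(applications, group):.1%}")
```

In this made-up data women have the higher success rate in both departments (70% against 60%, and 25% against 20%), yet their pooled rate is 29.5% against 52.0% for men, because 90% of the women's applications went to the hard department while 80% of the men's went to the easy one.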

Essential tips to avoid Simpson's paradox

  1. Try to understand the base data (the numbers), i.e. avoid relying solely on percentages.
  2. Do not be easily swayed into concluding what you (or your boss) want to see in the numbers (motivated reasoning). Instead, conduct the full analytical exercise, and blind or double-blind your analysis if you can.
  3. Read up on as many statistical paradoxes as you can. Awareness of the statistical pitfalls will better prepare you to avoid them in your analysis.
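As a rough aid for the first tip, the sketch below checks raw counts for a strict reversal, where one group leads overall but trails in every subgroup. The function name `find_reversal`, the data layout and the example counts are all assumptions made for illustration, not an established method or library API.

```python
def find_reversal(counts, group_a, group_b):
    """Return True if the pooled comparison points one way and every
    subgroup comparison points the other way.

    counts maps subgroup -> {group: (total, successes)}.
    """
    def pooled(group):
        total = sum(sub[group][0] for sub in counts.values())
        successes = sum(sub[group][1] for sub in counts.values())
        return successes / total

    overall_gap = pooled(group_a) - pooled(group_b)
    subgroup_gaps = [
        sub[group_a][1] / sub[group_a][0] - sub[group_b][1] / sub[group_b][0]
        for sub in counts.values()
    ]
    if overall_gap > 0:
        return all(gap < 0 for gap in subgroup_gaps)
    if overall_gap < 0:
        return all(gap > 0 for gap in subgroup_gaps)
    return False


# Hypothetical counts, invented for illustration only: (total, successes).
example = {
    "Easy department": {"men": (800, 480), "women": (100, 70)},
    "Hard department": {"men": (200, 40), "women": (900, 225)},
}

print(find_reversal(example, "men", "women"))
```

This check is deliberately strict. In the Berkeley example women were ahead in only four of the six departments, so a real analysis would likely need a looser rule, and in any case a flag like this is only a prompt to go and look at the underlying numbers.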

Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 13 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.