The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


How To Dodge The Simpson's Paradox In Descriptive Analytics

May 30, 2018 at 8:56 AM

Simpson's paradox is a phenomenon in statistics that illustrates how easy it is to misinterpret data. It occurs mainly in descriptive and diagnostic analytics (see our blog on the different types of analytics), where an analyst may jump to a conclusion driven by motivated reasoning rather than by an objective assessment of the evidence.

This post is part of a series on how to avoid logical fallacies and cognitive biases in data science.

Today we look at a famous example of Simpson's paradox: a study from the University of California, Berkeley, where admission records appeared to show that male applicants were favoured over female applicants. Broken down by department, however, the data showed no noticeable bias towards men.

[Image: Success Rates]

Let's have a look at the numbers. The table shows that male applicants had a 47% success rate compared to 36% for female applicants. Because admissions were managed at departmental level, it would be rash to conclude from these overall figures that male applicants were being favoured over female applicants.

[Image: Success Rates]

When one assesses the statistics at department level, a different picture emerges. Not only does it appear that women enjoyed a higher success rate in four of the six departments, but the biggest differences between the genders favour women (Departments I and VI).

So what's going on here?

[Image: Success Rates]

The first thing to do is to step away from the percentages and look at the underlying numbers. Department III shows that a much higher proportion of women than men applied to a department with a relatively low success rate. Conversely, a high proportion of men applied to Department I, which had a relatively good success rate, whereas very few women applied there despite their very high success rate.

The overall conclusion was that women applied in larger proportions to departments where it was difficult to get in, and in smaller proportions to departments where it was easy to get in. It seems there was no departmental bias, just a difference in where men and women chose to apply.
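
The effect is easy to reproduce with a toy data set. The sketch below uses made-up counts for two hypothetical departments (they are not the Berkeley figures, and the names `applications`, `department_rates` and `overall_rates` are ours, chosen for illustration). It computes the success rate per department and then the pooled rate per gender.

```python
# Hypothetical (applicants, admitted) counts. These are NOT the Berkeley figures.
applications = {
    "Easy department": {"men": (800, 480), "women": (200, 140)},
    "Hard department": {"men": (200, 40), "women": (800, 200)},
}


def department_rates(data):
    """Success rate per department and gender."""
    return {
        dept: {
            gender: admitted / applicants
            for gender, (applicants, admitted) in groups.items()
        }
        for dept, groups in data.items()
    }


def overall_rates(data):
    """Success rate per gender after pooling all departments."""
    totals = {}
    for groups in data.values():
        for gender, (applicants, admitted) in groups.items():
            pooled_applicants, pooled_admitted = totals.get(gender, (0, 0))
            totals[gender] = (
                pooled_applicants + applicants,
                pooled_admitted + admitted,
            )
    return {
        gender: admitted / applicants
        for gender, (applicants, admitted) in totals.items()
    }


for dept, rates in department_rates(applications).items():
    print(dept, {gender: f"{r:.0%}" for gender, r in rates.items()})
print("Overall", {g: f"{r:.0%}" for g, r in overall_rates(applications).items()})
```

In this toy example women have the higher success rate in both departments (70% vs 60% and 25% vs 20%), yet the pooled figures show men ahead (52% vs 34%), simply because most women applied to the harder department.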

Essential tips to avoid Simpson's Paradox

  1. Try to understand the base data (the raw numbers), i.e. avoid relying solely on percentages.
  2. Do not be easily swayed into concluding what you (or your boss) want to see in the numbers (motivated reasoning). Instead, conduct the full analytical exercise, and blind or double-blind your analysis if you can.
  3. Read up on as many statistical paradoxes as you can. Awareness of the statistical pitfalls will better prepare you to avoid them in your own analysis.
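
The first tip can be turned into a routine check. The function below is a minimal sketch, assuming the data arrives as (applicants, admitted) counts per subgroup. The name `shows_reversal` and the sample counts are hypothetical. It flags the case where one group leads overall but trails in every subgroup.

```python
def overall_rate(data, group):
    """Pooled success rate for one group across all subgroups."""
    applicants = sum(sub[group][0] for sub in data.values())
    admitted = sum(sub[group][1] for sub in data.values())
    return admitted / applicants


def leader(rate_a, rate_b):
    """Return 1 if a is ahead, -1 if b is ahead, 0 if tied."""
    return (rate_a > rate_b) - (rate_a < rate_b)


def shows_reversal(data, a, b):
    """True if the overall leader is behind in every single subgroup."""
    overall = leader(overall_rate(data, a), overall_rate(data, b))
    per_subgroup = [
        leader(sub[a][1] / sub[a][0], sub[b][1] / sub[b][0])
        for sub in data.values()
    ]
    return overall != 0 and all(sign == -overall for sign in per_subgroup)


# Hypothetical (applicants, admitted) counts, for illustration only.
sample = {
    "Easy department": {"men": (800, 480), "women": (200, 140)},
    "Hard department": {"men": (200, 40), "women": (800, 200)},
}
print(shows_reversal(sample, "men", "women"))
```

This check is deliberately strict: it only reports a full reversal. A partial reversal, such as the four-out-of-six pattern described above, would need a looser rule, for example counting the subgroups in which the overall leader trails.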


Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 13 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.
