The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


How To Avoid The Texas Sharpshooter Fallacy In Data Analysis

February 23, 2017 at 11:51 AM

[Image: "The Texas Sharpshooter Fallacy" written on a red barn wall next to a target with bullet holes]

The rise of Big Data, data science and predictive analytics as ways of solving real-world problems is an extension of science marching on. Science is humanity’s tool for better understanding the world. The techniques we use to build models, test hypotheses and look for trends that add value to our brands all derive directly from scientific principles.

With these principles comes a myriad of obstacles. Philosophers know these obstacles as “logical fallacies”, which I outlined in my previous post "The 7 Logical Fallacies to avoid in Data Analysis." In this blog post, we focus on the Texas Sharpshooter Fallacy and how to avoid it in your data analysis.

What is the Texas Sharpshooter Fallacy?

This is a common human mistake. In essence, it means looking at a large amount of data, picking out small patterns after the fact and drawing a conclusion from those patterns alone, while ignoring the rest of the data that does not fit.

The name derives from a story of a Texan marksman who fires a large number of bullets at a barn door. He then finds the tightest cluster of bullet holes, draws a target around it and claims to be a sharpshooter.
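The marksman's trick can be reproduced with nothing but random numbers. The sketch below is purely illustrative and not part of the original story: it assumes a square barn door divided into a 5 x 5 grid, 100 shots placed uniformly at random, and a hypothetical helper called `densest_cell`. Even with no aim at all, one cell will normally collect noticeably more hits than the average cell, and that is the cell around which the target gets drawn.

```python
import random
from collections import Counter

def densest_cell(n_shots=100, grid=5, seed=1):
    """Fire n_shots uniformly at random at a unit-square 'barn door' and
    return (cell, hits) for the grid cell that collected the most hits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_shots):
        x, y = rng.random(), rng.random()
        # Map each shot to a grid cell; min() is a defensive guard for the edge.
        cell = (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
        counts[cell] += 1
    return counts.most_common(1)[0]

cell, hits = densest_cell()
expected = 100 / (5 * 5)  # average hits per cell if shots were spread evenly
print(f"Densest cell {cell} has {hits} hits; the average cell has {expected:.0f}")
```

Changing the seed moves the "bullseye" to a different cell, which is a good hint that the cluster says nothing about the shooter's skill.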

How to avoid the Texas Sharpshooter Fallacy

Post-hoc hunting of anomalies and patterns is commonplace in data analytics. There is no real problem with identifying patterns in data through an observational study, but this should result in a hypothesis and not a conclusion. Hypotheses should then be tested against another set of data. To extend our metaphor, the marksman should, after drawing his target, go back and take aim to see whether he can hit the target again.

This is partly why we use hold-out samples when we build models (e.g. we build a model on a random 80% of the population, but will then test the model against the 20% hold-out).
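As a rough illustration of the 80/20 idea, the sketch below builds a made-up data set in which a "default" flag is pure noise and has nothing to do with the customer segment. All names and numbers here (`make_noise`, `rate`, 20 segments, a 10% base rate) are assumptions for the example, not anything from a real portfolio. The worst-looking segment is chosen on the 80% build sample, and its rate is then checked on the 20% hold-out.

```python
import random

def make_noise(n=5000, n_segments=20, base_rate=0.10, seed=7):
    """Hypothetical data set: each record is (segment, defaulted), and the
    default flag is pure noise that ignores the segment entirely."""
    rng = random.Random(seed)
    return [(rng.randrange(n_segments), rng.random() < base_rate) for _ in range(n)]

def rate(rows, segment):
    """Default rate within one segment, or None if the segment is empty."""
    flags = [flag for seg, flag in rows if seg == segment]
    return sum(flags) / len(flags) if flags else None

rows = make_noise()
random.Random(0).shuffle(rows)           # random 80/20 split
cut = int(0.8 * len(rows))
build, holdout = rows[:cut], rows[cut:]

# Draw the target: pick the segment that looks worst in the build sample.
segments = {seg for seg, _ in build}
worst = max(segments, key=lambda s: rate(build, s))

# Take aim again: does the same segment still look bad on unseen data?
print("build rate:  ", rate(build, worst))
print("holdout rate:", rate(holdout, worst))
```

Because the segment was selected for looking extreme, its build-sample rate is at least as high as the overall build rate by construction. On noise like this, the hold-out rate for the same segment will usually drift back towards the base rate, which is the signal that the "pattern" was only a target drawn after the fact.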

The reality is that all data will contain anomalies, and we can hunt for them. We should not, however, rest our conclusions on these anomalies. Instead, we should test our hypotheses about them on hold-out samples, out-of-time samples or new tests.
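A little arithmetic shows why anomalies are guaranteed to turn up if we look often enough. The sketch below assumes that each comparison is independent and is judged against a conventional 5% significance threshold; both are simplifying assumptions added for illustration, and `chance_of_false_find` is a hypothetical helper name.

```python
def chance_of_false_find(n_looks, alpha=0.05):
    """Probability that at least one of n_looks independent comparisons on
    pure noise crosses the significance threshold alpha."""
    return 1 - (1 - alpha) ** n_looks

for n_looks in (1, 5, 20, 100):
    print(n_looks, round(chance_of_false_find(n_looks), 3))
```

Under these assumptions, slicing the data 20 different ways gives roughly a 64% chance of finding at least one "significant" pattern in data that contains none, which is why a pattern found by hunting needs to be confirmed on data that played no part in the hunt.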

This logical flaw is well known in applied physics and epidemiology. Certain studies, known as “observational studies”, may be conducted to look for anomalies in data. These anomalies may be reported, but no conclusion is drawn because the independent variable is not controlled. A follow-up study, such as a randomised controlled trial, would then determine whether the results of the observational study can be replicated.

The Texas Sharpshooter Fallacy is just one of many statistical pitfalls to avoid in data analysis. I'll be covering each of the fallacies from my initial blog post on this topic, The 7 Logical Fallacies to avoid in Data Analysis, so make sure to subscribe to our blog to read the new posts in this series.

Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 13 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.
