The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.

Keeping Our Skills Fresh: Predicting The FIFA World Cup 2018 Results

May 22, 2018 at 10:24 AM

In 2015, we predicted the results of the Rugby World Cup with great success, out-predicting 99.68% of humans. In 2016, we predicted the results of the Oscars, accurately predicting DiCaprio’s first win. This year we're trying our hand at predicting the outcomes of the 2018 FIFA World Cup, and we're cautiously optimistic about our predictions.

View our Football World Cup 2018 predictions

Why “cautiously optimistic” after our previous success?

This is a whole new ballgame – literally! While a number of our data scientists are passionate football fans, we have never applied our skills to determine the outcome of a football match. We’ll also be sourcing public domain data and testing different techniques that are rarely used in our day-to-day analytics. On top of that, we have our previous successes to live up to!

Why are we using new techniques?

Our teams are very busy working wonders for our clients, and predicting the results of the World Cup is both a fun activity and a training exercise, so they don’t get as much time as they would like to spend on building their models. We put our clients first! But our data scientists had so much fun last time, and they love a challenge, which is why they are taking this one on. By treating the exercise as training, they get to hone their skills on real match statistics rather than inconsequential data, and their training time now goes into building models for sports predictions.

But, back to the question: why are we using different techniques? The team want to challenge themselves, and working with unfamiliar techniques lets them do so while exploring and honing new skills. So, instead of using techniques that we apply on a daily basis, we’ll be trying something fresh!

Check out some of the interesting stats our team has found during their preparations! 

Which techniques will be used?

We've divided into three teams, who will each be using a different technique and going head-to-head in competition. The methods that the three teams will be using are:

Bayesian Inference

This technique can be used to enhance predictions by combining what we already know (determined from historic game results) with a recent sample of data to predict the likely outcome. In this way, recent performance and player statistics are used to enhance the predictions of models that are developed on historic data alone.
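
To make the idea concrete, here is a minimal sketch of one simple way to blend historic knowledge with recent form: a Dirichlet-multinomial update over win, draw and loss. It is an illustration under our own assumptions, not the model the team built. The function name, the `prior_strength` parameter and all of the numbers are hypothetical.

```python
OUTCOMES = ("win", "draw", "loss")


def update_outcome_probs(historic_counts, recent_counts, prior_strength=10.0):
    """Blend historic and recent results with a Dirichlet-multinomial update.

    prior_strength is a hypothetical tuning knob: how many "pseudo-matches"
    the historic record is worth relative to the recent sample.
    """
    historic_total = sum(historic_counts[o] for o in OUTCOMES)
    # Prior: historic outcome frequencies scaled to prior_strength pseudo-counts.
    prior = {
        o: prior_strength * historic_counts[o] / historic_total for o in OUTCOMES
    }
    # Posterior: add the observed recent counts to the prior pseudo-counts.
    posterior = {o: prior[o] + recent_counts[o] for o in OUTCOMES}
    posterior_total = sum(posterior.values())
    # Posterior mean probability of each outcome.
    return {o: posterior[o] / posterior_total for o in OUTCOMES}


# Made-up numbers, for illustration only.
historic = {"win": 50, "draw": 30, "loss": 20}
recent = {"win": 1, "draw": 1, "loss": 3}
probs = update_outcome_probs(historic, recent)
print(probs)
```

With these made-up numbers, the historic record alone would give a 50% chance of a win, but the poor recent run pulls that estimate down to 40% and raises the chance of a loss from 20% to about 33%. A larger `prior_strength` would make the historic record count for more.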

Multinomial Logistic Regression

A multinomial logistic regression model is an extension of a binary logistic regression model that allows for more than two classes of the dependent variable. We will use a variable selection method to choose which variables are significant in predicting the dependent variable, and these will be the independent variables in our model. The model will then give us the probability of each class (here, the number of goals scored). If we repeat this for the opposing team, we can arrive at a predicted score by choosing, for each team, the class with the highest probability.
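
The prediction step described above can be sketched as follows. This is only an illustration: the coefficients, intercepts and the two input features (an attack rating and the opponent's defence rating) are made up, whereas a real model would estimate them from match data after variable selection. The goal classes are assumed to be 0, 1, 2 and "3 or more".

```python
import math

# Hypothetical fitted parameters, one row per goal class (0, 1, 2, 3 or more).
# Each row holds weights for [attack_rating, opponent_defence_rating].
COEFFICIENTS = [[-1.0, 1.0], [0.0, 0.0], [0.5, -0.5], [1.0, -1.0]]
INTERCEPTS = [0.2, 0.5, 0.1, -0.8]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def goal_class_probs(features):
    """Probability of each goal class for one team, given its features."""
    scores = [
        intercept + sum(w * x for w, x in zip(weights, features))
        for weights, intercept in zip(COEFFICIENTS, INTERCEPTS)
    ]
    return softmax(scores)


def most_likely_goals(probs):
    """Index of the class with the highest probability."""
    return max(range(len(probs)), key=lambda k: probs[k])


# Run the model once per team, then read off the most likely class for each.
team_a_probs = goal_class_probs([1.2, 0.2])  # strong attack, weak opposing defence
team_b_probs = goal_class_probs([0.3, 1.0])  # weak attack, strong opposing defence
predicted_score = (most_likely_goals(team_a_probs), most_likely_goals(team_b_probs))
print(predicted_score)
```

With these made-up inputs the most likely classes are 2 goals for the first team and 0 for the second. Note that picking each team's most likely class independently is a simplification, since it ignores any dependence between the two teams' scores.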

Poisson Regression Model

The Poisson distribution is a probability distribution that can be used to model count data, such as the number of goals scored in a football match. This gives us a method of assigning probabilities to the number of goals each team scores in a game, and from these we can find probabilities for the different match results. To find the probabilities for different numbers of goals, we use a regression model that estimates each team's expected goals from variables such as the strength of its attack and the team's ratings.
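
As a rough sketch of how goal probabilities turn into match-result probabilities, the example below uses a log-linear (Poisson regression style) formula for expected goals and then sums over a grid of possible scores. The coefficients and ratings are hypothetical, and the sketch assumes the two teams' goal counts are independent, which a real model may not.

```python
import math

# Hypothetical regression coefficients, for illustration only.
COEF = {"intercept": 0.1, "attack": 0.6, "opp_defence": -0.4}


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def expected_goals(attack, opp_defence):
    """Poisson regression uses a log link, so the linear predictor is exponentiated."""
    linear = (
        COEF["intercept"]
        + COEF["attack"] * attack
        + COEF["opp_defence"] * opp_defence
    )
    return math.exp(linear)


def match_outcome_probs(lam_a, lam_b, max_goals=10):
    """Sum score probabilities over a grid, assuming independent goal counts."""
    result = {"team_a_win": 0.0, "draw": 0.0, "team_b_win": 0.0}
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, lam_a) * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                result["team_a_win"] += p
            elif goals_a == goals_b:
                result["draw"] += p
            else:
                result["team_b_win"] += p
    return result


# Made-up ratings: team A has the stronger attack.
lam_a = expected_goals(attack=1.2, opp_defence=0.9)
lam_b = expected_goals(attack=0.8, opp_defence=1.0)
outcome = match_outcome_probs(lam_a, lam_b)
print(outcome)
```

Scores above `max_goals` are ignored, so the three probabilities add up to very slightly less than 1. For typical football scoring rates the missing tail is negligible.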

If you'd like to see the predictions of each model, view our algorithm show-down!

Will our predictions perform better than the deaf cat?

In keeping with recent FIFA World Cup tradition, Russia has appointed an animal to predict their team’s outcomes: a deaf cat named Achilles, who will be choosing between two bowls of food, each marked with one of the opposing teams. You might remember Paul the Octopus, who correctly predicted the outcome of every one of Germany’s matches at the 2010 World Cup. While we have three horses in the race with our three different techniques, we will ultimately be backing one result, and measuring our success against Russia’s deaf cat. High standards, indeed!

Will data analytics or the animal succeed? Check back here, or follow our Twitter account to find out!

Francel Mitchell
Francel Mitchell is the Head of Decision Analytics at Principa. Francel’s team has a winning track record using descriptive, predictive and prescriptive analytical techniques within the financial services, marketing and loyalty sectors. Utilising available data and through the application of advanced analytical techniques, the team takes pride in their ability to predict human behaviour that can be used to assist business in making profitable decisions.
