The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.

Keeping Our Skills Fresh: Predicting The FIFA World Cup 2018 Results

May 22, 2018 at 10:24 AM

In 2015, we predicted the results of the Rugby World Cup with great success, outpredicting 99.68% of humans. In 2016, we turned to the Oscars and accurately predicted DiCaprio’s first win. This year we'll be trying our hand at predicting the outcomes of the FIFA Football World Cup, and we're cautiously optimistic about our predictions.

View our Football World Cup 2018 predictions

Why “cautiously optimistic” after our previous success?

This is a whole new ballgame – literally! While a number of our data scientists are passionate football fans, we have never applied our skills to determine the outcome of a football match. We’ll also be sourcing public domain data and testing different techniques that are rarely used in our day-to-day analytics. On top of that, we have our previous successes to live up to!

Why are we using new techniques?

Our teams are very busy working wonders for our clients, and predicting the results of the World Cup is a fun activity as well as a training exercise, so they don’t get as much time as they would like to spend on building their models. We put our clients first! But our data scientists had so much fun the previous time, and they love a challenge, which is why they are taking this one on. By treating this exercise as training, they get to hone their skills on real match statistics rather than inconsequential data, and their training time is now spent building models for sports predictions.

But, back to the question – why are we using different techniques? The team want to challenge themselves, and using unfamiliar techniques lets them do so while exploring and honing new skills. So, instead of using techniques that we apply on a daily basis, we’ll be trying something fresh!

Check out some of the interesting stats our team has found during their preparations! 

Which techniques will be used?

We've divided into three teams, each of which will use a different technique and go head-to-head in competition. The methods that the three teams will be using are:

Bayesian Inference

This technique enhances predictions by combining what we already know (determined by looking at historic game results) with a recent sample of data to predict the likely outcome. In this way, recent performance and player statistics are used to improve on the predictions of models that are developed on historic data alone.
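To make the idea concrete, here is a minimal sketch of one common form of Bayesian updating, a Beta-Binomial model of a team's win probability. It is an illustration only and not necessarily the model our team built: the function name and all the numbers are hypothetical, and treating each match as an independent win or non-win is a simplifying assumption.

```python
def posterior_win_probability(historic_wins, historic_games,
                              recent_wins, recent_games):
    """Posterior mean win probability under a Beta-Binomial model.

    Assumption: every match is an independent win / non-win trial.
    """
    # Prior belief: a uniform Beta(1, 1) combined with the historic record.
    prior_alpha = 1 + historic_wins
    prior_beta = 1 + (historic_games - historic_wins)

    # Bayesian update: add the recent wins and non-wins to the prior counts.
    post_alpha = prior_alpha + recent_wins
    post_beta = prior_beta + (recent_games - recent_wins)

    # The mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta).
    return post_alpha / (post_alpha + post_beta)


# Hypothetical team: 10 wins in 20 historic games, 4 wins in the last 5.
estimate = posterior_win_probability(10, 20, 4, 5)
print(f"Updated win probability: {estimate:.3f}")
```

The updated estimate lands between the historic win rate and the recent one, and the more recent games there are, the further it moves towards recent form.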

Multinomial Logistic Regression

A multinomial logistic regression model is an extension of a binary logistic regression model that allows for more than two classes of the dependent variable. We will use a variable selection method to choose which variables are significant in predicting the dependent variable; these will be the independent variables in our model. The model will then give us the probability of each class (here, the number of goals scored). If we repeat this for the opposing team, we can arrive at the predicted score by choosing, for each team, the class with the highest probability.
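The prediction step described above can be sketched as follows. This shows only how a fitted model turns inputs into class probabilities. The two features, the four goal classes and every coefficient are hypothetical values chosen for illustration; in practice the coefficients would be estimated from match data.

```python
import math

# Hypothetical goal classes: 0, 1, 2, and "3 or more" goals.
GOAL_CLASSES = ["0", "1", "2", "3+"]

# Hypothetical fitted parameters, one row per class. Class "0" is the
# baseline, so its intercept and coefficients are fixed at zero.
# Feature order: [attack_rating, opponent_defence_rating].
INTERCEPTS = [0.0, 0.3, -0.2, -1.0]
COEFFICIENTS = [[0.0, 0.0], [0.5, -0.3], [1.0, -0.6], [1.5, -0.9]]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the maximum avoids overflow
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def goal_probabilities(features):
    """Probability of each goal class for one team in one match."""
    scores = [
        intercept + sum(w * x for w, x in zip(weights, features))
        for intercept, weights in zip(INTERCEPTS, COEFFICIENTS)
    ]
    return softmax(scores)


def most_likely_goals(features):
    """Label of the class with the highest predicted probability."""
    probs = goal_probabilities(features)
    best = max(range(len(probs)), key=probs.__getitem__)
    return GOAL_CLASSES[best]


# Run the model once per team, then combine the two results into a score.
team_a = most_likely_goals([1.5, 0.2])  # strong attack, weak opposing defence
team_b = most_likely_goals([0.2, 1.5])  # weak attack, strong opposing defence
print(f"Predicted score: {team_a} - {team_b}")
```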

Poisson Regression Model

The Poisson distribution is a probability distribution that can be used to model count data, like the number of goals scored in a football match. This gives us a method of assigning probabilities to the number of goals each team scores in a game, and from these we can find probabilities for the different match results. To find the probabilities for different numbers of goals, we use regression to estimate each team's expected number of goals from certain variables, such as the strength of the attack and the ratings of the team.
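As a rough sketch of how this works, the code below turns two expected-goal rates into win, draw and loss probabilities. The regression coefficients and ratings are hypothetical, and the sketch assumes that the two teams' goal counts are independent, which is a common simplification rather than something our models are guaranteed to use.

```python
import math


def expected_goals(attack, opponent_defence,
                   intercept=0.1, b_attack=0.4, b_defence=-0.3):
    """Poisson regression uses a log link: log(rate) is linear in the inputs.

    The coefficient values here are hypothetical, not fitted to real data.
    """
    return math.exp(intercept + b_attack * attack + b_defence * opponent_defence)


def poisson_pmf(goals, rate):
    """Probability of scoring exactly `goals` goals at the given rate."""
    return math.exp(-rate) * rate ** goals / math.factorial(goals)


def match_outcome_probabilities(rate_a, rate_b, max_goals=10):
    """Return (P(A wins), P(draw), P(B wins)).

    Assumes independent goal counts and ignores scorelines in which either
    team scores more than max_goals, which are vanishingly rare.
    """
    a_win = draw = b_win = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, rate_a) * poisson_pmf(goals_b, rate_b)
            if goals_a > goals_b:
                a_win += p
            elif goals_a == goals_b:
                draw += p
            else:
                b_win += p
    return a_win, draw, b_win


# Hypothetical ratings: team A attacks well, team B defends well.
rate_a = expected_goals(attack=1.0, opponent_defence=0.0)
rate_b = expected_goals(attack=0.0, opponent_defence=1.0)
a_win, draw, b_win = match_outcome_probabilities(rate_a, rate_b)
print(f"A wins: {a_win:.2f}, draw: {draw:.2f}, B wins: {b_win:.2f}")
```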

If you'd like to see the predictions of each model, view our algorithm show-down!

Will our predictions perform better than the deaf cat?

In keeping with recent FIFA World Cup tradition, Russia has appointed an animal to predict their team’s outcomes: a deaf cat named Achilles, who will be choosing between two bowls of food, marked with the opposing teams. You might remember Paul the Octopus, who at the 2010 World Cup predicted Germany’s matches with 100% accuracy. While we have three horses in the race with our three different techniques, we will ultimately be backing one result, and measuring our success against Russia’s deaf cat. High standards, indeed!

Will data analytics or animal succeed? Check back here, or follow our Twitter account to find out!

Francel Mitchell
Francel Mitchell is the Head of Decision Analytics at Principa. Francel’s team has a winning track record using descriptive, predictive and prescriptive analytical techniques within the financial services, marketing and loyalty sectors. By applying advanced analytical techniques to available data, the team takes pride in its ability to predict human behaviour in ways that help businesses make profitable decisions.
