September 22, 2015 at 4:49 PM
It’s Man vs. Machine at Principa HQ as our data scientists apply predictive analytics and machine learning to predict the winners and spread of each match during the Rugby World Cup.
We signed up two internal teams of data scientists on the sports prediction site SuperBru.com as an exercise to put theory into play in this year’s Rugby World Cup. By applying the same principles used to predict customer behaviour for our financial services and retail clients, our two teams are vying against each other to develop algorithms and predictive models that predict the outcome of the matches with the highest accuracy.
The purpose of the exercise is to see whether predictive analytics techniques used successfully in other areas can outperform the best human-made predictions within the context of sport on SuperBru.com.
Analysing data from 6,000 rugby matches
Our two teams – Nero and Trojan – are looking at a wide range of data to predict as accurately as possible the win-lose outcome of each match and the margin between the two teams. The teams are considering data such as recent performance, bookie odds, each team’s current world ranking, the number of tries scored in previous matches and even the value of individual players from a fantasy league.
They've looked at data from over 6,000 matches played by 99 teams going back 20 years to identify patterns that are likely to repeat in future. Because of the many human variables in a match – such as injuries, penalties, referee calls, last-minute player decisions and even health – we may not be able to predict the exact scores, but with each match we expect to be able to state the probability of our predicted outcome with increasing confidence.
As the objective of machine learning is to develop computer programs that can teach themselves to grow and change when exposed to new data, the two teams will be adjusting their algorithms with the results of each Rugby World Cup match to improve the accuracy of each prediction generated.
Our guys are learning from every match and adjusting their models to account for new variables, including the teams’ coaches and their experience, as was the case in the upset match between South Africa and Japan. Hopefully, we can get closer to predicting upsets as we feed in new information such as this.
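To give a feel for what "adjusting after every match" can look like, here is a minimal sketch in Python. It is not the method either of our teams uses; it swaps in a simple Elo-style rating update as a stand-in. The team names, the starting rating of 1500, the 400-point scale and the step size `k=20` are all assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Win probability for side A implied by the two ratings (Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=20.0):
    """Return both ratings after one match.

    score_a is 1.0 if A won, 0.5 for a draw and 0.0 if A lost.
    k (an assumed value) controls how strongly one result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical teams and results, invented purely for illustration.
ratings = {"Team A": 1500.0, "Team B": 1500.0}
results = [
    ("Team A", "Team B", 1.0),  # Team A wins
    ("Team B", "Team A", 0.5),  # draw
]

for side_a, side_b, score_a in results:
    ratings[side_a], ratings[side_b] = update_ratings(
        ratings[side_a], ratings[side_b], score_a
    )

print(ratings)
print(expected_score(ratings["Team A"], ratings["Team B"]))
```

The point of the sketch is the loop: every new result nudges the numbers the next prediction is based on, and a surprising result (a low expected score that turns into a win) moves them further than an expected one.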
Some interesting insights gleaned from 20 years of rugby data
Some of the more interesting statistics and patterns they've discovered from analysing over 6,000 matches played by 99 teams since 1995 are:
- New Zealand have won 84% of the matches they have played compared to 66% wins by Australia, 62% wins by South Africa, and 61% by England;
- New Zealand score an average of 36.9 points in a match compared to 28.9 by Australia, 28.5 by England and 28.4 by South Africa; and
- On average about 20% of all matches were upsets, where a higher-ranked team was defeated by a lower-ranked team.
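For readers who want to work out figures like these from their own match data, the following is a rough Python sketch. The `Match` record, the field names and the five sample matches are invented for illustration and are not taken from our dataset. Counting draws as non-wins, and leaving them out of the upset calculation, are also assumptions.

```python
from dataclasses import dataclass


@dataclass
class Match:
    home: str
    away: str
    home_points: int
    away_points: int
    home_rank: int  # world ranking before kick-off; 1 is the best
    away_rank: int


def winner(match):
    """Name of the winning team, or None for a draw."""
    if match.home_points == match.away_points:
        return None
    return match.home if match.home_points > match.away_points else match.away


def win_rate(matches, team):
    """Share of a team's matches that it won (draws count as not won)."""
    played = [m for m in matches if team in (m.home, m.away)]
    if not played:
        return 0.0
    return sum(1 for m in played if winner(m) == team) / len(played)


def average_points(matches, team):
    """Average points the team scored per match."""
    points = [
        m.home_points if m.home == team else m.away_points
        for m in matches
        if team in (m.home, m.away)
    ]
    return sum(points) / len(points) if points else 0.0


def upset_rate(matches):
    """Share of decided matches won by the lower-ranked team."""
    decided = [
        m for m in matches
        if winner(m) is not None and m.home_rank != m.away_rank
    ]
    if not decided:
        return 0.0
    upsets = 0
    for m in decided:
        favourite = m.home if m.home_rank < m.away_rank else m.away
        if winner(m) != favourite:
            upsets += 1
    return upsets / len(decided)


# Made-up sample data with placeholder team names.
MATCHES = [
    Match("A", "B", 30, 10, 1, 5),  # favourite A wins
    Match("C", "A", 12, 25, 8, 1),  # favourite A wins
    Match("B", "C", 15, 20, 5, 8),  # upset: C beats higher-ranked B
    Match("A", "C", 18, 18, 1, 8),  # draw, ignored by upset_rate
    Match("B", "A", 22, 19, 5, 1),  # upset: B beats higher-ranked A
]

print(win_rate(MATCHES, "A"))
print(average_points(MATCHES, "A"))
print(upset_rate(MATCHES))
```

How draws and equal rankings are treated changes the percentages slightly, so it is worth stating the convention whenever numbers like these are quoted.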
Check for the latest predictions and new rugby insights!
We've made a page available here with an overview of all the insights our teams have derived from the past 20 years of rugby.
We will also be posting each team’s data-driven predictions before every match, including via Twitter (@PrincipaD) and Facebook, so follow us if you'd like to see them before kick-off! You can check out our last set of predictions here.
It will be interesting to see how accurate our algorithms and models are in predicting the outcomes of the matches. In reality, relying on data insights without proper interpretation by subject-matter experts can lead to sub-optimal, if not disastrous, results. So, we encourage all you rugby fans to use our data-driven predictions as a starting point to which you apply your own experience and knowledge – and a dash of “gut” feeling – to make your own predictions for each match!