September 22, 2015 at 2:00 PM
It’s Man vs. the Machine as South African-based data analytics company, Principa, apply predictive analytics and machine learning to predict the winner and points spread of each match during the Rugby World Cup.
South African-based data analytics company, Principa, have signed up two internal teams of data scientists on sports prediction site SuperBru.com as an exercise to put theory into play in this year’s Rugby World Cup. By applying the same principles used to predict customer behaviour for the company’s financial services and retail clients, the two teams are competing against each other to develop algorithms and predictive models that can predict the outcome of the matches with the highest accuracy.
The purpose of the exercise is to see whether predictive analytics techniques used successfully in other areas can outperform the best human-made predictions within the context of sport on SuperBru.com.
The two Principa teams – Nero and Trojan – are looking at a wide range of data to predict as accurately as possible the win-lose outcome of each match and the points margin between the two sides. The teams are considering data such as recent performance, bookie odds, each team’s current world ranking, the number of tries scored in previous matches and even the value of individual players in a fantasy league.
“Our teams of data scientists have looked at data from over 6,000 matches played by 99 teams going back 20 years to identify patterns that are highly probable to repeat in future,” says Jaco Rossouw, Director of Delivery at Principa. “However, due to the many human variables in a match – such as injuries, penalties, referee calls, last-minute player decisions and even health – we may not be able to predict the exact scores, but with each match we will be able to provide an increasingly high probability of our predicted outcome.”
As the objective of machine learning is to develop computer programs that can teach themselves to grow and change when exposed to new data, the two teams’ algorithms will be adjusted with the results of each Rugby World Cup match to improve the accuracy of each prediction generated.
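The idea of adjusting a model after every result can be illustrated with a small sketch. Principa's actual algorithms are not described here, so this example substitutes a simple Elo-style rating update as a stand-in; the team names, starting ratings and the `k` step size are all hypothetical.

```python
def expected_win_probability(rating_a: float, rating_b: float) -> float:
    """Probability that team A beats team B under a logistic (Elo-style) model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(ratings: dict, team_a: str, team_b: str,
                   a_won: bool, k: float = 20.0) -> None:
    """Adjust both teams' ratings in place once a match result is known.

    k is an assumed step size: larger values react faster to new results.
    """
    p_a = expected_win_probability(ratings[team_a], ratings[team_b])
    outcome = 1.0 if a_won else 0.0
    delta = k * (outcome - p_a)  # positive when A did better than expected
    ratings[team_a] += delta
    ratings[team_b] -= delta


# Hypothetical teams starting on equal ratings.
ratings = {"Team A": 1500.0, "Team B": 1500.0}
before = expected_win_probability(ratings["Team A"], ratings["Team B"])

# Team A wins, so the next prediction for this pairing shifts in its favour.
update_ratings(ratings, "Team A", "Team B", a_won=True)
after = expected_win_probability(ratings["Team A"], ratings["Team B"])
```

Each new result nudges the ratings, so the predicted probability for the next match between the same sides reflects what has just happened on the field.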
Some of the more interesting statistics and patterns the data scientists at Principa have discovered from analysing over 6,000 matches played by 99 teams since 1995 are:
- New Zealand have won 84% of the matches they have played compared to 66% wins by Australia, 62% wins by South Africa, and 61% by England;
- New Zealand score an average of 36.9 points in a match compared to 28.9 by Australia, 28.5 by England and 28.4 by South Africa; and
- On average about 20% of all matches were upsets, where a higher-ranked team was defeated by a lower-ranked team.
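Statistics of this kind can be computed from a table of match records. The sketch below is only an illustration: the `Match` fields, the team names and the five sample results are invented for the example and are not Principa's data, and it assumes that rank 1 is the best rank and that a draw does not count as an upset.

```python
from dataclasses import dataclass


@dataclass
class Match:
    home: str
    away: str
    home_points: int
    away_points: int
    home_rank: int  # 1 is the best rank (assumption)
    away_rank: int


def scores_for(matches, team):
    """Return (points for, points against) for every match the team played."""
    results = []
    for m in matches:
        if m.home == team:
            results.append((m.home_points, m.away_points))
        elif m.away == team:
            results.append((m.away_points, m.home_points))
    return results


def win_rate(matches, team):
    """Share of the team's matches that it won."""
    results = scores_for(matches, team)
    return sum(pf > pa for pf, pa in results) / len(results) if results else 0.0


def average_points(matches, team):
    """Mean points scored per match by the team."""
    results = scores_for(matches, team)
    return sum(pf for pf, _ in results) / len(results) if results else 0.0


def is_upset(m):
    """True when the lower-ranked side (larger rank number) wins."""
    if m.home_points == m.away_points:
        return False  # draws are not counted as upsets here
    home_won = m.home_points > m.away_points
    return (m.home_rank > m.away_rank) if home_won else (m.away_rank > m.home_rank)


def upset_rate(matches):
    """Share of all matches that were upsets."""
    return sum(is_upset(m) for m in matches) / len(matches) if matches else 0.0


# Invented sample results for three hypothetical teams.
matches = [
    Match("A", "B", 30, 10, 1, 2),  # favourite wins
    Match("B", "C", 12, 20, 2, 3),  # upset
    Match("C", "A", 15, 15, 3, 1),  # draw
    Match("A", "C", 25, 18, 1, 3),  # favourite wins
    Match("B", "A", 22, 19, 2, 1),  # upset
]
```

With real historical results in place of the sample list, `win_rate`, `average_points` and `upset_rate` would yield figures comparable to those listed above.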
Principa have made a page available online with an overview of all the insights they have derived from the past 20 years of rugby at http://wonders.principadata.com/rugby-world-cup-2015-rugby-statistics-from-1995-2015 .
The company will also be posting their teams’ data-driven predictions before every match on their website as well as via their Twitter (@PrincipaD) and Facebook accounts. The latest set of predictions can be viewed here.
“It will be interesting to see how accurate our algorithms and models are in predicting the outcomes of the matches. In reality, relying on data insights without proper interpretation by subject-matter experts can lead to sub-optimal if not disastrous results. So, we encourage rugby fans to use our data-driven predictions as a starting point upon which to apply experience and their own knowledge – and a dash of ‘gut’ feeling – to make their own predictions for each match,” concludes Rossouw.