The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.

Can Data Science Predict The Oscar Winners?

February 17, 2016 at 7:37 AM

Close-up of Oscar trophies

After successfully using Machine Learning to predict last year’s Rugby World Cup results, we're trying our hand at predicting the future again, and revealing some interesting insights along the way, for another major event: the Academy Awards, or the Oscars.

During the 2015 Rugby World Cup, we correctly predicted the outcome of 91% of the matches played using Machine Learning and Predictive Analytics, outperforming 99.32% of users on the sports prediction site Superbru.com. Our team of data scientists are now applying the same principles to predicting this year's Academy Award winners for Best Picture, Best Director, Best Actor and Best Actress.

According to the team, our algorithms are predicting the following winners in the four main categories:

  1. Best Picture: The Revenant - winning by a very slim margin over Spotlight
  2. Best Actor: Leonardo DiCaprio - winning by an 8-fold margin over Matt Damon
  3. Best Actress: Brie Larson - winning by a 6.7-fold margin over Saoirse Ronan
  4. Best Director: Alejandro González Iñárritu - winning by a very slim margin over George Miller

The Predictive Power of Bookies and Awards 

The team have collected and analysed data going back to 1935 to find patterns that could help identify strong predictors of the winners. The best predictors identified so far tend to be wins at other awards, critics’ ratings and bookie odds. Other predictors include genre and box office revenue before and after the Oscar nominations.

“Thanks to the predictive power we’ve observed in the bookie odds and previous wins, we can make our predictions with a high level of confidence. Our predictions are similar to what experts are saying, with the Best Actor and Best Actress awards almost a dead certainty and Best Director and Best Picture able to go either way because of the slim margins. However, the bookie odds change every day, and because they hold high predictive power in our models, our predictions may change as we get closer to the night of the Oscars.”
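To make the role of bookie odds a little more concrete, here is a minimal sketch of one common way to turn decimal odds into probabilities that can be fed to a model. It is an illustration only, not our production method: the function name `implied_probabilities` and the nominee names and odds are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal bookmaker odds into probabilities that sum to 1.

    Assumption: odds are quoted in decimal format (e.g. 1.25), so the raw
    implied probability of each outcome is simply 1 / odds.
    """
    raw = {name: 1.0 / odds for name, odds in decimal_odds.items()}
    # Raw values usually add up to more than 1 because of the bookmaker's
    # margin, so rescale them to form a proper probability distribution.
    total = sum(raw.values())
    return {name: p / total for name, p in raw.items()}


# Hypothetical odds for illustration only, not real market prices.
example_odds = {"Nominee A": 1.25, "Nominee B": 5.0, "Nominee C": 10.0}
probs = implied_probabilities(example_odds)

for name, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{name}: {p:.1%}")
```

Because odds move daily, a feature like this would need to be recalculated each time the predictions are refreshed.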

Learning from the Rugby World Cup

As with the Rugby World Cup initiative, we're using this as a test of how well the data analytics principles we use to determine the likelihood of an individual paying back a loan or responding to a marketing campaign apply to another area: predicting the voting behaviour of 7,000 Academy Award members.

We've applied lessons learned from our Rugby World Cup initiative to improve the quality and accuracy of our Oscar predictions. That initiative taught us that when we build separate models off different data sources and then combine them, our predictions tend to be more accurate. As a result, for our Oscars initiative we’ve built one model off 80 years of data and another off more recent data, and combined the two instead of relying on a single model.
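The idea of combining a long-history model with a recent-data model can be sketched as a simple weighted average of the win probabilities each model produces. This is a minimal illustration under stated assumptions, not the actual combination method: the function names, the equal weighting and all the numbers below are hypothetical, and `fold_margin` is only one possible reading of a "fold margin" (the favourite's probability divided by the runner-up's).

```python
def combine_predictions(long_history, recent, weight_recent=0.5):
    """Blend two models' win probabilities with a weighted average.

    Assumption: each argument maps nominee name -> win probability, and
    weight_recent (between 0 and 1) is the share given to the recent model.
    """
    nominees = long_history.keys() & recent.keys()
    blended = {
        n: (1 - weight_recent) * long_history[n] + weight_recent * recent[n]
        for n in nominees
    }
    # Rescale so the blended probabilities sum to 1.
    total = sum(blended.values())
    return {n: p / total for n, p in blended.items()}


def fold_margin(probabilities):
    """Ratio of the favourite's probability to the runner-up's."""
    ranked = sorted(probabilities.values(), reverse=True)
    return ranked[0] / ranked[1]


# Hypothetical model outputs for illustration only.
model_80_years = {"Nominee A": 0.6, "Nominee B": 0.3, "Nominee C": 0.1}
model_recent = {"Nominee A": 0.8, "Nominee B": 0.1, "Nominee C": 0.1}

combined = combine_predictions(model_80_years, model_recent)
winner = max(combined, key=combined.get)
print(winner, round(fold_margin(combined), 2))
```

An equal weighting is the simplest choice; in practice the weight could be tuned against past ceremonies to see which blend would have predicted the most winners.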

Truth Is Stranger Than Fiction

One of the more interesting trends identified in the analysis of the data going back to 1935 is a growing affinity for films based on true stories. Over half of the films nominated this year for an Oscar in any of the four main categories are based on a true story. This is the highest percentage of nominees in this genre to date, and the percentage has been increasing steadily through the decades.

View the Infographic of Oscar Nominees and Winners based on True Stories

Regardless of how accurate our current predictions turn out to be on Oscar night, we have identified some very interesting patterns and trends, which reveal how our tastes are changing and which variables or characteristics tend to make an Oscar winner. The data tells a story and, as data scientists, we serve as storytellers.

We'll be sharing some of those trends, patterns and insights in the form of infographics via social media leading up to the night of the 88th Academy Awards. Follow us on Twitter @PrincipaD for more infographics.

Image credit: Christopher Polk/Getty Images

Julian Diaz
Julian Diaz was Head of Marketing for Principa until 2017, after which he became Head of Marketing for Honeybee CRM. American born and raised, Julian has worked in the IT industry for over 20 years. Having begun his career at a major software company in Germany, Julian made the move to South Africa in 1998 when he joined Dimension Data and later MWEB (leading South African ISP). Since then, Julian has helped launch various South African technology brands into international markets, including Principa.
