The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


Making The Move From Predictive Modelling To Machine Learning

March 23, 2017 at 2:45 PM

Everyone wants to learn more about how machine learning can be used in their business. What's interesting, though, is that many companies may already be using machine learning to some extent without really realising it. The lines between predictive analytics and machine learning are actually quite blurred, and many companies will have built up some machine learning capability through predictive analytics in some area of their business. So if you use static predictive models in your business, then you are already using machine learning, albeit of the static variety.

The move from predictive modelling to machine learning can be easier than you think. However, before making that move, you need to keep two key considerations in mind to ensure that you benefit from all that machine learning has to offer, and that your predictive analytics system remains a trustworthy tool that lifts your business rather than harming it: the Consequence of Failure and the Retraining Frequency.

New to Machine Learning? Read our blog post What is Machine Learning?

1. Consequence of Failure

Within the predictive analytics space, trust is important. This is particularly relevant if you’ve developed a predictive analytics solution that makes some key decisions. If these decisions are made badly, the impact in financial terms can be significant. 

Let's consider home loans, for example. Predictive models score new loan applications and - based on various attributes of the applicant - assign each one a score that reflects the probability of default on that loan. If the score is too low, the loan application is declined.
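To make the mechanics concrete, here is a minimal sketch of that accept/decline decision. The attributes, point weights and cutoff below are purely illustrative and not drawn from any real scorecard; real probability-of-default models are far richer, as the next paragraph explains.

```python
# Toy scorecard sketch: all attribute names, weights and the cutoff are
# hypothetical, for illustration only.

def default_score(applicant: dict) -> float:
    """Higher score = lower estimated probability of default."""
    score = 600.0
    score += 2 * min(applicant["years_employed"], 10)    # stability adds points
    score -= 50 * applicant["recent_missed_payments"]    # arrears subtract points
    score += 0.05 * min(applicant["monthly_income"], 2000)
    return score

def decide(applicant: dict, cutoff: float = 620.0) -> str:
    """Decline the application when the score falls below the cutoff."""
    return "accept" if default_score(applicant) >= cutoff else "decline"

applicant = {"years_employed": 5, "recent_missed_payments": 0, "monthly_income": 1500}
print(decide(applicant))  # accept: 600 + 10 - 0 + 75 = 685, which clears 620
```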

Imagine if the scoring system starts to turn against you, i.e. the allocated scores no longer accurately reflect the propensity to default, so more 'bad' accounts are accepted and more 'good' accounts are rejected. The impact for a large lending business would be significant, running into the millions. This is why lending models that assign a probability of default are developed over months by specialist analysts, who take great care over the underlying data, the construction of the model, and the validation of the model against out-of-sample and out-of-time populations.

At the other end of the scale, efficiency models are put in place to obtain a higher yield for the same effort. The consequences of failure vary, but an example of a low consequence of failure would be an agent-to-customer matching model. Agents and customers are profiled and their interactions are observed. The model picks out successful and unsuccessful interactions, and a 'compatibility model' is developed and deployed into the dialler system. Should the model stop performing, one would revert to a random allocation of agents to customers: some efficiency would be lost, but not millions as in the case of defaults on large loans.
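The fallback described above can be sketched in a few lines. Everything here is hypothetical - the compatibility scores, the health flag, and the function names are invented to illustrate the idea of degrading gracefully to random allocation:

```python
import random

# Illustrative sketch: route each customer to the 'most compatible' agent
# while the model performs, and revert to random allocation when it does not.
# The compatibility scores below are invented.

def allocate(customers, agents, compatibility, model_healthy: bool):
    """Return a {customer: agent} mapping."""
    if not model_healthy:
        # Degraded mode: random allocation. Some efficiency is lost,
        # but nothing catastrophic happens.
        return {c: random.choice(agents) for c in customers}
    return {c: max(agents, key=lambda a: compatibility[(c, a)]) for c in customers}

agents = ["agent_a", "agent_b"]
compatibility = {("cust_1", "agent_a"): 0.9, ("cust_1", "agent_b"): 0.4,
                 ("cust_2", "agent_a"): 0.2, ("cust_2", "agent_b"): 0.7}
print(allocate(["cust_1", "cust_2"], agents, compatibility, model_healthy=True))
# {'cust_1': 'agent_a', 'cust_2': 'agent_b'}
```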

In terms of other real-world applications, a low consequence of failure would be a university student's algorithm that uses predictive analytics to navigate their robot safely through a set course - the worst that could happen is the robot falls over. At the other end of the spectrum sits the high consequence of failure of a self-driving algorithm that must keep the vehicle on the correct side of the road.

The consequence of failure assessment will have a strong bearing on the amount of TLC you apply to the development of the model, how visible it is for interrogation, and how often you allow it to retrain. This brings us to the essence of what machine learning is: the retraining frequency.

2. Retraining Frequency

If you visit the data science competitions website Kaggle, the example problem for new starters is predicting the survivors and non-survivors of the Titanic from a sampled training dataset. This training dataset contains various attributes of the passengers, like the passenger's ticket class, the boarding gate, number of siblings, age, gender, etc. You can build a model using whatever tool you prefer, test your algorithm on a sample, and see how you fared against the best of the best. Don't be disheartened if your model only achieves 76% accuracy against the winner's 100% - one can simply Google the actual passenger names and see who lived and who did not. But still, the point is that this is very much a static solution: there is no retraining of the algorithm using new data.
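A static solution of this kind can be as simple as a hand-crafted rule. The sketch below uses invented passenger records (the real competition data lives in Kaggle's train.csv and test.csv files) and a well-known Titanic baseline rule; the point is that once built, the model never changes - there is no retraining loop:

```python
# A minimal, static baseline for the Kaggle Titanic problem, in plain Python.
# The passenger records below are invented for illustration.

def predict_survival(passenger: dict) -> int:
    """Static baseline rule: predict survival for women, and for
    children travelling in first or second class."""
    if passenger["sex"] == "female":
        return 1
    if passenger["age"] < 16 and passenger["pclass"] in (1, 2):
        return 1
    return 0

# Once 'trained' (here: hand-crafted), the rule is fixed. No new data
# ever updates it - which is exactly what makes the solution static.
test_passengers = [
    {"sex": "female", "age": 29, "pclass": 1},
    {"sex": "male",   "age": 40, "pclass": 3},
]
print([predict_survival(p) for p in test_passengers])  # [1, 0]
```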

Contrast this with the training of a self-driving algorithm, which consumes real-time telemetry data and retrains the model on an ongoing basis. Once the model is retrained, the algorithm is locked down and deployed into production vehicles. One would not want this algorithm retraining itself out on the road.

Within financial services, most models are not real-time in nature, and in most cases the consequence of failure in financial terms falls in the high category. Even for efficiency models that may be considered low consequence of failure, we would not recommend that redevelopment be done in real-time. Don't get us wrong: retraining can indeed strengthen your models, especially if they operate in a dynamic, operational environment. However, one still wants the ability to build, evaluate and test the model with some degree of care, which points to a batch rather than a real-time retraining approach.
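The batch approach described above can be sketched as a scheduled retraining cycle with an evaluation gate: a candidate model is trained on the latest data window, but only promoted if it passes a check against a holdout sample. Everything here is a hypothetical illustration - the 'model' is just an observed bad rate, and the function names and tolerance are invented:

```python
# Hedged sketch of a scheduled (batch) retraining cycle, as opposed to
# real-time retraining. All names and thresholds are illustrative.

def train(records):
    """Toy 'model': the observed bad rate on the training window."""
    return sum(r["defaulted"] for r in records) / len(records)

def evaluate(model, holdout):
    """Toy check: how far the model's rate is from the holdout's rate."""
    actual = sum(r["defaulted"] for r in holdout) / len(holdout)
    return abs(model - actual)

def retraining_cycle(training_window, holdout, tolerance=0.05, current_model=None):
    """Retrain on the latest window, but keep the incumbent model unless
    the candidate passes evaluation - the careful, gated step that a
    real-time retraining loop would skip."""
    candidate = train(training_window)
    if evaluate(candidate, holdout) <= tolerance:
        return candidate      # promote the candidate to production
    return current_model      # keep the incumbent

window  = [{"defaulted": d} for d in (0, 0, 1, 0, 0, 0, 0, 1, 0, 0)]  # 20% bad
holdout = [{"defaulted": d} for d in (0, 1, 0, 0, 0, 0, 0, 0, 1, 0)]  # 20% bad
print(retraining_cycle(window, holdout))  # 0.2
```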

So if you are already building static models to make your business more efficient, you are well positioned to use all you have learnt to springboard your business into the machine learning realm. There are some smart tricks that will allow you to bring in the benefits of machine learning into your existing models in a safe and reliable way, but more on that later.


Robin Davies
Robin Davies was the Head of Product Development at Principa for many years, during which Robin's team packaged complex concepts into easy-to-use products that helped our clients lift their business in often unexpected ways. Robin is currently the Head of Machine Learning at a prestigious firm in the UK.
