The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


Finding Value In Transaction Data: Part 1

December 3, 2015 at 2:16 PM

The once-flat region south of Johannesburg is littered with man-made yellow hillocks. These hills are a reminder of the history of Johannesburg, a city literally built on gold. Gold has been mined here since the rush of the 1880s.

Much of the gold was extracted from the crushed rock, and the leftovers were transported to these locations by mule and, later, by truck. The mine-dumps (as they’re locally known) are yellow, a reminder that even after the original extraction, elusive gold still exists in the waste piles.

A mine-dump in Elsburg has an estimated weight of 140 million tonnes. Within the dump are specks of gold estimated to total 1.3 million ounces (valued at over a billion dollars). It is only in the last few years that we have developed the technology to reprocess the mine-dumps and extract nearly 50% of the remaining gold.

The story of the mine-dumps provides an apt analogy for the data world of today. Whether you’re in the banking, telecommunications, retail or insurance sector, you have almost certainly witnessed the amassing of large amounts of data. Like a mine-dump, this data is speckled with valuable information. The conundrum is how to identify the valuable pieces and how to harness their value. In essence, the goal is to convert your transaction data to information, then to knowledge and finally to wisdom.

[Figure: the progression from data to information, knowledge and wisdom]

This is also the foundation of an early approach to dealing with Big Data analysis.

How to understand your Big Data

Large data sets are growing at an ever-increasing rate. Credit, loyalty and marketing managers recognise that there’s likely value in this data, but the question we hear regularly is: where do we start?

In this post, I’d like to propose a simple and achievable approach. There are more advanced approaches that I’ll explore in future posts.

The process can be broken up into manageable parts incorporating the 3Ds: Determination, Development and Deployment.

1. The Determination phase 

  1. Incorporating the data
  2. Aggregating the data
  3. Identifying the target base areas of value
  4. Scouring the data for value
  5. Reviewing and planning mini-projects 
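To make the aggregation step a little more concrete, here is a minimal sketch in Python of rolling raw transactions up to one summary row per customer. The record layout, the customer IDs and the feature names (`txn_count`, `total_spend`, `avg_spend`, `days_since_last`) are all illustrative assumptions rather than a prescribed schema; real transaction data will need its own set of aggregates.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer_id, transaction_date, amount).
transactions = [
    ("C001", date(2015, 9, 14), 250.00),
    ("C001", date(2015, 11, 2), 120.50),
    ("C002", date(2015, 6, 30), 980.00),
    ("C001", date(2015, 11, 20), 75.25),
    ("C002", date(2015, 10, 5), 40.00),
]

def aggregate_by_customer(rows, as_of):
    """Roll raw transactions up to one summary record per customer."""
    grouped = defaultdict(list)
    for customer_id, txn_date, amount in rows:
        grouped[customer_id].append((txn_date, amount))

    summary = {}
    for customer_id, txns in grouped.items():
        amounts = [amount for _, amount in txns]
        last_date = max(txn_date for txn_date, _ in txns)
        summary[customer_id] = {
            "txn_count": len(txns),                      # how often they transact
            "total_spend": sum(amounts),                 # how much in total
            "avg_spend": sum(amounts) / len(amounts),    # typical transaction size
            "days_since_last": (as_of - last_date).days, # recency
        }
    return summary

customer_summary = aggregate_by_customer(transactions, as_of=date(2015, 12, 3))
for customer_id, features in sorted(customer_summary.items()):
    print(customer_id, features)
```

Summaries like these (how recently, how often, how much) are usually far easier to scour for value than the raw transaction lines themselves.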

2. The Development phase 

Developing models (e.g. scorecards and decision trees) using valuable data for pre-determined outcomes 
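As a rough illustration of what a decision tree does with aggregated data, the sketch below finds the single best split on one feature by minimising weighted Gini impurity. This is only the first step of growing a tree, not a full modelling approach, and the data (days since last purchase, and a "lapsed" outcome flag) is entirely made up for the example.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Find the threshold on one feature that minimises weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: days since last purchase, and whether the
# customer subsequently lapsed (1) or stayed active (0).
days_since_last = [5, 12, 20, 45, 60, 90, 120]
lapsed = [0, 0, 0, 1, 0, 1, 1]

threshold, impurity = best_split(days_since_last, lapsed)
print(f"Best split: days_since_last <= {threshold} (weighted Gini {impurity:.4f})")
```

In practice you would use an established modelling library and a proper development and validation sample; the point here is only that the model is searching the aggregated data for the cut-offs that best separate the outcome you care about.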

3. The Deployment phase 

  1. Setting up a process
  2. Deployment of aggregation
  3. Deployment of models
  4. Deployment of treatments
  5. Continual measurement and adjustment

This approach has helped organisations worldwide to find and harness the glittering nuggets of value in their vast and ever-expanding dumps of data.

In the next few posts, we will break down the 3Ds into their component processes. Part 2 of this series looks at the first of the 3Ds: Determination.


Image credits: Gallo; Principa

Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 17 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.
