The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


Finding Value In Transaction Data: Part 2

December 9, 2015 at 9:50 AM

In the first part of our series on "Finding Value in Transaction Data" we explored a problem encountered by many organisations: how to identify and extract value from an ever-growing amount of data.

I proposed a three-step approach (the 3D approach) to realising value from a variety of large data sets:

  1. Determination – scour data sources to establish if and where there might be value
  2. Development – build models for the decision areas where value was identified, using data that is predictive of the predetermined outcome
  3. Deployment – implement and run the developed models

In this post, we will explore the first step of the 3D approach, “Determination”.

How to eat an elephant...

A frequent question that we hear clients asking is, “I have loads of data; I know there must be value in the data, but I don’t know where to start!”

A common mistake is for managers to assume that the first steps in tackling this problem should result in a tangible outcome or product.

The problem is actually a lot bigger than this. It is more prudent to start with an exploratory exercise that determines what value there is, for what purpose and in what area. This is the “determination phase”.

The determination phase involves the identification of valuable data within the data set(s).

The determination phase can be divided into the following steps:


1. Incorporating the data

Today, credit granters, customer managers and marketers have access to a plethora of data sources, both internal and external. The first step in the “Determination” phase is deciding which data sources to explore, bearing in mind that the data must still be available in a usable state once a solution is deployed. Types of data that might be available include:

  • credit bureau,
  • customer demographic,
  • internal behavioural,
  • transactional (e.g. retail purchases, mobile telemetry),
  • geo-data,
  • store data,
  • social-media data, to name a few.

The chosen data should be sourced and then linked to the other data sources, typically by keys such as customer ID, customer number or store code. Linking the data across sources is critical in determining where the data might add value.
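As a rough illustration of the linking step, here is a minimal Python sketch. The field names (`customer_id`, `store_code`, `region`) and the sample records are assumptions made up for this example, not a prescribed layout. The point it tries to show is that records which fail to link are worth keeping aside rather than silently dropping.

```python
# Hypothetical sample records; all field names and values are illustrative assumptions.
customers = [
    {"customer_id": "C001", "age": 34, "store_code": "S10"},
    {"customer_id": "C002", "age": 51, "store_code": "S20"},
]
stores = {
    "S10": {"region": "Gauteng"},
    "S20": {"region": "Western Cape"},
}
transactions = [
    {"customer_id": "C001", "amount": 250.0},
    {"customer_id": "C001", "amount": 99.5},
    {"customer_id": "C003", "amount": 40.0},  # no matching customer record
]


def link_sources(customers, stores, transactions):
    """Attach customer and store attributes to each transaction.

    Returns (linked, unmatched): transactions whose customer_id was found,
    and transactions that could not be linked and need investigation.
    """
    by_id = {c["customer_id"]: c for c in customers}
    linked, unmatched = [], []
    for txn in transactions:
        customer = by_id.get(txn["customer_id"])
        if customer is None:
            unmatched.append(txn)
            continue
        # Store attributes are looked up via the customer's store code.
        store = stores.get(customer["store_code"], {})
        linked.append({**txn, **customer, **store})
    return linked, unmatched


linked, unmatched = link_sources(customers, stores, transactions)
```

The share of unmatched records is a useful early signal of how well the sources really link together.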

Data can be structured or unstructured. Structured data typically sit in fixed fields that can easily be grouped, analysed and modelled. Unstructured data are the rest, e.g. free text such as tweets. A different sort of analytics is required to assess free text.

In this step of the process, data-cleansing is also essential. This involves the identification of valid (clean) data, the adjusting of data to make it usable and the understanding of the data universe to be analysed.
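A minimal sketch of what cleansing might look like in Python follows. The validity rules used here (a customer ID must be present, the amount must parse and be positive, the date must be in `YYYY-MM-DD` form) are assumptions for illustration only. Real rules depend on the data universe being analysed; a negative amount, for example, may be a legitimate refund.

```python
from datetime import datetime

# Hypothetical raw rows as they might arrive from a flat file.
raw_rows = [
    {"customer_id": " c001 ", "amount": "250.00", "date": "2015-11-03"},
    {"customer_id": "C002", "amount": "-15.00", "date": "2015-11-04"},  # negative amount
    {"customer_id": "", "amount": "80.00", "date": "2015-11-05"},       # missing ID
    {"customer_id": "C003", "amount": "abc", "date": "2015-11-06"},     # unparseable amount
]


def clean_row(row):
    """Return a cleaned copy of the row, or None if it fails the assumed rules."""
    customer_id = row["customer_id"].strip().upper()  # adjust the ID into a usable form
    if not customer_id:
        return None
    try:
        amount = float(row["amount"])
        txn_date = datetime.strptime(row["date"], "%Y-%m-%d").date()
    except ValueError:
        return None
    if amount <= 0:  # assumption: non-positive amounts are treated as invalid
        return None
    return {"customer_id": customer_id, "amount": amount, "date": txn_date}


cleaned = [row for row in map(clean_row, raw_rows) if row is not None]
rejected_count = len(raw_rows) - len(cleaned)
```

Tracking how many rows are rejected, and why, is part of understanding the data universe.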

Once data is linked within a relational database or in a single file format and it has been cleaned, aggregation can take place.

2. Aggregation

Aggregation is essential for analysing transactional or behavioural type data where trends need to be measured. Raw data typically lists single events which may have a degree of value. However, there is more value in single events when they are included in a group of events and measured in relationship to other events or over a period of time.

Aggregation is what credit bureaux have been doing with credit data for years; similar work should be done with other types of data, such as transactional data.

Examples of aggregated fields across various industries:

  1. Number of SMSs sent in the last month
  2. Cleaning products purchased as a percentage of all purchases this month
  3. Highest till-slip value in the last six months
  4. Average monthly spend in the last three months
  5. Minimum value of products viewed online
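To make a few of these fields concrete, here is a small Python sketch that derives three of them from raw purchase events. The data, the observation date and the field names are hypothetical. It also assumes that a "month" is approximated as 30 days, that each row is one till slip, and that the cleaning-products percentage is measured by value rather than by number of purchases.

```python
from datetime import date, timedelta

observation_date = date(2015, 11, 30)  # hypothetical "as at" date

# Hypothetical till slips: (purchase date, product category, amount)
purchases = [
    (date(2015, 11, 5), "cleaning", 120.0),
    (date(2015, 11, 20), "food", 480.0),
    (date(2015, 10, 14), "food", 300.0),
    (date(2015, 9, 2), "clothing", 900.0),
    (date(2015, 3, 1), "food", 2000.0),  # older than six months, so excluded below
]


def in_window(purchase_date, days):
    """True if the purchase falls within the `days` up to the observation date."""
    return observation_date - timedelta(days=days) < purchase_date <= observation_date


last_1m = [p for p in purchases if in_window(p[0], 30)]
last_3m = [p for p in purchases if in_window(p[0], 90)]
last_6m = [p for p in purchases if in_window(p[0], 180)]

spend_1m = sum(amount for _, _, amount in last_1m)
cleaning_1m = sum(amount for _, cat, amount in last_1m if cat == "cleaning")

aggregates = {
    # Cleaning products purchased as a percentage of all purchases this month (by value)
    "pct_cleaning_spend_1m": 100.0 * cleaning_1m / spend_1m if spend_1m else 0.0,
    # Highest till-slip value in the last six months
    "max_till_slip_6m": max((amount for _, _, amount in last_6m), default=0.0),
    # Average monthly spend in the last three months
    "avg_monthly_spend_3m": sum(amount for _, _, amount in last_3m) / 3,
}
```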

Within the different transactional sets (such as credit card, fashion purchases, mobile data and e-commerce data) a variety of aggregations is possible. The key is to follow a methodical approach through event classification. For example, in transactional fashion retail:

  1. Categories (high/med/low fashion, men/women/children, clothing/apparel/other, premium/average/sale pricing, till-slip value, number of shopping events, store name)
  2. Time periods (1d, 7d, 1m, 3m, 6m, 12m)
  3. Metrics (number, average, maximum, minimum, worst, percentage)
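
An aggregated field will often combine two or three of the above dimensions. "Highest till-slip value in the last six months", for example, combines a category, a time period and a metric.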

Aggregation can be coded in a language or tool such as R, SAS or MSSQL. Ultimately, an aggregation process will also be required when a solution goes live.
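The methodical approach lends itself to generating fields automatically from every combination of category, time period and metric. The Python sketch below is one possible way of doing this, not a definitive implementation. The events, category names, window lengths (in days) and the field-naming scheme are all assumptions for illustration.

```python
from datetime import date, timedelta
from itertools import product
from statistics import mean

observation_date = date(2015, 11, 30)  # hypothetical "as at" date

# Hypothetical fashion-retail events.
events = [
    {"date": date(2015, 11, 28), "category": "women", "amount": 350.0},
    {"date": date(2015, 11, 10), "category": "men", "amount": 200.0},
    {"date": date(2015, 9, 15), "category": "women", "amount": 650.0},
    {"date": date(2015, 2, 1), "category": "children", "amount": 150.0},
]

categories = ["men", "women", "children"]
windows = {"7d": 7, "1m": 30, "3m": 90, "12m": 365}  # window lengths in days (approximate)
metrics = {
    "number": len,
    "average": lambda xs: mean(xs) if xs else 0.0,
    "maximum": lambda xs: max(xs, default=0.0),
}


def build_features(events):
    """Compute one aggregated field per (category, time period, metric) combination."""
    features = {}
    for category, (w_name, days), (m_name, fn) in product(
        categories, windows.items(), metrics.items()
    ):
        start = observation_date - timedelta(days=days)
        amounts = [
            e["amount"]
            for e in events
            if e["category"] == category and start < e["date"] <= observation_date
        ]
        features[f"{m_name}_{category}_{w_name}"] = fn(amounts)
    return features


features = build_features(events)
```

Three categories, four time periods and three metrics already give 36 candidate fields, which shows how quickly a methodical classification scales.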

3. Identifying the target areas of value

Another important step in Determination is to identify target areas of value. A long list of outcome target areas can be identified with very little additional analytical effort. Areas of interest include:

  1. Probability of attrition/churn (in the next 1m/2m/3m)
  2. Probability of missing a payment/rolling (in the next 1m)
  3. Probability of missing three payments (in the next 6m/12m)
  4. Probability of increasing spend (by 20%/50%, e.g.)
  5. Propensity to take up a cross-sell or up-sell offer (1m/2m)
  6. Propensity to increase wallet-share (i.e. spend as percentage of spend at competitors)
  7. Propensity to make an (insurance) claim (1m/2m/3m)
  8. Propensity to migrate to a high value segment/cluster (1m/2m/3m)

Once these areas of value have been identified and the time period set, you’ll be ready to aggregate your data.
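As a sketch of how one such target might be turned into a flag, the Python below marks a customer as churned if they were active at the observation date but made no purchase in the outcome window that follows. This churn definition, the 90-day window standing in for 3m, and the sample data are assumptions for illustration. The observation date has to be far enough in the past for the whole outcome window to have elapsed.

```python
from datetime import date, timedelta

observation_date = date(2015, 6, 30)  # hypothetical observation point
outcome_days = 90                     # approximates a 3m outcome window

# Hypothetical purchase dates per customer.
purchase_dates = {
    "C001": [date(2015, 5, 2), date(2015, 8, 14)],  # still active after observation
    "C002": [date(2015, 6, 20)],                    # no activity in the outcome window
    "C003": [date(2015, 10, 15)],                   # not yet a customer at observation
}


def churn_flags(purchase_dates, observation_date, outcome_days):
    """Return {customer_id: 1 if churned, 0 otherwise} under the assumed definition."""
    outcome_end = observation_date + timedelta(days=outcome_days)
    flags = {}
    for customer_id, dates in purchase_dates.items():
        # Only customers with activity on or before the observation date are in scope.
        if not any(d <= observation_date for d in dates):
            continue
        active_later = any(observation_date < d <= outcome_end for d in dates)
        flags[customer_id] = 0 if active_later else 1
    return flags


flags = churn_flags(purchase_dates, observation_date, outcome_days)
```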

Stay tuned for my next blog post, where we look at the final two steps in the Determination phase of finding value in your transaction data.

Note: Your aggregated/observational data may have to be a few months old to allow sufficient time between observation and outcome.

Image Credit: Freepik

Thomas Maydon
Thomas Maydon is the Head of Credit Solutions at Principa. With over 13 years of experience in the Southern African, West African and Middle Eastern retail credit markets, Tom has primarily been involved in consulting, analytics, credit bureau and predictive modelling services. He has experience in all aspects of the credit life cycle (in multiple industries) including intelligent prospecting, originations, strategy simulation, affordability analysis, behavioural modelling, pricing analysis, collections processes, and provisions (including Basel II) and profitability calculations.
