Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.
Principa employs a variety of best-practice credit scorecard building techniques including mathematical programming, regression modelling, optimal segmentation-seek genetic algorithms and reject inference parceling, amongst others. Through our credit risk scorecards, businesses can look to improve their credit risk decisioning by 5-30%.
People's habits are changing - are you adapting?
This is the second of a 2-part blog. You can read the first blog here.
One of the basic principles of credit scoring and modelling is that the “future is like the past”. Whilst robust credit models may be calibrated on multiple time periods, this assumes that trends in the past represent what is going on today. COVID-19 is a black swan event – meaning that in the modern day it really is unprecedented. If you have never come across the term black swan, or if you have but have no idea of its origin, I recommend taking two minutes to read its really interesting etymology.
As a data analytics company, we write about data analytics frequently, but less so about the cloud. However, one of our blogs on a cloud-related topic was so popular this year that we would feel terrible if you were to miss it, and so we have decided to combine our data analytics and cloud topics in this roundup.
Recently, South Africa was faced with the threat of the biggest banking strike in our country’s history, driven by the job cuts as a result of increasing automation.
Whether you’ve been involved in introducing models into your business or have had a passing interest in economic affairs, you may have come across the term “Gini-coefficient”. This blog hopes to demystify the concept and give you a good deal of information on the statistical measurement. We answer:
For businesses wishing to improve their credit decisions, the adoption of Mathematical Optimisation is an important consideration. Mathematical optimisation is more than a straight data-driven strategy design as it incorporates prescriptive analytics.
We have released a new eBook titled Truth Seeker: a guide to avoiding logical fallacies and cognitive biases in data science.
We’re looking forward to attending this year’s Evolution of Data Science in Banking conference. The 2019 conference will be held on the 5th and 6th of June at the Indaba Hotel, Fourways, Johannesburg. The event will explore the use of data and analytical techniques to help financial services providers meet regulatory and reporting requirements. Also to be discussed is how running analytics at a product level can provide a more holistic view of customers across their portfolios.
For a while, we have been running a blog series on cognitive biases and logical fallacies that data scientists should avoid. In this final blog on the subject, we look at some of the other logical fallacies and how they might crop up in data analytics.
With LinkedIn usage growing by two new members every second, you simply can’t afford to not be on the platform. Founded in 2003, LinkedIn has 590 million users with 260 million of those active every month.
The EQ Behind the IQ - Jaco Rossouw from PrincipaDecisions
For a while, we have been running a blog series on cognitive biases and logical fallacies that data scientists should avoid. In philosophy there are a host of informal logical fallacies – essentially errors in thinking – that crop up every day. In this series we have looked at the practice of data science to determine how these same fallacies also occur. Today we will be looking at fallacies and their manifestation in credit: The Monte-Carlo fallacy and the Hot-hand fallacy with some studies in the credit world.
Recently my team and I were sitting in a meeting with a potential client debating the basic functions of our originations software. To the business analysts who were leading the RFP process, the most critical feature seemed to be whether or not our solution would be able to offer web form fields that were customisable by the business user.
While LinkedIn has traditionally been thought of as the business or work focussed social platform, Facebook has been making headway into gaining market share in the space as well. With company pages and groups, Facebook is catering to every interest and aspiration that people might have – and combining that with their social interactions and news sources. Facebook aims to give users a one-stop-shop experience, and it’s very good at doing it.
For the second roundup of our most popular 2018 blogs, we cover the Data Analytics topic. The blogs that our readers love are thought-provoking and aim to inspire and teach – and we hope these articles do just that!
It isn't always easy to keep up-to-date with the latest news in data science, machine learning or artificial intelligence. Twitter is a great source of information and helps you quickly scan through headlines to engage with content that interests you. This helps eliminate a lot of noise and helps you focus on what you want to read about, but you need to follow the right people for that content to appear in your feed.
Meetup (www.meetup.com) was founded in 2002 and is an online service used to organise groups that host in-person events for people with similar interests. A social networking tool unlike any other, Meetup has more than 35 million users in 180 countries. In South Africa, there are many meetups organised for various interests. Of course, our interests span data science, machine learning and AI, so we took the time to put together a (non-comprehensive) list of the groups who organise meetups that we (and hopefully you) find of interest. We hope you make it out to one of these groups' meetups soon!
Our data scientists are keen readers and avid podcast listeners. In this blog, we list a few of the podcasts that cover topics such as data science, machine learning and artificial intelligence, and that we’d recommend if you’re looking to start exploring the world of podcasts.
In this blog, we’ve created a (non-exhaustive) list of courses you should consider if you want to learn essential data science skills in South Africa. These courses are mostly classroom training from South African institutions, but if you’re more interested in online learning, check out our blog Where To Learn Essential Data Science Skills Online.
The value and benefits of becoming a data scientist or picking up basic data science skills cannot be overstated in today’s world. Businesses across all industries are starting to embrace data analytics, and those that aren’t will soon feel the advantage gained by their competitors who are.
Learning rarely stops after your formal education ends, whether you’re pursuing further learning out of personal interest, career-aspiration or it’s mandated by your company. Most companies offer funding and support for their employees to go on courses to keep their skills up to date or learn new skills. Companies spend millions every year on enabling employees to participate in physical, often off-site training, and the costs can cover training fees, training material, travel and accommodation.
Simpson's Paradox is a phenomenon in statistics illustrating how easy it is to misinterpret data. It occurs mainly in descriptive and diagnostic analytics (see our blog on the different types of analytics) where an analyst may jump to a conclusion driven by motivated reasoning and not by objectively assessing the evidence.
As part of our blog series on cognitive biases and logical fallacies that data scientists should avoid, today we address a prevalent logical fallacy: the "correlation proves causation" fallacy. Correlation due to causation is just one of the five main categories of correlation, and this blog will look into each of the five.
During the last year, we’ve experienced the escalation of social issues around artificial intelligence (AI), with Elon Musk leading the charge. Musk continues to advocate the idea that humanity is getting closer to a Skynet-like future – to many people’s concern. One of the very real and valid concerns is the idea that many existing jobs will be automated, thanks to AI.
The Principa brand is unique and memorable as it is associated with creativity, integrity, innovation and deep expertise. Our website and blogs embody every extraordinary aspect of our brand and we are very proud of them. We bring the same remarkable elements from this digital world into our offices and into every interaction we have. Apart from the Principa team and our clients with whom we have an established relationship, not many people understand the elements that define our brand. I’d like to take this opportunity to introduce you to the wonders behind Principa.
“We must develop a comprehensive and globally shared view of how technology is affecting our lives and reshaping our economic, social, cultural, and human environments. There has never been a time of greater promise, or greater peril.” - Klaus Schwab, Founder and Executive Chairman, World Economic Forum
A year ago, I published an article about motivated reasoning and how that can damage the data analytics process. It is part of a blog series on cognitive biases and logical fallacies that data analysts should avoid. Today I’d like to extend this conversation into a topical matter: p-hacking, also known as data fishing.
With the release of the 2018 Predictions at the end of last year, Forrester forecast an uncertain fate for retailers who were laggards in digital transformation and those immune to obsessing over customer experience. One without the other will result in an equally disappointing outcome.
The fourth industrial revolution, much like the first three, has the potential to increase income levels and improve quality of life across the globe. Something to look forward to, but what exactly is it?
We chat to Principa's Chief Executive Officer, Jaco Rossouw, about the thrilling new world of data and how businesses can work wonders with data-driven insights.
Data science continues to be a hot topic in many large firms globally. 2017 saw data science subjects such as R vs. Python, deep learning, natural language, gamification, AI and machine learning being arguably the most topical.
As a company passionate about innovation we are regularly evaluating and re-evaluating knowledge – whether it’s our collective own, an employee’s or a client’s. Knowledge is a funny thing. Common sense might suggest that the more one learns about a subject the more confident one becomes. However, this is not entirely true, at least not in the beginning.
The Dunning-Kruger Effect
The Dunning-Kruger effect (DKE) is a cognitive bias that has been known for some time, but was only formalised in 1999 by two Cornell psychologists. It involves the seemingly contradictory idea that often those with little knowledge on a subject may come across as exceedingly confident about the subject. Conversely those with more knowledge may be less confident.
Believe it or not, we are halfway through 2017 and if you're feeling like you're nowhere near achieving what you set out to achieve this year, I'm sure you're not alone. But fear not! If one of your resolutions this year was to research how to apply data analytics or machine learning to your area of specialisation - be it Marketing, Customer Experience, Debt Collection or Risk Management - you still have time! And our Data Analytics Blog is a good place to start. I've looked at the stats and compiled our Top 10 list of most read blog posts for the first half of 2017. Check out our list of blog posts below and see what topics your colleagues and industry counterparts are researching this year:
Use versus abuse of statistics can often be characterised by the analytical approach adopted to the problem at hand. In this blog post, which is part of a series on Logical Fallacies to avoid in Data Analysis, I’ll be focusing on defining the motivated reasoning logical fallacy and how to avoid it in data analysis.
The rise of Big Data, data science and predictive analytics to help solve real world problems is just an extension of science marching on. Science is humanity’s tool for better understanding the world. The tools that we use to build models, test hypotheses, look for trends to build value with our brand all derive directly from scientific principles. With these principles comes a myriad of obstacles. The obstacles are known to philosophers as “logical fallacies”, which I outlined in my previous post "The 7 Logical Fallacies to avoid in Data Analysis." In this blog post, we focus on the Texas Sharpshooter Fallacy and how to avoid it in your data analysis.
“Lies, damned lies and statistics” is the frequently quoted adage attributed to former British Prime Minister Benjamin Disraeli. The manipulation of data to fit a narrative is a very common occurrence, from politics and economics to business and beyond.
We've covered a few fundamentals and pitfalls of data analytics in our past blog posts. In this blog post, we focus on the four types of data analytics we encounter in data science: Descriptive, Diagnostic, Predictive and Prescriptive. Note: This blog post was published on the KDNuggets blog - Data Analytics and Machine Learning blog - in July 2017 and received the most reads and shares by their readers that month.
If I were to sum up our purpose at Principa, it would be “to help clients make informed decisions using data, analytics and software”. As information grows, so the opportunity to make better decisions increases. Data helps you understand your customer better. That’s our mantra. That’s our ethos. That’s why we are.
We take pride in our ability to predict - from the results of the 2015 Rugby World Cup and the 2016 Oscars to predicting profitable customers and customer churn. However, there is no denying that 2016 was a year full of shocking, unexpected events - from Brexit and the US election results to the acrimonious break-up of "Brangelina" (shocking!) and the sad loss of some very talented artists.
Predictive Analytics can yield amazing results. The lift that can be achieved by basing future decisions on observed patterns in historical events can far outweigh anything that can be achieved by relying on gut-feel or being guided by anecdotal events. There are numerous examples that demonstrate the possible lift that can be achieved across all possible industries, but a test we did recently in the retail sector showed that applying stable predictive models gave us a five-fold increase in the take-up of the product when compared against a random sample. Let’s face it, there would not be so much focus on Predictive Analytics and in particular Machine Learning if it were not yielding impressive results.
Hands up who has not heard of R? If you are in the data analytics space and have an internet connection then you would have heard of the open source programming language for predictive analytics and statistical computing that has taken the analytics world by storm.
Thanks to its broad applicability, data analytics has rapidly become a critical business function for modern organisations. But with expertise in the field in short supply and high demand, companies with an identified need for data analytics are looking beyond their traditional borders to monetise their information assets. Forrester Research predicts that a third of businesses will “pursue data science through outsourcing and technology” as organisations become less process-driven and look to their data to find new opportunities for innovation. And with globalisation and technological advancements making outsourcing a realistic and practical option for businesses, this trend is set to gain momentum. With this in mind, let's take a look at why an organisation would even consider outsourcing their analytics capabilities in the first place.
Capacity management describes a company's ability to meet present and future demands for its products and services. This involves a wide set of roles, responsibilities, processes and functions that all depend on their successful execution and interplay between one another. Although the working parts are many, the end goal behind capacity management is a shared one: to beat the competition in delivering the best products and services to the customer.
Telematics is a relatively new field, combining telecommunications and vehicular technologies with the insights of information processing. While the term was coined in 1978 in a French government report on the rapid development of computer technology, it now nearly exclusively refers to the technology that tracks vehicles in real time using GPS. Through this technology, telematics companies have access to a large amount of data that, with the help of data analytics, can be extremely useful.
It’s been almost 15 years since we saw the future of crime prevention in “Minority Report” – but today, we are beginning to see those then fictitious yet fantastical methods of predicting and preventing crime being implemented in various parts of the world. I’ll briefly mention three examples below of how analytics is already being used to prevent crime today before going into more detail on a fourth example: using analytics to prevent a criminal from re-offending.
Corporate silos may have been necessary in a burgeoning industrial era some decades ago, but times have changed. The era of social, mobile, analytics and cloud (SMAC) technologies has been fuelling a new wave of business transformation. Virtually every industry is being affected by the SMAC phenomenon and we’re currently only scratching the surface of what is possible.
I fondly remember, as a child, watching “The Roadrunner Show” cartoon series, in which the coyote (Wile E. Coyote) was devising elaborate schemes to try and catch the roadrunner. Although his schemes appeared to be clever and creative (to a six-year-old at least), he always failed. So, in one of the episodes he ordered a giant mainframe type super computer (this was the 1950s after all) to apply “data science” to devise a more effective scheme. This time his prey was Bugs Bunny.
We take a look at the Top Ten blog posts which received the greatest number of views in 2015.
In my previous post, we looked at the first “D” of the 3D approach of identifying and extracting value out of your transaction data: Determination. If you recall, I proposed a 3-step approach (the 3D approach) to realising value from a variety of large data sets:
1. Determination – scour data sources to establish if and where there might be value
2. Development – create models that will be developed for the decision areas where value was identified with data that is predictive of this predetermined outcome
3. Deployment – implement and run the developed market
With the banking and credit industries still somewhat coping with paper and process-heavy cultures of yesteryear, innovation is fast becoming a key differentiating factor in attracting a new generation of customers. Thankfully, we live in an age where more can be done with relatively little effort thanks to the automation capabilities and extended reach technology provides us. This is why the mobile device is fast becoming the focal point of how people work, communicate, find entertainment, get informed or even transact.
In the first part of our series on "Finding Value in Transaction Data" we explored a problem that is encountered by many organisations – how to identify and extract value out of the ever growing amount of data.
Data analytics is certainly making its impact felt in our collective progress as a species, with the technology being applied across a wide range of human activity. A report by the United Nations entitled Humanitarianism in the Interconnected Age identifies four challenges surrounding data analytics in helping tackle global challenges such as access to education, natural disaster management and disease control and prevention. These requirements centre on the development, acquisition, analysis and sharing of the increasing numbers of new information channels to find solutions to global challenges. In this blog, we'll look at three ways data analytics is helping save lives by allowing us to understand, anticipate and manage events or situations that pose a threat to human life.
The once flat region south of Johannesburg is littered with man-made yellow hillocks. These hills are a reminder of the history of Johannesburg – a city literally built on gold. The gold has been mined here since the rush of the 1880s. Much of the gold was extracted from the crushed rock and the left-overs transported by mules and later trucks to these locations. However, the mine-dumps (as they’re locally known) are yellow-coloured as a reminder that even after the original extraction, elusive gold still exists in the waste piles.
Last month we used predictive analytics and machine learning to predict the results of the Rugby World Cup, “out-moneyballing” the bookies themselves and placing us in the top 0.32% of humans on the sports prediction site SuperBru. Now that the dust has settled a bit after that fun initiative, I thought I’d look into other ways data analytics is being used in the sports world today. There are indeed many ways, but for the sake of brevity, let’s look at four of the more interesting ways that data analytics is changing the world of sports.
It seems to be a topic of conversation everywhere you go, and now the Internet of Things (IoT) and a growing data-sharing culture are helping make the world a safer place - one missing drain cover and pothole at a time.
Kevin Spacey may not have given data analytics the nod at his acceptance speech this year at the Golden Globes, but that doesn’t mean the star doesn’t understand the depth of data’s role in the success of his hit show, House of Cards.