The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


Where To Learn Essential Data Science Skills Online

September 6, 2018 at 8:07 AM

The value of becoming a data scientist, or of picking up basic data science skills, cannot be overstated in today’s world. Businesses across all industries are starting to embrace data analytics, and those that aren’t will soon feel the advantage gained by competitors that are.

Dubbed the “sexiest job of the 21st century” and the “hottest job of the decade”, data science is the fastest-growing field in technology at the moment, and it offers the prospect of a very well-paid career in an innovative area.

In this blog post, we’ve created a (non-exhaustive) list of online courses you should consider if you want to learn essential data science skills. We’ll also be putting together a list of the in-person classroom courses and certifications available in South Africa, so check back here soon if you’d be more interested in attending physical classes.

Please note: Principa has recently launched Wisdome, an eLearning platform for corporate training in collections, credit and general financial skills and best practices. We do not currently offer training in data analytics or machine learning, but we always welcome applications to work at Principa. View our careers page for more.

Online data science courses available

Simplilearn’s Data Science Certification Training – R Programming

Course Description: Become an expert in data analytics using the R programming language in this data science training course. You'll master data exploration, data visualisation, predictive analytics and descriptive analytics techniques with the R language. With this data science course, you'll get hands-on practice on R CloudLab by implementing various real-life, industry-based projects in the domains of healthcare, retail, insurance, finance, airlines, music industry, and unemployment.

Duration: 40 hours

Cost: $399 for the self-paced learning option; $799 for online classroom Flexi-pass training option

Prerequisites: None

Learn more: https://www.simplilearn.com/big-data-and-analytics/data-scientist-certification-sas-r-excel-training

Coursera's Data-Driven Decision Making Course

Course Description: In this course, you'll get an introduction to Data Analytics and its role in business decisions. You'll learn why data is important and how it has evolved. You'll explore "Big Data" and how to use it. You'll also get an introduction to a framework for conducting Data Analysis and what tools and techniques are used commonly. Finally, you'll have a chance to put your knowledge to work in a simulated business setting.

Duration: 4 weeks

Cost: $49 per month gives you access to this and 4 other courses as part of the Data Analysis and Presentation Skills: the PwC Approach Specialisation

Prerequisites: None.

Learn more: https://www.coursera.org/learn/decision-making

Coursera’s Data Science Specialisation

Course Description: This Specialisation covers the concepts and tools you'll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills learned by building a data product using real-world data. At completion, students have a portfolio demonstrating their mastery of the material.

Duration: This specialisation consists of 10 courses; 9 of them run for 4 weeks each and the 10th runs for 7 weeks. It's recommended to start with the first course and then the second, but the remainder can be completed in any order or in parallel. On average, the specialisation takes between 3 and 6 months to complete.

Cost: $49 per month for access to all 10 courses in the specialisation

Prerequisites: None

Learn more: https://www.coursera.org/specializations/jhu-data-science

Coursera’s Machine Learning Specialisation

Course Description: This Specialisation from leading researchers at the University of Washington introduces you to the exciting, high-demand field of Machine Learning. Through a series of practical case studies, you will gain applied experience in major areas of Machine Learning including Prediction, Classification, Clustering, and Information Retrieval. You will learn to analyse large and complex datasets, create systems that adapt and improve over time, and build intelligent applications that can make predictions from data.

Duration: There are 4 courses in this specialisation. The 1st, 2nd and 4th courses each run for 6 weeks, and the 3rd runs for 7 weeks. On average, it takes 8 months to complete the specialisation.

Cost: $49 per month for access to all 4 courses in the specialisation

Prerequisites: Basic math and Python programming skills

Learn more: https://www.coursera.org/specializations/machine-learning

Udacity’s Intro to Data Science

Course Description: The Introduction to Data Science class will survey the foundational topics in data science, namely:

  • Data Manipulation
  • Data Analysis with Statistics and Machine Learning
  • Data Communication with Information Visualization
  • Data at Scale -- Working with Big Data

The class will focus on breadth and present the topics briefly instead of focusing on a single topic in depth. This will give you the opportunity to sample and apply the basic techniques of data science.

Duration: Approximately 2 months

Cost: Free

Prerequisites: Interest in data science, background in basic statistics, Python programming skills or basic understanding of programming concepts.

Learn more: https://www.udacity.com/course/intro-to-data-science--ud359

Udacity’s Intro to Machine Learning

Course Description: This is a class that will teach you the end-to-end process of investigating data through a machine learning lens. It will teach you how to extract and identify useful features that best represent your data, a few of the essential machine learning algorithms, and how to evaluate the performance of your machine learning algorithms.

Duration: 10 weeks on average

Cost: Free

Prerequisites: Proficiency in programming in Python and knowledge of basic statistics.

Learn more: https://www.udacity.com/course/intro-to-machine-learning--ud120

edX’s Data Science Essentials Course

Course Description: In this data science course, you will learn key concepts in data acquisition, preparation, exploration, and visualisation. These are taught alongside practical, application-oriented examples, such as how to build a cloud data science solution using the Microsoft Azure Machine Learning platform, or with R and Python on the Azure stack.

Duration: 6 weeks

Cost: Free, verified certificate for $99

Prerequisites: Basic maths, basic knowledge of R or Python

Learn more: https://www.edx.org/course/data-science-essentials

edX’s The Analytics Edge Course

Course Description: In this course, you will learn how to use data and analytics to give an edge to your career and your life. We will examine real-world examples of how analytics have been used to improve a business or industry significantly. These examples include Moneyball, eHarmony, the Framingham Heart Study, Twitter, IBM Watson, and Netflix. Through these examples and many more, we will teach you the following analytics methods: linear regression, logistic regression, trees, text analytics, clustering, visualisation, and optimisation. We will be using the statistical software R to build models and work with data.

Duration: 13 weeks

Cost: Free, verified certificate for $150

Prerequisites: Basic mathematical knowledge (at a high school level). You should be familiar with concepts like mean, standard deviation, and scatterplots. Mathematical maturity and prior experience with programming will decrease the estimated effort required for the class but are not necessary to succeed.

Learn more: https://www.edx.org/course/the-analytics-edge

Udemy’s Introduction to Data Science

Course Description: This course covers the necessary tools and concepts used in the data science industry, including machine learning, statistical inference, working with data at scale and much more.

Duration: 3 hours

Cost: R200

Prerequisites: Basic understanding of R

Learn more: https://www.udemy.com/learn-data-science

FutureLearn’s Introduction to R for Data Science

Course Description: This course will use airline data to demonstrate key concepts involved in the analysis of big data. In this course, you will learn how to use the R platform to manage data. The course serves as an introduction to the R software. It lays the foundation for anyone to begin studying data science and its applications, or to prepare learners to take more advanced courses related to data science, such as machine learning and computational statistics.

Duration: 4 weeks

Cost: Free, $84 for a certificate and some other benefits

Prerequisites: None

Learn more: https://www.futurelearn.com/courses/data-science


Latest Posts

The 7 types of credit risk in SME lending

It is common knowledge in the industry that assessing the credit risk of a consumer applying for credit is far less complex than assessing that of a business. Why is this the case? Simply put, consumers are usually very similar in their requirements and risks (homogeneous), whilst businesses have far more varied risk elements (heterogeneous). In this blog we will look at the different risk elements within a business (here SME) credit application. These are:

  • Risk of proprietors
  • Risk of business
  • Reason for loan
  • Financial ratios
  • Size of loan
  • Risk of industry
  • Risk of region

Before we delve into this list, it is worth noting that all of these factors need to be deployable as assessment tools within your originations system, so it is key that you ensure your system can manage them. If you are on the lookout for a loans origination system, then look no further than Principa’s AppSmart. If you are looking for a decision engine to manage your scorecards, policy rules and terms of business, then take a look at our DecisionSmart business rules engine. AppSmart and DecisionSmart are part of Principa’s FinSmart Universe, allowing for effective credit management across the customer life-cycle.

The different risk elements within a business credit application

1) Risk of proprietors

For smaller organisations, the risk of the business is inextricably linked to the financial well-being of the proprietors. How small is small? The rule of thumb is that companies with up to two or three proprietors should have their proprietors assessed for risk too. This fits in with the SME segment. What data should be looked at? Generally, in countries with mature credit bureaux, credit data is looked at, including the score (there is normally a score cut-off) and negative information such as the existence of judgements or defaults; these are typically used within policy rules.
Businesses whose proprietors have an excessive number of “negatives” may be disqualified from the loan application. Some credit bureaux offer a score for an individual based on the performance of all the businesses with which they are associated. This can also be useful in the credit risk assessment process. Another innovation being adopted internationally is the use of psychometrics in the credit evaluation of proprietors. To find out more about adopting credit scoring, read our blog on how to adopt credit scoring.

2) Risk of business

The risk of the business should be managed through both scores and policy rules. Within a score, lenders will look at information such as the age of the company, the experience of its directors and the size of the company. Alternatively, many lenders utilise the business score offered by credit bureaux. These scores are typically not as strong as consumer scores, as the underlying data is limited and sometimes problematic. For example, large successful organisations may have judgements registered against their name which, unlike for consumers, are not necessarily a direct indication of an inability to service debt.

3) Reason for loan

The reason for a loan is used more widely in business lending than in unsecured consumer lending. Venture capital, working capital, invoice discounting and bridging finance are just some of the many types of loan/facility available, and lenders need to equip themselves with the ability to manage each of these customer types, whether within originations or collections. Prudent lenders venturing into the SME space for the first time often focus on one or two of these loan types and expand later, as the operational implications of each type of loan are complex.

4) Financial ratios

Financial ratios are core to commercial credit risk assessment. The main challenge here is to ensure that reliable financials are available from the customer.
Small businesses may not be audited, and thus their financials may be less trustworthy.

Financial ratios can be divided into four categories:

  • Profitability
  • Leverage
  • Coverage
  • Liquidity

Profitability can be further divided into margin ratios and return ratios. Lenders are frequently interested in gross profit margins; this is normally explicit on the income statement. The EBITDA margin and operating profit margin are also used, as well as return ratios such as return on assets, return on equity and risk-adjusted returns.

Leverage ratios are useful to lenders as they reflect the portion of the business that is financed by debt. Lower leverage ratios indicate stability. The leverage ratios assessed often include debt-to-asset, debt-to-equity and asset-to-equity.

Coverage ratios indicate the coverage that income or assets provide for the servicing of debt or interest expenses. The higher the coverage ratio, the better it is for the lender. Coverage ratios are worked out taking into account the loan/facility that is being applied for.

Finally, liquidity ratios indicate the ability of a company to convert its assets into cash. A variety of ratios are used here. The current ratio is simply the ratio of current assets to current liabilities. The quick ratio measures the ability of the business to pay off its current debts with readily available assets. The higher the liquidity ratios, the better.

Ratios are used both within credit scorecards and within policy rules. You can read more about these ratios here.

5) Size of loan

When assessing credit risk for a consumer, the risk of the consumer does not normally change with the loan amount or facility (subject to the consumer passing affordability criteria). With business loans, loan amounts can range quite dramatically, and the risk of the applicant is normally tied to the loan amount requested. The loan/facility amount will of course change the ratios (mentioned in the last section), which could tip the outcome positively or negatively.
The outcome of the loan application is usually directly linked to a loan amount, and any marked change to this loan amount would change the risk profile of the application.

6) Risk of industry

The risk of the industry in which the SME operates can have a strong deterministic relationship with the entity being able to service the debt. Some lenders use this, and those who do not usually identify it as a missing element in their risk assessment process. The identification of industry is always important. If you are in manufacturing, but your clients are the mines, then you are perhaps better identified as operating in mining as opposed to manufacturing. Most lenders who assess industry will periodically rule out certain industries and perhaps also incorporate industry within their scorecard. Others take a more scientific approach. In the graph below, the performance of an industry is tracked for two years and then projected over the next 6 months; this is then compared to the country’s GDP. As the industry appears to track above the projected GDP, a positive outlook is given to this applicant, and this may affect them favourably in the credit application.

[Graph: industry performance tracked over two years and projected over the next 6 months, compared with the country’s GDP]

7) Risk of region

The last area of assessment is risk of region. Of the seven, this one is used the least. Here businesses, either on book or on the bureau, are assessed against their geo-code. Each geo-code is clustered, and the projected outlook is given as positive, static or negative. As with industry, this can be used within the assessment process as a policy rule or within a scorecard.

Bringing the seven risk categories together in a risk assessment

These seven risk assessment categories are all important in the risk assessment process. How you bring it all together is critical. If you would like to discuss your SME evaluation challenges or find out more about what we offer in credit management software (like AppSmart and DecisionSmart), get in touch with us here.
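
As an illustration of the liquidity, leverage and coverage ratios described under financial ratios above, here is a minimal Python sketch using common textbook definitions. The `Financials` class, its field names, the `new_loan_interest` parameter and the sample figures are all hypothetical; they are not taken from AppSmart, DecisionSmart or any real applicant, and individual lenders may define these ratios differently.

```python
from dataclasses import dataclass


@dataclass
class Financials:
    """Hypothetical figures for one SME applicant (illustrative only)."""
    current_assets: float
    inventory: float
    current_liabilities: float
    total_assets: float
    total_debt: float
    total_equity: float
    ebit: float              # earnings before interest and tax
    interest_expense: float  # interest on existing debt


def current_ratio(f: Financials) -> float:
    # Liquidity: current assets relative to current liabilities.
    return f.current_assets / f.current_liabilities


def quick_ratio(f: Financials) -> float:
    # Liquidity: excludes inventory, which is harder to turn into cash quickly.
    return (f.current_assets - f.inventory) / f.current_liabilities


def debt_to_equity(f: Financials) -> float:
    # Leverage: how much of the business is financed by debt versus equity.
    return f.total_debt / f.total_equity


def debt_to_asset(f: Financials) -> float:
    # Leverage: share of assets financed by debt.
    return f.total_debt / f.total_assets


def interest_coverage(f: Financials, new_loan_interest: float = 0.0) -> float:
    # Coverage: earnings relative to interest, including the interest on the
    # loan being applied for (an assumption about how a lender might do it).
    return f.ebit / (f.interest_expense + new_loan_interest)


# Made-up sample applicant. Zero denominators are not handled in this sketch.
applicant = Financials(
    current_assets=500_000,
    inventory=200_000,
    current_liabilities=250_000,
    total_assets=1_200_000,
    total_debt=400_000,
    total_equity=800_000,
    ebit=150_000,
    interest_expense=30_000,
)

print("Current ratio:", current_ratio(applicant))
print("Quick ratio:", quick_ratio(applicant))
print("Debt-to-equity:", debt_to_equity(applicant))
print("Debt-to-asset:", round(debt_to_asset(applicant), 3))
print("Interest coverage (existing debt):", interest_coverage(applicant))
print("Interest coverage (with new loan):", interest_coverage(applicant, 20_000))
```

Passing the interest on the requested loan into `interest_coverage` reflects the point made under size of loan: a larger facility lowers the coverage ratio, which can change the outcome of a scorecard or policy rule.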
