The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.

10 Machine Learning Books To Read For Budding Data Scientists

September 20, 2018 at 8:06 AM

Machine learning and artificial intelligence are exciting fields, and we've been writing about these topics for a couple of years now. While a lot of what we talk about on our blog is advanced implementations of machine learning and can be overwhelming to beginners, the core concepts of machine learning are actually pretty easy to grasp. There are many resources and cheat sheets available online, but we believe the old-fashioned way of learning is sometimes the best: with a good book. Few online resources can match the depth and comprehensive detail a book provides.

In this blog, we list some of the most popular books for machine learning beginners or for anyone curious about machine learning.

But while these books will give you a good overview of the topics and theories, nothing beats practice. Check out our blog on the courses available online or in South Africa to add some practical experience and coursework to your machine learning journey.

Machine learning books for beginners

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition

Author(s): Trevor Hastie, Robert Tibshirani and Jerome Friedman

This book describes the key ideas of statistical learning, as applied in fields such as medicine, biology, finance, and marketing, in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with liberal use of colour graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees, and boosting, the last of which receives its first comprehensive treatment in any book.

4.0 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Elements-Statistical-Learning-Prediction-Statistics/dp/0387848576/

Machine Learning for Absolute Beginners: A Plain English Introduction

Author(s): Oliver Theobald

Machine Learning for Absolute Beginners, Second Edition has been written and designed for absolute beginners. This means plain-English explanations and no coding experience required. Where core algorithms are introduced, clear explanations and visual examples are added to make it easy and engaging to follow along at home.

4.5 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Machine-Learning-Absolute-Beginners-Introduction/dp/1549617214/

Machine Learning: The Absolute Beginner’s Guide to Learn and Understand Machine Learning Effectively

Author(s): Hein Smith

Just about anyone with the slightest bit of interest in modern technology is looking to learn more about Machine Learning. This innovative new form of computer programming is the primary tool that makes it possible for a machine to perform a wide range of tasks for you that could range from recommending an excellent movie to driving you to work every day.

No doubt, it is the tech of the future. But it is also a subject that can literally boggle the mind. If you’re not already deep into the terminology and techniques of this wildly exciting new industry, finding information on it written in basic layman’s terms can be tough.

Most of the books on the topic assume that you have at least a fundamental knowledge of the subject. If you’re interested in getting a better grasp of just how this new technology works and what it means for the masses, then this is the book for you.

All of it is in very basic, simple English, so you won't need a coding degree to understand it. Here, we discuss all the essential entry-level topics required for the absolute amateur so you can start to make sense of this highly innovative technological advancement.

Machine Learning is becoming an increasingly powerful tool that will have an impact on every aspect of our lives in the future. So, whether you need to find good product recommendations to meet your needs or you want to go all out and live in your own smart home, machine learning will be at the core of it. This book will make it easier to grasp the concepts behind it and get you started on a path that leads to a very bright future.

4.5 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Machine-Learning-Beginners-Understand-Effectively-ebook/dp/B07F5H8BPL/

Understanding Machine Learning: From Theory to Algorithms

Author(s): Shai Shalev-Shwartz and Shai Ben-David

Machine learning is one of the fastest growing areas of computer science, with far-reaching applications. This textbook aims to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides an extensive theoretical account of the fundamental ideas underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics of the field, the book covers a wide array of central topics that have not been addressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; major algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for an advanced undergraduate or beginning graduate course, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics, and engineering.

4.4 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Understanding-Machine-Learning-Theory-Algorithms/dp/1107057132/

Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies

Author(s): John D. Kelleher, Brian Mac Namee and Aoife D’Arcy

A comprehensive introduction to the most essential machine learning approaches used in predictive data analytics, covering both theoretical concepts and practical applications.

Machine learning is often used to build predictive models by extracting patterns from large datasets. These models are used in predictive data analytics applications including price prediction, risk assessment, predicting customer behaviour, and document classification. This introductory textbook offers a detailed and focused treatment of the most crucial machine learning approaches used in predictive data analytics, covering both theoretical concepts and practical applications. The technical and mathematical material is augmented with illustrative worked examples, and case studies illustrate the use of these models in the broader business context.

After discussing the trajectory from data to insight to decision, the book describes four approaches to machine learning: information-based learning, similarity-based learning, probability-based learning, and error-based learning. Each of these approaches is introduced by a nontechnical explanation of the underlying concept, followed by mathematical models and algorithms illustrated by detailed worked examples. Finally, the book considers techniques for evaluating prediction models and offers two case studies that describe specific data analytics projects through each phase of development, from formulating the business problem to implementation of the analytics solution. The book, informed by the authors' many years of teaching machine learning, and working on predictive data analytics projects, is suitable for use by undergraduates in computer science, engineering, mathematics, or statistics; by graduate students in disciplines with applications for predictive data analytics; and as a reference for professionals.

4.5 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Fundamentals-Machine-Learning-Predictive-Analytics/dp/0262029448/

Introduction to Machine Learning with Python: A Guide for Data Scientists

Author(s): Andreas C. Müller and Sarah Guido

Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.

You’ll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.

4.1 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Introduction-Machine-Learning-Python-Scientists/dp/1449369413

Machine Learning with R – Second Edition: Expert techniques for predictive modelling to solve all your data analysis problems

Author(s): Brett Lantz

Machine learning, at its core, is concerned with transforming data into actionable knowledge. This makes machine learning well suited to the present-day era of big data. Given the growing prominence of R's cross-platform, zero-cost statistical programming environment, there has never been a better time to start applying machine learning to your data. Machine Learning with R offers both veterans and beginners in data analytics a robust set of methods to quickly and easily gain insight from their data.

Want to turn your data into actionable knowledge, predict outcomes that make a real impact, and keep developing new insights? R gives you access to all the power you need to master exceptional machine learning techniques.

The second edition of Machine Learning with R provides you with an introduction to the essential skills required in data science. Without shying away from technical theory, it is written to provide focused and practical knowledge to get you building algorithms and crunching your data, with minimal previous experience.

With this book, you'll discover all the analytical tools you need to gain insights from complex data and learn to choose the correct algorithm for your specific needs. Through full engagement with the sort of real-world problems data-wranglers face, you'll learn to apply machine learning methods to deal with common tasks, including classification, prediction, forecasting, market analysis, and clustering. Transform the way you think about data; discover machine learning with R.

4.5 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Machine-Learning-techniques-predictive-modeling/dp/1784393908

More advanced machine learning books to discover

The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake our World

Author(s): Pedro Domingos

In the world's top research labs and universities, the race is on to invent the ultimate learning algorithm: one capable of discovering any knowledge from data, and doing anything we want, before we even ask. In The Master Algorithm, Pedro Domingos lifts the veil to give us a peek inside the learning machines that power Google, Amazon, and your smartphone. He assembles a blueprint for the future universal learner (the Master Algorithm) and discusses what it will mean for business, science, and society. If data-ism is today's philosophy, this book is its bible.

4.2 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Master-Algorithm-Ultimate-Learning-Machine/dp/0465094279/

Advances in Financial Machine Learning

Author(s): Marcos López de Prado

Machine learning (ML) is changing virtually every aspect of our lives. Today ML algorithms accomplish tasks that until recently only expert humans could perform. As it relates to finance, this is the most exciting time to adopt a disruptive technology that will transform how everyone invests for generations. Readers will learn how to structure big data in a way that is amenable to ML algorithms; how to conduct research with ML algorithms on that data; how to use supercomputing methods; and how to backtest their discoveries while avoiding false positives. The book addresses real-life problems faced by practitioners on a daily basis and explains scientifically sound solutions using math, supported by code and examples. Readers become active users who can test the proposed solutions in their particular setting. Written by a recognised expert and portfolio manager, this book will equip investment professionals with the ground-breaking tools needed to succeed in modern finance.

4.5 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Advances-Financial-Machine-Learning-Marcos/dp/1119482089/

Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms

Author(s): Nikhil Buduma and Nicholas Locascio

With the reinvigoration of neural networks in the 2000s, deep learning has become an extremely active area of research, one that’s paving the way for modern machine learning. In this practical book, author Nikhil Buduma provides examples and clear explanations to guide you through major concepts of this complicated field.

Companies such as Google, Microsoft, and Facebook are actively growing in-house deep-learning teams. For the rest of us, however, deep learning is still a pretty complex and difficult subject to grasp. If you’re familiar with Python, and have a background in calculus, along with a basic understanding of machine learning, this book will get you started.

3.7 / 5 stars on Amazon (at the time of writing).

Find it on Amazon: https://www.amazon.com/Fundamentals-Deep-Learning-Next-Generation-Intelligence/dp/1491925612/
