The Data Analytics Blog

Our news and views relating to Data Analytics, Big Data, Machine Learning, and the world of Credit.


10 People You Should Take Note Of In The Machine Learning Industry

October 3, 2018 at 8:03 AM

We’ve compiled this list of 10 people you should take note of in machine learning to help you stay informed about the field. It's also useful if you're interested in learning about machine learning, as these are the people who not only influence the industry but are also paving the way forward and shaping its future. We've given a short overview of each person and a video to familiarise yourself with their work, but if you're invested in learning about machine learning, follow them on social media and start reading their publications.

Top influencers in the machine learning industry

Geoffrey Hinton

Geoffrey Hinton, an English-Canadian cognitive psychologist and computer scientist, works for Google as a Vice President and Engineering Fellow, for the University of Toronto as an Emeritus Distinguished Professor, and for the Vector Institute as Chief Scientific Adviser.

He’s made this list because he’s often referred to as the “Godfather of Deep Learning”. He was one of the first researchers in the field of neural networks after receiving his PhD in Artificial Intelligence in 1978. Hinton was one of the researchers who introduced the backpropagation algorithm and the first to use backpropagation for learning word embeddings. His other contributions to neural network research include Boltzmann machines, distributed representations, time-delay neural nets, mixtures of experts, variational learning and deep learning. His research group in Toronto made significant breakthroughs in deep learning that revolutionised speech recognition and object classification.

He aims to discover a learning procedure that is efficient at finding complex structure in large, high-dimensional datasets and to show that this is how the brain learns to see.


Ian Goodfellow


Ian Goodfellow is a research scientist at Google Brain who’s played a significant role in the field of deep learning. He’s best known for inventing GANs (Generative Adversarial Networks), and he’s also the lead author of the textbook Deep Learning.

He studied computer science at Stanford and got his PhD in machine learning from the Université de Montréal (supervised by Yoshua Bengio, who will be discussed later). At Google, he developed a system enabling Google Maps to automatically transcribe addresses from photos taken by Street View cars and demonstrated security vulnerabilities of machine learning systems. His research interests include most deep learning topics, especially generative models and machine learning security and privacy.

He was named as one of the 35 Innovators under 35 by the MIT Technology Review in 2017.


Andrew Ng


Andrew Ng, a British-born Chinese-American computer scientist and entrepreneur, co-founded and led Google Brain, is a former Vice President and Chief Scientist at Baidu, is an adjunct professor at Stanford University and is a co-founder of Coursera (along with Daphne Koller, who is also on this list).

After studying computer science at Carnegie Mellon, Ng earned his master’s from MIT and his PhD from Berkeley. On top of his contributions at Google, Baidu, Stanford and Coursera, he has also launched an online curriculum of classes and a venture that brings AI to manufacturing factories. Most recently, Ng announced the AI Fund, raising $175 million to invest in new start-ups. He’s also the chairman of Woebot and sits on the board of another company. He distributes his book Machine Learning Yearning, a practical guide for those interested in machine learning, for free.


David Kenny


David Kenny is the Senior Vice President of IBM's Watson & Cloud platform. He was formerly the CEO of The Weather Company, which was acquired by IBM in 2016. He received his MBA from Harvard Business School.

While not a computer scientist like the others on this list, he’s nevertheless one to watch in machine learning. As the leader of IBM’s Watson, he’s the one who’ll be bringing machine learning and AI to market and ensuring the business applications are useful. His mission is to bring the world’s most advanced enterprise artificial intelligence capabilities to industry and domain applications. He oversees IBM’s AI platform and portfolio across multiple areas, including health, security, talent, Internet of Things and engagement, and works to identify new areas of growth.


Daphne Koller


Daphne Koller, an Israeli-American, is an Adjunct Professor of Computer Science at Stanford University and a MacArthur Fellowship recipient.

Along with Andrew Ng, she's one of the co-founders of Coursera and served first as co-CEO and then as President. Her research focusses on artificial intelligence and its applications in the biomedical sciences.

Along with Dr Anna Penn of Stanford University, Koller developed PhysiScore, which uses various data elements to predict whether premature babies are likely to have health issues. After leaving Coursera, she joined Calico, a research and development biotech company backed by Google with the goal of combating ageing and associated diseases, as their Chief Computing Officer. She recently left Calico to start Insitro, a drug discovery startup, where she is now CEO.

Daphne Koller received her bachelor’s degree and her master’s degree from the Hebrew University of Jerusalem. She was recognised for her contributions to online education by being named one of Newsweek's 10 Most Important People in 2010, Time magazine's 100 Most Influential People in 2012, and Fast Company's Most Creative People in 2014.


Yoshua Bengio

Yoshua Bengio is a Canadian computer scientist, best known for his work on artificial neural networks and deep learning.

Along with Geoffrey Hinton and Yann LeCun, Bengio is considered one of the three people most responsible for the advancement of deep learning during the 1990s and 2000s. He received his Bachelor of Engineering, Master of Science and PhD in Computer Science from McGill University. He is a faculty member at the Université de Montréal, heads up the Montreal Institute for Learning Algorithms and is co-director of the Learning in Machine & Brain project of the Canadian Institute for Advanced Research.

Bengio also co-founded Element AI, a Montreal-based business incubator that seeks to transform AI research into real-world business applications. He holds a Canada Research Chair in Statistical Learning Algorithms, is an Officer of the Order of Canada, a recipient of the Marie-Victorin Quebec Prize 2017 and a Fellow of the Royal Society of Canada. His goal is to contribute to uncovering the principles giving rise to intelligence through learning, as well as to favour the development of AI for the benefit of all.


Ilya Sutskever


Ilya Sutskever is a computer scientist specialising in machine learning and currently the Research Director of OpenAI, which he co-founded. He is a co-inventor of AlexNet, AlphaGo, TensorFlow, and Sequence to Sequence Learning.

He obtained his BSc, MSc and PhD in Computer Science from the University of Toronto under Geoffrey Hinton, after which he spent time as a postdoc with Andrew Ng at Stanford. He then returned to Toronto to co-found DNNResearch with Hinton, which was acquired by Google shortly after being created. Sutskever went to work for Google Brain as a research scientist. He recently left Google to start the OpenAI institute. He was named one of the 35 Innovators under 35 by MIT Technology Review in 2015, and he was the keynote speaker at NVIDIA NTECH 2018 and AI Frontiers Conference 2018.


Andrej Karpathy


Andrej Karpathy is Director of AI and Autopilot Vision at Tesla. Before Tesla, he was a Research Scientist at OpenAI, working on Deep Learning in Computer Vision, Generative Modeling and Reinforcement Learning.

Karpathy received his BSc from Toronto, his MSc from the University of British Columbia and his PhD from Stanford, where he worked with Fei-Fei Li on Convolutional/Recurrent Neural Network architectures and their applications in Computer Vision, Natural Language Processing and their intersection. During his PhD, he also did two internships at Google, where he worked on large-scale feature learning over YouTube videos, and in 2015 he interned at DeepMind and worked on Deep Reinforcement Learning. Together with Fei-Fei, he designed and taught a new Stanford class on Convolutional Neural Networks for Visual Recognition.



Yann LeCun


Yann LeCun, a French computer scientist, graduated from the Pierre and Marie Curie University with a PhD in Computer Science, during which he proposed an early form of the back-propagation learning algorithm for neural networks. He is considered one of the founding fathers of convolutional nets. After graduating, he joined Geoffrey Hinton as a postdoctoral research associate in Toronto. He is well known for developing many new machine learning methods, such as a biologically inspired model of image recognition called Convolutional Neural Networks, the "Optimal Brain Damage" regularisation methods, and the Graph Transformer Networks method (similar to the conditional random field), which he applied to handwriting recognition and OCR. The bank check recognition system that he helped develop was widely deployed by NCR and other companies, reading over 10% of all the checks in the US in the late 1990s and early 2000s.

He's also one of the primary creators of DjVu image compression technology, used by many websites, notably the Internet Archive, to distribute scanned documents.

LeCun joined New York University in 2003, where he has worked primarily on Energy-Based Models for supervised and unsupervised learning, feature learning for object recognition in Computer Vision, and mobile robotics, and where he became the founding director of the NYU Center for Data Science.

In 2013, LeCun became the first director of Facebook AI Research. LeCun is a member of the US National Academy of Engineering, the recipient of the 2014 IEEE Neural Network Pioneer Award and the 2015 PAMI Distinguished Researcher Award.


Nando de Freitas


Nando de Freitas is a Zimbabwean-born computer scientist and a Professor at the University of Oxford. After completing his BSc and MSc at the University of the Witwatersrand, he completed his PhD at Trinity College in Cambridge. He was a Professor at the University of British Columbia before joining the Department of Computer Science at the University of Oxford, and he also works for Google’s DeepMind. His goal is to develop new ideas, algorithms and mathematical models to extend the frontiers of science and technology and to improve the quality of life of humans and their environment. His work is also a search for knowledge, driven by a desire to understand how brains work and the nature of intelligence in general. De Freitas is a Fellow of the Canadian Institute for Advanced Research and has been awarded the Charles A. McDowell Award for Excellence in Research.



Latest Posts

2021: Pursuing the pockets of profit

During the COVID-19 crisis, the media has focused heavily on the weak economy and stressed South African consumers. Figures show an increase in unemployment, and many of those lucky enough to be employed suffered decreased earnings through salary cuts. All this points to a highly strained economic environment.

Are we entering a mortgage provision spiral?

The South African credit bureau TransUnion recently released data on the performance of various products within the bureau in their “Quarterly Overview of Consumer Credit Trends” for the third quarter of 2020. With the COVID-19 crisis, 2020 was characterised by a severe reduction in account originations and by payment holidays in Q2, followed by a sharp increase in non-performing accounts in Q3 as payment holidays ceased and stressed consumers failed to pay their accounts. The table below illustrates how each product showed worse performance (in terms of accounts moving to 3 months or more delinquent) year-on-year in Q3.

The table typically follows payment hierarchy patterns, with credit cards performing best, but it also illustrates the risk appetite for each product, with clothing, microloans and retail instalments all showing the worst performance. For the retailers, the closure of stores in Q2 meant fewer new good accounts were washing through, so the badly performing books in Q3 were accentuated. What does stand out, however, is the performance of mortgages, which suffered a 350-basis-point slump year-on-year in Q3. This is off a low overall “bad rate” too. Will the mortgage books bounce back, or will we see ourselves enter a mortgage provision spiral as we did in 2008/09?

Provisions in mortgages are unlike other product classes in consumer credit. When the spiral begins, the knock-on effects can be catastrophic, with provisions taking a hard hit. Banks around the world valued their books very differently post-2009 compared to pre-2008. A certain South African retail bank’s mortgage book valuation dropped by over 90% due to the knock-on effect of a mortgage provision spiral.
Now the property market has been subdued for some years (compared to the bullish period leading up to 2008), so we are not expecting a mortgage crisis, but it is possible that a spiral will affect mortgages significantly as we enter a bearish market.

How does the spiral work?

An increase in defaulted loans means the banks will need to make a difficult choice between showing leniency to the defaulting customers and taking strong action, with repossession being the ultimate act. An increase in defaults also typically means that the book is not ageing as expected and that the Probabilities of Default (PDs) experienced are higher than expected.
An increase in defaults typically leads to more repossessions.
More repossessions mean the bank is left with an increased amount of stock (properties) to sell.
More stock will likely mean bigger haircuts (i.e. the difference between the net selling price of the property and its value) as the market becomes a buyer’s market.
More stock, together with the fact that banks will tighten lending criteria, will push property prices down.
Bigger haircuts mean an increase in shortfalls (i.e. cases where the net value received for a property is less than the outstanding balance of the mortgage).
More shortfalls mean fewer voluntary sales to avoid defaulting (in bullish markets, consumers in financial distress may be pushed to sell their property; they’d likely make a profit from the property, thus incurring no shortfall).
Lower house prices also contribute to more shortfalls, and this in turn results in much higher loss-given-defaults (LGDs).
Higher PDs and LGDs push up provisions dramatically.
Fewer voluntary sales to avoid defaulting mean more accounts will now default, and the spiral continues.

The difference between a bullish and bearish market is illustrated in the image below.
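The spiral's effect on provisions can also be sketched numerically. The snippet below is a simplified illustration only, not the method any bank or Principa uses: it assumes a provision equal to PD x LGD x outstanding balance, treats the haircut as a fraction of the property value, and takes LGD to be the shortfall divided by the balance. All function names and figures are hypothetical.

```python
def net_sale_price(property_value: float, haircut: float) -> float:
    """Price realised on a repossessed property; haircut is a fraction of its value."""
    return property_value * (1.0 - haircut)


def shortfall(balance: float, property_value: float, haircut: float) -> float:
    """Amount by which the outstanding balance exceeds the net sale price (never negative)."""
    return max(0.0, balance - net_sale_price(property_value, haircut))


def loss_given_default(balance: float, property_value: float, haircut: float) -> float:
    """Simplified LGD: shortfall as a fraction of the outstanding balance (assumption)."""
    return shortfall(balance, property_value, haircut) / balance


def provision(prob_default: float, balance: float, property_value: float, haircut: float) -> float:
    """Simplified provision: PD x LGD x outstanding balance (assumption)."""
    return prob_default * loss_given_default(balance, property_value, haircut) * balance


# Hypothetical mortgage: R900,000 outstanding on a property valued at R1,000,000.
balance, value = 900_000.0, 1_000_000.0

# Bullish market: low PD, small haircut, so the sale covers the balance.
bullish = provision(prob_default=0.02, balance=balance, property_value=value, haircut=0.05)

# Bearish market: PD rises and the haircut widens, creating a shortfall.
bearish = provision(prob_default=0.06, balance=balance, property_value=value, haircut=0.25)

print(f"bullish provision: {bullish:,.0f}")
print(f"bearish provision: {bearish:,.0f}")
```

In this toy example the bullish case needs no provision because the net sale price exceeds the balance, while in the bearish case the higher PD and the larger haircut compound each other.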
Whether we enter a bearish market and endure a mortgage spiral will depend on whether defaults increase (generally due to the stressed South African economy) and whether banks enforce an increased number of repossessions.

Performance bounce back for the retailers

At Principa we work closely with many retailers, and we are aware that for many of them Q3 saw accounts accelerate to a 3+ arrears state, but thereafter the book improved somewhat. Those who were already stressed accelerated to default, an inevitable ultimate state for some. The survivors, on the other hand, are those resilient to the economic woes who continued to perform well, and new accounts are also being opened. We look forward to establishing whether the same is true for mortgages when the performance figures are released for Q4.

The 7 types of credit risk in SME lending

It is common knowledge in the industry that the credit risk assessment of a consumer applying for credit is far less complex than that of a business applying for credit. Why is this the case? Simply put, consumers are usually very similar in their requirements and risks (homogeneous), whilst businesses have far more varied risk elements (heterogeneous). In this blog we will look at all the different risk elements within a business (here SME) credit application. These are:

Risk of proprietors
Risk of business
Reason for loan
Financial ratios
Size of loan
Risk of industry
Risk of region

Before we delve into this list, it is worth noting that all of these factors need to be deployable as assessment tools within your originations system, so it is key that you ensure your system can manage them. If you are on the lookout for a loans origination system, then look no further than Principa’s AppSmart. If you are looking for a decision engine to manage your scorecards, policy rules and terms of business, then take a look at our DecisionSmart business rules engine. AppSmart and DecisionSmart are part of Principa’s FinSmart Universe, allowing for effective credit management across the customer life-cycle.

The different risk elements within a business credit application

1) Risk of proprietors

For smaller organisations the risk of the business is inextricably linked to the financial well-being of the proprietors. How small is small? The rule of thumb is that companies with up to two or three proprietors should have their proprietors assessed for risk too. This fits in with the SME segment. What data should be looked at? Generally, in countries with mature credit bureaux, credit data is looked at, including the score (there is normally a score cut-off) and negative information such as the existence of judgements or defaults; these are typically used within policy rules.
Businesses whose proprietors have an excessive number of “negatives” may be disqualified from the loan application. Some credit bureaux offer a score for an individual based on the performance of all the businesses with which they are associated. This can also be useful in the credit risk assessment process. Another innovation being adopted internationally is the use of psychometrics in the credit evaluation of the proprietors. To find out more about adopting credit scoring, read our blog on how to adopt credit scoring.

2) Risk of business

The risk of the business should be managed through both scores and policy rules. Lenders will look at information such as the age of the company, the experience of the directors and the size of the company within a score. Alternatively, many lenders utilise the business score offered by credit bureaux. These scores are typically not as strong as consumer scores, as the underlying data is limited and sometimes problematic. For example, large successful organisations may have judgements registered against their name which, unlike for consumers, are not necessarily a direct indication of an inability to service debt.

3) Reason for loan

The reason for a loan is used more widely in business lending than in unsecured consumer lending. Venture capital, working capital, invoice discounting and bridging finance are just some of the many types of loans and facilities available, and lenders need to equip themselves with the ability to manage each of these customer types, whether within originations or collections. Prudent lenders venturing into the SME space for the first time often focus on one or two of these loan types and then expand later, as the operational implications of each type of loan are complex.

4) Financial ratios

Financial ratios are core to commercial credit risk assessment. The main challenge here is to ensure that reliable financials are available from the customer.
Small businesses may not be audited and thus the financials may be less trustworthy.

Financial ratios can be divided into four categories:

Profitability
Leverage
Coverage
Liquidity

Profitability can be further divided into margin ratios and return ratios. Lenders are frequently interested in gross profit margins; these are normally explicit on the income statement. The EBITDA margin and operating profit margin are also used, as well as return ratios such as return on assets, return on equity and risk-adjusted returns.

Leverage ratios are useful to lenders as they reflect the portion of the business that is financed by debt. Lower leverage ratios indicate stability. The leverage ratios assessed often include debt-to-asset, debt-to-equity and asset-to-equity.

Coverage ratios indicate the coverage that income or assets provide for the servicing of debt or interest expenses. The higher the coverage ratio, the better it is for the lender. Coverage ratios are worked out considering the loan/facility that is being applied for.

Finally, liquidity ratios indicate the ability of a company to convert its assets into cash. There are a variety of ratios used here. The current ratio is simply the ratio of current assets to current liabilities. The quick ratio measures the ability of the business to pay off its current debts with readily available assets. The higher the liquidity ratios, the better.

Ratios are used both within credit scorecards and within policy rules.

5) Size of loan

When assessing credit risk for a consumer, the risk of the consumer does not normally change with the loan amount or facility (subject to the consumer passing affordability criteria). With business loans, loan amounts can range quite dramatically, and the risk of the applicant is normally tied to the loan amount requested. The loan/facility amount will of course change the ratios (mentioned in the last section), which could lead to a positive or negative outcome.
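The ratio categories described above, and the way a requested loan amount shifts them, can be sketched as follows. This is a minimal illustration under stated assumptions rather than a definitive scorecard input: the `Financials` fields, the choice of one ratio per category, and the treatment of the new loan (principal added to debt, one year of interest added to interest expense, no effect on cash) are all hypothetical simplifications.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Financials:
    """Hypothetical, simplified statement figures for one SME."""
    revenue: float
    gross_profit: float
    ebitda: float
    total_debt: float
    equity: float
    current_assets: float
    inventory: float
    current_liabilities: float
    interest_expense: float


def ratios(f: Financials) -> dict:
    """Return one illustrative ratio per category (two for liquidity)."""
    return {
        # Profitability: gross profit margin
        "gross_profit_margin": f.gross_profit / f.revenue,
        # Leverage: debt-to-equity
        "debt_to_equity": f.total_debt / f.equity,
        # Coverage: EBITDA relative to interest expense
        "interest_coverage": f.ebitda / f.interest_expense,
        # Liquidity: current ratio and quick ratio (inventory excluded)
        "current_ratio": f.current_assets / f.current_liabilities,
        "quick_ratio": (f.current_assets - f.inventory) / f.current_liabilities,
    }


def with_new_loan(f: Financials, amount: float, annual_rate: float) -> Financials:
    """Add the requested principal to debt and one year of interest to interest expense."""
    return replace(
        f,
        total_debt=f.total_debt + amount,
        interest_expense=f.interest_expense + amount * annual_rate,
    )


# Hypothetical applicant and loan request.
sme = Financials(
    revenue=5_000_000, gross_profit=2_000_000, ebitda=800_000,
    total_debt=1_500_000, equity=1_000_000,
    current_assets=900_000, inventory=300_000, current_liabilities=600_000,
    interest_expense=150_000,
)

for label, fin in (("before loan", sme), ("after loan", with_new_loan(sme, 500_000, 0.12))):
    print(label, {name: round(value, 2) for name, value in ratios(fin).items()})
```

Under these assumptions the requested loan raises debt-to-equity and lowers interest coverage while leaving the margin and liquidity ratios unchanged, which is why the same applicant can pass at one loan amount and fail at another.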
The outcome of the loan application is usually directly linked to a loan amount, and any marked change to this loan amount would change the risk profile of the application.

6) Risk of industry

The risk of the industry in which the SME operates can have a strong bearing on the entity's ability to service the debt. Some lenders use this, and those who do not normally identify it as a missing element in their risk assessment process. The correct identification of industry is always important. If you are in manufacturing but your clients are the mines, then you are perhaps better identified as operating in mining as opposed to manufacturing. Most lenders who assess industry will periodically rule out certain industries and perhaps also incorporate industry within their scorecard. Others take a more scientific approach. In the graph below, the performance of an industry is tracked for two years and then projected over the next 6 months; this is then compared to the country’s GDP. As the industry appears to track above the projected GDP, a positive outlook is given to this applicant, and this may affect them favourably in the credit application.

7) Risk of region

The last area of assessment is risk of region. Of the seven, this one is used the least. Here businesses, either on book or on the bureau, are assessed against their geo-code. Each geo-code is clustered, and the projected outlook is given as positive, static or negative. As with industry, this can be used within the assessment process as a policy rule or within a scorecard.

Bringing the seven risk categories together in a risk assessment

These seven risk assessment categories are all important in the risk assessment process. How you bring it all together is critical. If you would like to discuss your SME evaluation challenges or find out more about what we offer in credit management software (like AppSmart and DecisionSmart), get in touch with us here.