Tom Cruise in Minority Report using predictive analytics to predict crime

It’s been almost 15 years since we saw the future of crime prevention in “Minority Report”, and today the methods of predicting and preventing crime that seemed pure fantasy back then are beginning to be implemented in various parts of the world. I’ll briefly mention three examples of how analytics is already being used to prevent crime before going into more detail on a fourth: using analytics to predict whether an offender is likely to re-offend.

1. PrePol: Identifying crime “Hot Spots”

In Los Angeles and in England, “predictive policing” (“PrePol”) has been deployed for over two decades. It has, of course, evolved over this time. Today PrePol utilises algorithms to identify crime “hot-spots”. The results have been positive: in double-blind trials, the algorithms have performed twice as well as traditional methods of prevention.

2. Smart Grid Infrastructure: Identifying and Predicting electricity theft

Here in South Africa there has been talk of adopting “smart grids” and “smart meters” that would allow our public electricity utility, Eskom, and the municipalities to predict cable theft and identify illegal connections. In the US, electricity theft is seen as the third-largest form of theft, and the introduction of Meter Data Management and “Smart Grid Infrastructure” means that large packets of data are being generated every single second, ready to be analysed: a true Big Data problem. This type of “Smart Grid Infrastructure” is also being used by water utilities to prevent water theft.

Read my blog post on Finding Value in Big Transaction Data.

3. Big Data Surveillance in China reaching Robocop scenario

Recently Bloomberg reported on the Chinese government’s efforts to prevent crime through Big Data surveillance in a country where data privacy laws are limited. Under government mandate, the country’s largest state-run defense contractor is building an analytics software platform that will be able to cross-reference information from bank accounts, jobs, hobbies, consumption patterns and footage from surveillance cameras to identify potential terrorists. As of the 1st of January of this year, Chinese authorities have been granted access to bank accounts, telecommunications and a national network of surveillance cameras called, oddly enough, Skynet. Picture a world where police officers wearing augmented reality glasses are able to identify each individual in a crowd.

Chinese government surveillance through Big Data resembles Robocop scenario

Once an individual has been identified, everything from their medical records and social media activity to their demographic data and police records is pulled and run through an algorithm. A potential perpetrator of a crime is then identified, and the police can plan or take action. This “Robocop”-type scenario is no longer so far in the future in China.

4. Machine Learning for Predicting Re-offence

This month the "Journal of Empirical Legal Studies" published a paper detailing a large study of whether machine learning could feasibly assist judges in domestic violence arraignment (bail) hearings, i.e. in deciding whether an arrested individual should be granted bail or not.

Machine learning refers to algorithms that learn patterns from data in order to make predictions, and that can be adjusted as the actual outcomes become known so that the accuracy of those predictions continually improves.

The Study

In this study, 28,000 domestic violence arraignment hearings from the period 2007-2011 (the observation period) were assessed. The 28,000 individuals involved were then followed to see whether or not they re-offended in the next two years (the outcome period). The model looked at over 35 characteristics, including previous convictions and charges, and demographic data such as age and gender. Random forests (a machine learning technique that combines the votes of many decision trees) were built to assess the likelihood of re-offending.
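To give a feel for how a random forest arrives at a likelihood, here is a minimal sketch in Python. It is not the study's model: it is a much-simplified forest made of one-split trees ("stumps"), each trained on a bootstrap sample and a random subset of features. The three features, the synthetic data and the labelling rule (three or more prior convictions means re-offence) are all assumptions made up for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels (0.0 means pure)."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, features):
    """Best single split as (feature, threshold, left_label, right_label)."""
    best_score, best = float("inf"), None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not right:  # a threshold at the maximum value splits nothing off
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best_score, best = score, (f, t, majority(left), majority(right))
    if best is None:  # no usable split: predict the majority class everywhere
        m = majority(labels)
        best = (features[0], float("inf"), m, m)
    return best

def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, forest = len(rows), []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict_risk(forest, row):
    """Fraction of stumps that vote 're-offends' for this row."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return sum(votes) / len(votes)

# Synthetic, made-up data: (prior_convictions, prior_charges, age).
data_rng = random.Random(42)
rows = [(data_rng.randint(0, 6), data_rng.randint(0, 8), data_rng.randint(18, 60))
        for _ in range(200)]
# Toy labelling rule, chosen only so that the example is easy to check.
labels = [1 if priors >= 3 else 0 for priors, _, _ in rows]

forest = fit_forest(rows, labels)
low = predict_risk(forest, (0, 2, 30))
high = predict_risk(forest, (5, 2, 30))
print(f"low-priors risk: {low:.2f}, high-priors risk: {high:.2f}")
```

The two test individuals differ only in their prior convictions, so any gap between the two scores comes from the stumps that happened to split on that feature. A real random forest grows much deeper trees and would be trained on the actual case records.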

Machine Learning algorithms predicting likelihood of re-offence

Even with the relatively small number of data fields used, the outcomes were impressive. Typically the courts would have granted bail to the majority of those at arraignment, and 20% of those released were shown to have re-offended in the next 24 months. The analysis showed that the model could have selected better, with only 10% of those it would have released going on to re-offend.
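The comparison behind those two percentages can be sketched with a few lines of Python. The cohort of 20 people, their risk scores and their outcomes are made up so that the arithmetic matches the 20% and 10% figures, and releasing only the lowest-risk half is my own assumption, not the selection rule used in the study.

```python
# Hypothetical cohort of 20 released defendants (made-up data):
# a model risk score for each, and whether they re-offended (1) or not (0).
scores = [i / 20 for i in range(20)]
reoffended = [1 if i in {7, 12, 16, 18} else 0 for i in range(20)]

def reoffence_rate(outcomes):
    """Share of a group that re-offended."""
    return sum(outcomes) / len(outcomes)

# What actually happened: everyone in the cohort was released.
baseline_rate = reoffence_rate(reoffended)

# What the model would have done (assumed rule): release the lowest-risk half.
ranked = sorted(zip(scores, reoffended))
lowest_risk_half = [outcome for _, outcome in ranked[: len(ranked) // 2]]
model_rate = reoffence_rate(lowest_risk_half)

print(f"released as observed:  {baseline_rate:.0%} re-offended")  # 20%
print(f"lowest-risk half only: {model_rate:.0%} re-offended")     # 10%
```

The improvement comes entirely from the ranking: the better the scores separate those who re-offend from those who do not, the lower the re-offence rate among the people the model would release.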

Social implications

In such matters, being able to predict a probability of re-offence is all very well, but the models are not perfect and the errors have significant social consequences. Here courts would need to assess the impact of false positives: identifying someone as a likely repeat offender (although they are not) and denying them bail, which could cost them their job and possibly their home while they are detained. False negatives, where someone is released on bail and re-offends although the model predicted otherwise, are equally serious. Despite the sophistication of the models, they are unlikely to be able to take into account all the subtle information available to a judge.
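The two kinds of error described above can be counted explicitly. The sketch below is illustrative only: the eight predictions and outcomes are made up, and the function name is my own.

```python
def confusion_counts(predicted_high_risk, reoffended):
    """Count true/false positives and negatives from paired 0/1 lists."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for predicted, actual in zip(predicted_high_risk, reoffended):
        if predicted and actual:
            counts["TP"] += 1  # flagged as high risk, and did re-offend
        elif predicted and not actual:
            counts["FP"] += 1  # flagged as high risk, but did not re-offend
        elif actual:
            counts["FN"] += 1  # judged low risk, but re-offended
        else:
            counts["TN"] += 1  # judged low risk, and did not re-offend
    return counts

# Made-up example: 1 = flagged as high risk / did re-offend.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
actual    = [1, 0, 0, 1, 1, 0, 0, 0]

counts = confusion_counts(predicted, actual)
false_positive_rate = counts["FP"] / (counts["FP"] + counts["TN"])
false_negative_rate = counts["FN"] / (counts["FN"] + counts["TP"])
print(counts)  # {'TP': 2, 'FP': 1, 'TN': 4, 'FN': 1}
```

In this toy example one of the five people who did not re-offend would have been wrongly flagged, and one of the three who did re-offend would have been missed. Those two rates are what a court would have to weigh against each other.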

Supporters, however, say that mistakes would be made with or without the model, and that if the model performs better than human judgement alone, it should not be ignored.

This is analogous to credit card transaction fraud models, which can also be run using machine learning. A false positive may lead to a credit card being frozen, which inconveniences the card holder and generates a high number of customer service calls. Banks need to determine their threshold for accepting questionable transactions.
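One way to think about that threshold is to put a cost on each kind of mistake and pick the cut-off that minimises the total. The sketch below assumes made-up fraud scores, made-up costs and a handful of candidate thresholds; a bank would estimate all of these from its own data.

```python
def total_cost(threshold, scored, cost_fp, cost_fn):
    """Cost of flagging every transaction whose score is at or above threshold."""
    cost = 0.0
    for score, is_fraud in scored:
        flagged = score >= threshold
        if flagged and not is_fraud:
            cost += cost_fp  # genuine transaction blocked (false positive)
        elif not flagged and is_fraud:
            cost += cost_fn  # fraud let through (false negative)
    return cost

def best_threshold(scored, cost_fp, cost_fn, candidates):
    """Candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(t, scored, cost_fp, cost_fn))

# Made-up transactions: (model fraud score, 1 if it really was fraud).
transactions = [(0.05, 0), (0.10, 0), (0.20, 0), (0.35, 0),
                (0.40, 1), (0.60, 0), (0.80, 1), (0.95, 1)]
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

# If missed fraud is ten times as costly as a blocked card, flag aggressively.
print(best_threshold(transactions, cost_fp=1, cost_fn=10, candidates=candidates))  # 0.3
# If blocked cards are the costlier mistake, flag only the clearest cases.
print(best_threshold(transactions, cost_fp=10, cost_fn=1, candidates=candidates))  # 0.7
```

The same model gives two different operating points depending purely on how the errors are priced. That is the decision the bank, or the court, still has to make for itself.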


Analytics, and machine learning specifically, are proving to have far-reaching applications not only in industry and commerce, but also in crime prevention and in the courts. Fighting and managing crime empirically is still in its infancy, and this study is just one of many under way.


Image credits: Twentieth Century Fox ("Minority Report" image), Orion Pictures Corp. ("Robocop" and "The Terminator" images)