Thursday, 9 January 2014

The analytical crime watch; why your 13-year-old shouldn’t have a high-end mobile phone

As an inhabitant of the metropolitan area of Rotterdam, I am always curious to know how the city is doing, especially with respect to community safety. In Rotterdam community safety has improved over the past 12 years. Among other things, the number of incidents has gone down significantly and the resolution rates for high impact crime (burglary, street robbery and robbery) have never been higher. Reasons for this improvement are better deployment of the police force, the introduction of city guards and city marines (officials appointed to tackle tough safety problems in a certain part of the city), and the engagement of local people. A few weeks ago mayor Aboutaleb of Rotterdam presented a plan to further improve community safety in the city, #Veilig010. By investing €108 million over the next 4 years, the plan aims to further lower the number of high impact crimes and improve the sense of security in Rotterdam.

Being a data addict and in favour of fact-based decision making, I expected the #Veilig010 plan to be filled with crime statistics and figures supporting its objectives and crime fighting measures. However, that expectation proved to be wrong. For a plan that spends €108 million I expected a more rigorous approach to analysing and fighting high impact crime. To find out more about the current state of community safety I decided to do some research, using the data from the Rotterdam Open Data site and some investigative analytics, with the objective of creating a clear view of high impact crime in Rotterdam. In this blog I’ve put on my data journalist/scientist hat to share the insights I gained from the analysis.


Quetelet, a Belgian statistician who is also responsible for developing the body mass index (BMI), already noted in the early 1800s that crime rates follow a pattern. He wrote that “the seasons in their course exercise a very marked influence: thus during summer the greatest number of crimes against persons are committed.” What is interesting is that street robbery in Rotterdam has a different pattern: it peaks in Q1 and Q4 and is low in summer. Note that the number of reported incidents in Q4 of 2012 shows a sharp decline. I’m not sure whether this is a real decline or is due to missing data. A statistical test showed that the pattern of street robberies differs significantly from a random series and from an even distribution of incidents over the year. So it’s safe to state that the seasonality in the pattern is real.
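To make the idea of testing against an even distribution concrete, here is a minimal sketch in Python (the original analysis was done in R, and the post does not say which test was used). It assumes a chi-square goodness-of-fit test on quarterly counts. The counts are made up for illustration and are not the Rotterdam figures, and the function name `chi_square_uniform` is my own. It also ignores the fact that quarters differ slightly in length.

```python
# Hypothetical quarterly counts of street robberies (Q1..Q4); NOT the Rotterdam data.
quarterly_counts = [310, 240, 215, 295]


def chi_square_uniform(counts):
    """Chi-square goodness-of-fit statistic against an even spread over the categories."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


# Critical value of the chi-square distribution with 3 degrees of freedom at the 5% level.
CRITICAL_VALUE_DF3_5PCT = 7.815

statistic = chi_square_uniform(quarterly_counts)
seasonal = statistic > CRITICAL_VALUE_DF3_5PCT
print(f"chi-square = {statistic:.2f}, reject even distribution: {seasonal}")
```

If the statistic exceeds the critical value, an even spread of incidents over the four quarters is rejected at the 5% level, which is the kind of evidence the seasonality claim rests on.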

Besides a seasonal pattern, street robberies also show a distinct pattern over the days of the week and the time of day. The majority of the street robberies happen after 12:00, with the highest number of incidents between 18:00 and 23:59. The table shows the distribution for 2012, but it is similar for 2011. Of all weekdays, Friday seems to be the favourite day for street robbers, making Friday between 18:00 and 23:59 the most dangerous moment to be out on the streets of Rotterdam. In analysing the distribution of high impact crimes over the weekdays for both 2011 and 2012, a (statistically significant) shift towards more offences during weekends in 2012 was found. Having police in the streets at the moments when high impact crime is most likely to occur should reduce the number of incidents, or at least increase the probability of catching the offenders. Knowing the distribution of high impact crime over the year, the week and the day therefore enables more effective deployment of the available police forces.
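The weekday and time-of-day breakdown described above boils down to a simple cross-tabulation. The sketch below shows one way to build it in Python (the original work was done in R). The timestamps are invented for illustration, the six-hour blocks follow the periods mentioned in the text, and the names `incidents`, `TIME_BLOCKS` and `weekday_and_block` are my own assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical incident timestamps; the real records would come from the open data set.
incidents = [
    "2012-01-06 19:30",  # Friday evening
    "2012-01-06 22:15",  # Friday evening
    "2012-01-07 01:40",  # Saturday night
    "2012-01-11 14:05",  # Wednesday afternoon
    "2012-01-13 23:59",  # Friday evening
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
TIME_BLOCKS = ["00:00-05:59", "06:00-11:59", "12:00-17:59", "18:00-23:59"]


def weekday_and_block(timestamp):
    """Map a timestamp to its (weekday name, six-hour block) cell."""
    moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    return WEEKDAYS[moment.weekday()], TIME_BLOCKS[moment.hour // 6]


# Count incidents per (weekday, time block) cell of the table.
table = Counter(weekday_and_block(t) for t in incidents)
busiest_slot, busiest_count = table.most_common(1)[0]
print(busiest_slot, busiest_count)
```

With real data, the same table for 2011 and 2012 would be the input for testing whether the distribution over weekdays has shifted.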


Next I analysed the age and gender of the victims of street robbery and found that the number of incidents decreased as the age of the victim increased. I expected the elderly to be the prime victims (easy target), but the data show a different picture. Surprisingly, 32% of the victims are between 12 and 17 years old. Also, 60% of all the victims are male (all 2012 figures). The distribution of the number of victims per age category differs (statistically significantly) between males and females. Drilling down on the time of day and day of the week, most of the female victims are robbed between 18:00 and 23:59 on a Friday. For men, multiple peaks in the number of street robberies are found, including high incident periods on Friday between 18:00 and 23:59 and on Saturday and Sunday between 00:00 and 05:59. These results are useful for informing the right people about the potential danger of street robbery and what they can do to prevent it. They can also support countermeasures to reduce the number of street robberies, like the presence of police or city guards at the right time in areas where high risk age categories go, for example nightlife areas or schools.


After finding out the when and who, I analysed what was robbed. From the above bar chart it is immediately clear that a mobile phone, a bag and a wallet are the top 3 items. In 2011, in about 16% of the incidents nothing was stolen. That figure decreased dramatically to about 5% in 2012. My guess (or hope?) is that in 2011 the registration of stolen items was not sound enough; otherwise the success rate of the robbers has gone up. That would be something to worry about. When comparing 2011 and 2012, a (statistically) significant shift towards mobile phones is found. Looking at the breakdown of brands, the usual suspects come forward: BlackBerry, Samsung and iPhone. In 2011 a BlackBerry, at that time a popular brand, was stolen in nearly 60% of the cases. In 2012 this shifted towards iPhone and Samsung. Given that 32% of the victims are between 12 and 17 years of age, not giving in to the wish of your kids to have a high-end mobile like an iPhone or Samsung might be a good prevention measure. Better settle for an Acer, the least stolen brand in this case.


Now that we know the when, who and what, the next question that pops up is: where are these street robberies occurring? The map shows all the places where street robberies have taken place. Based on this map it’s very difficult to deduce any useful insights. Some exploratory spatial data analysis of the point pattern will bring the insights we are looking for. By clustering the point pattern by postal code, the postal code areas with a high number of street robberies can be identified. The straightforward count of the number of street robberies per postal code area shows that the more densely populated areas have more street robberies, as is to be expected. But are these areas really the risky places?

Using postal code areas as a grid to cluster the point pattern allows for combining the street robbery data with other demographics, like the number of inhabitants of the area, the type of area or the income distribution, as this information is usually available at postal code level. A better measure to identify risky places (hot spots) is the number of robberies relative to the number of inhabitants of the area. Then a totally different picture arises: the less inhabited and more remote parts of the city come forward. From this simple but effective spatial analysis, police can learn where most of the street robberies are taking place but also where the risky places are. This can support the deployment of police forces in the city to further reduce the number of street robberies and increase the sense of security in the city.
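The difference between counting robberies and relating them to the number of inhabitants is easy to show with a small sketch in Python (again, the original analysis was done in R). The area labels and all numbers below are hypothetical and chosen only to illustrate the effect. Expressing the rate per 1,000 inhabitants is my own choice of scale.

```python
# Hypothetical postal code areas with robbery counts and inhabitants;
# these are illustrative numbers, not real Rotterdam figures.
areas = {
    "area_A": {"robberies": 120, "inhabitants": 15000},
    "area_B": {"robberies": 45, "inhabitants": 3000},
    "area_C": {"robberies": 80, "inhabitants": 12000},
}


def rate_per_1000(robberies, inhabitants):
    """Street robberies per 1,000 inhabitants."""
    return 1000 * robberies / inhabitants


# Ranking by raw count favours the densely populated areas...
by_count = sorted(areas, key=lambda a: areas[a]["robberies"], reverse=True)
# ...while ranking by rate highlights where the risk per inhabitant is highest.
by_rate = sorted(areas, key=lambda a: rate_per_1000(**areas[a]), reverse=True)

print("by count:", by_count)
print("by rate: ", by_rate)
```

In this made-up example the sparsely populated area_B has the fewest robberies but the highest rate, which mirrors the shift in the picture described above. Note that a rate per inhabitant says nothing about visitors, so for areas such as nightlife districts another denominator may be more appropriate.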

There are many more insights to be gained from the available data, for example by taking the surroundings of the place of the robbery into account. The above approach and visualisation of the data can help policy makers understand high impact crime better and identify the factors of interest in answering the who, when, where, how and what questions, maybe even the why. By visualising the data, answers come forward to questions you didn’t know you had. Based on these insights, fact-based countermeasures can be developed, resulting in a more rigorous approach to fighting high impact crime. Perhaps the approach for #Veilig010 next time?

Note: Data analysis, statistical testing and visualisations were all done in R.