Perfecting the Crime Machine

Source: Deep Learning on Medium

TL;DR: I tried to understand crime patterns around my campus using machine learning. Check out the paper here for more on how I did it.


I go to Drexel University in Philadelphia, and the campus is in the heart of the city. Let me show you something terrible.

These are crime notifications from Drexel’s automatic crime report system, and they show how unsafe it can get around here. After a few years here (4 at the time of this article), one learns how to avoid hotspots, which paths to take at night, and which blocks to avoid when walking to your apartment.

I am not writing this to give Drexel a bad reputation. If anything, taking action and analyzing the harsh reality of city crime is something everyone could and should do.


First, I wanted to define the problem: what are some hotspots around campus that I should be careful of? Then I needed to define what a hotspot is. For this analysis, I treated a group of 100 crime points that are closest to each other, across all years, as a hotspot. To make the models more robust, I also included historical data going back to 2005.


At this point, I needed visuals to assess how we were doing around campus, so I plotted the crime points and the crime distribution as well.

This distribution covers all crimes that happened in Philly, not just on the Drexel campus.

I plotted the crime points and now we can see which ones are around or close to the Drexel campus.
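One way to pick out the points "around or close to" campus is a simple distance filter. The sketch below is only an illustration, not the method from the paper: the campus center coordinates are approximate, and the 1.5 km radius, the function names, and the sample points are all assumptions made up for this example.

```python
import math

# Assumed, approximate coordinates for the center of the Drexel campus (not from the article).
CAMPUS_LAT, CAMPUS_LON = 39.9566, -75.1899
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def near_campus(points, radius_km=1.5):
    """Keep only the (lat, lon) points within radius_km of the assumed campus center."""
    return [
        p for p in points
        if haversine_km(p[0], p[1], CAMPUS_LAT, CAMPUS_LON) <= radius_km
    ]


# Made-up sample points for illustration only; the last one is far from campus.
sample = [(39.9570, -75.1900), (39.9600, -75.1950), (40.0500, -75.0500)]
nearby = near_campus(sample)
```

With real data, the same filter would run over every crime record's latitude and longitude before plotting.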

Let’s put the legend on the map and show where Drexel campus and off-campus student houses are.

As you can see, there are several very dense areas that are intertwined with each other. Honestly, the whole Drexel campus looks like one big crime bubble. The safest area is the black part, which happens to be the Schuylkill River. So, if you want to be safe, just live on the river 😅.

Then I ran a clustering algorithm called k-means (an unsupervised machine learning algorithm) because I wanted to see the hotspots I defined above as distinguishable groups on the map, along with their cluster centers.

In the figure above, each point you see is actually the center of a cluster, which contains 100 crime points regardless of crime type.

So, if all crime points are like this:

Then, running the clustering algorithm produces this:

where each big red bubble is the cluster center.
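To make the clustering step concrete, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm) on 2-D points. It is not the code from the paper. The `kmeans` function, the synthetic data, and the rule of picking k as the number of points divided by 100 are assumptions for illustration. Note that k-means does not guarantee exactly 100 points per cluster; choosing k this way only targets that size on average. In practice a library implementation such as scikit-learn's `KMeans` would be the usual choice.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_centers == centers:
            break  # centers stopped moving, so the assignment is stable
        centers = new_centers
    return centers, labels


# Synthetic data for illustration only: two tight blobs of 100 points each.
data_rng = random.Random(42)
blob_a = [(data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)) for _ in range(100)]
blob_b = [(data_rng.gauss(5.0, 0.1), data_rng.gauss(5.0, 0.1)) for _ in range(100)]
points = blob_a + blob_b

# Assumption: aim for roughly 100 points per cluster on average.
k = max(1, len(points) // 100)
centers, labels = kmeans(points, k)
```

Each entry in `centers` plays the role of one big red bubble on the map, and `labels` records which cluster every crime point was assigned to.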

Chestnut and Walnut St. definitely have one, and there are about 4 hotspots in Powelton Village between 34th St. and 31st St. and between Market St. and Spring Garden. These areas are densely populated with students 😥 😮. Honestly, I would like to see more Public Safety officers patrolling the area, especially from 6pm to 12am, which happens to be the peak period and is when most students leave campus to go back to their homes.

The appendix of the paper has many plots showing counts of several crime types aggregated over hours, months, and years. I will include some here, since people may not read the full paper after the blog post.