About the Data

All crime data is sourced from the Washington DC Metropolitan Police Department's Crime Map website. You can visit the site and download the data yourself here: crimemap.dc.gov

Before discussing my methodology and handling of the data, I must point out the easy-to-overlook concept that what you put in is what you get out. This data includes only crimes that were reported to MPD, and therefore many crimes are most likely missing from this map. There is a high chance that certain crimes get reported more often in certain neighborhoods versus others depending on the type of crime, the relationship between the victim and the offender, and the level of trust toward the police. It is important to keep these points in mind when analyzing the map.

Data has been updated through the end of 2013.

Notes and Methodology

To visualize point data in map form, I used the geographical coordinate data associated with each point to create a shapefile. Data from 2007 - 2010 included X and Y block coordinates in the Maryland State Plane coordinate system (NAD 83, meters). Data from 2011 - 2013 included simple latitude and longitude coordinates. For points that had no associated geographical coordinates, I geocoded their locations using any address information included. Any points that had neither geographical coordinates nor address data were deleted. This removed about 40 data points.
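The decision described above can be sketched as a small triage step. This is only an illustration, not the code used for the map: the field names (x_block, y_block, latitude, longitude, address) are hypothetical and will differ from the column names in the MPD download.

```python
from typing import Optional


def coordinate_source(record: dict) -> Optional[str]:
    """Decide how a crime record should be located, or return None to drop it.

    Field names are hypothetical placeholders, not the real MPD column names.
    """
    def present(key: str) -> bool:
        value = record.get(key)
        return value is not None and str(value).strip() != ""

    if present("x_block") and present("y_block"):
        return "state_plane"  # Maryland State Plane (NAD 83, meters)
    if present("latitude") and present("longitude"):
        return "lat_lon"      # plain latitude and longitude
    if present("address"):
        return "geocode"      # no coordinates, so geocode the address
    return None               # nothing to locate the point with: drop it


# Made-up records for illustration only.
records = [
    {"x_block": 399000.0, "y_block": 136000.0},
    {"latitude": 38.9, "longitude": -77.03},
    {"address": "1400 BLOCK OF EXAMPLE ST NW"},
    {"address": ""},
]
sources = [coordinate_source(r) for r in records]
kept = [r for r, s in zip(records, sources) if s is not None]
```

Records routed to "state_plane" would still need to be reprojected before being combined with the latitude and longitude records, which a GIS tool or projection library can handle.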

Points often overlap due to imprecise geographic information. The location is usually generalized to a certain block of a street rather than given as an exact location, and thus any crime points on that block overlap. I have styled the points to be partially transparent so that users can see whether there are points below, though they will not be able to hover over a covered point to get more details. That said, data quality varies from year to year. Some years have more specific geographic data, giving the appearance that crime increased that year. This is because the points are spread out to their precise locations rather than being lumped on top of one another at the center of the block.

For the point data in the year 2013, I used a method called "jittering" to avoid the overlap issues mentioned above. Using a technique outlined here, I selected points with one or more exact geographic duplicates and relocated them at a random distance within 50 meters of the original point. The tooltip for each point notes whether or not that point has been displaced. Although this method creates more geographic imprecision, it gives a clearer image of the frequency of incidents in areas with high concentrations of crime. Jittering of points does not affect aggregated statistics for choropleth maps.
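A minimal sketch of jittering, for readers who want to try it on their own download. It is not the technique linked above. It assumes latitude and longitude input, moves every point in a duplicate group (the text does not say whether one copy stays put), and converts meters to degrees with a rough constant. The function name and parameters are my own.

```python
import math
import random
from collections import Counter

# Rough meters per degree of latitude; an approximation chosen for this sketch.
METERS_PER_DEGREE_LAT = 111_320.0


def jitter_duplicates(points, max_distance_m=50.0, seed=None):
    """Displace points that share exact coordinates with another point.

    points: list of (lat, lon) tuples.
    Returns a list of (lat, lon, displaced) tuples in the same order.
    """
    rng = random.Random(seed)
    counts = Counter(points)
    result = []
    for lat, lon in points:
        if counts[(lat, lon)] < 2:
            result.append((lat, lon, False))  # unique location: leave as is
            continue
        angle = rng.uniform(0.0, 2.0 * math.pi)      # random direction
        distance = rng.uniform(0.0, max_distance_m)  # random distance in meters
        d_north = distance * math.cos(angle)
        d_east = distance * math.sin(angle)
        new_lat = lat + d_north / METERS_PER_DEGREE_LAT
        # A degree of longitude shrinks with the cosine of the latitude.
        new_lon = lon + d_east / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
        result.append((new_lat, new_lon, True))
    return result


# Made-up coordinates: the first two are exact duplicates, the third is unique.
sample = [(38.9, -77.03), (38.9, -77.03), (38.91, -77.04)]
jittered = jitter_duplicates(sample, seed=1)
```

The third value in each output tuple plays the role of the "displaced" note shown in the tooltips. Aggregated statistics stay unaffected as long as they are computed from the original coordinates rather than the jittered ones.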

Other parts of the data vary in quality by year as well. I decided to include the "Type" or "Method" attribute in all years for any applicable crimes, even though in some years (2012 specifically) this column appears to be mostly populated by "Other(s)" rather than the actual method.

Time of day data was provided only in data from 2011 - 2013.

For the choropleth maps, statistics were gathered by overlaying the point data with census block data and attaching the block number to each crime. Because each census block belongs to a census tract, this allowed me to aggregate data by census tract. In the map showing total crime per 100 people per census tract, I used the total population per census tract according to the 2010 Census, which makes the rates for years other than 2010 slightly inaccurate.
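The aggregation step can be sketched as follows. This is an illustration rather than the actual workflow, which was done by overlay in GIS software. It assumes each crime already carries a standard 15-digit census block GEOID (state, county, tract, block), whose first 11 digits identify the tract. The GEOIDs and population figures below are made up.

```python
from collections import Counter


def tract_from_block(block_geoid: str) -> str:
    """Return the 11-digit tract GEOID from a 15-digit block GEOID.

    Layout: state (2) + county (3) + tract (6) + block (4).
    """
    return block_geoid[:11]


def crimes_per_100_people(block_geoids, tract_population):
    """Compute crimes per 100 residents for each tract.

    block_geoids: one block GEOID per crime incident.
    tract_population: dict mapping tract GEOID to population.
    Tracts with zero population get None instead of a rate.
    """
    counts = Counter(tract_from_block(b) for b in block_geoids)
    rates = {}
    for tract, population in tract_population.items():
        if population > 0:
            rates[tract] = 100.0 * counts.get(tract, 0) / population
        else:
            rates[tract] = None
    return rates


# Made-up example: three incidents across two tracts.
crime_blocks = ["110010001001000", "110010001001001", "110010002011000"]
population = {"11001000100": 400, "11001000201": 250}
rates = crimes_per_100_people(crime_blocks, population)
```

Using a single population year as the denominator, as in the map, keeps the calculation simple at the cost of the small inaccuracy noted above.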

Notes on the Maps

Maps were made using TileMill styling software, are hosted on MapBox, and are embedded on this website using the MapBox API (version 0.6.7).

There is at least one known error in the map functionality. If a user clicks on a crime incident point, the tooltip box becomes static, which is the intended result. However, when the user tries to close the tooltip box, they must click several times before it disappears. I am looking into this error, and I hope that it does not impede the usefulness of the map too much in the meantime.