This presentation covers the requirements for getting started with HunchLab 2.0's predictive policing system. It starts with the technical requirements (security, authentication) and then gives guidelines for configuring meaningful predictive models of crime. It concludes with information about related geographic and temporal data sets that are useful in forecasting crime, with recommendations on how to prioritize the data sets to use in HunchLab.
7. Software as a Service Model
• Subscription
– Bug fixes
– Updates
– Hosting / backups / etc.
– 2nd tier support
– Training
• Amazon Web Services infrastructure
– High availability
– Elastic resources
• User load
• Model building processes
8. AWS Infrastructure & Security
• AWS data centers
– Data residency
• US or EU
– Physical security
• AWS employees with permission / 2 factor auth
– Logical access
• Azavea employees with permission / 2 factor auth
– Redundant network / power
– Continuous penetration testing
– 3rd party evaluations
• Best-of-breed services
19. Required Data
• Boundaries
– Shapefile format
– Uploaded in application
– Types
• Jurisdiction boundary (required)
• Organizational layers (divisions, districts, etc.)
• Event data (crimes, calls for service)
– CSV format
– Uploaded via API
20. Required Data
• Event data (crimes, calls for service)
– CSV format
• First row is headers with names as below
– Columns
• datasource (string) - identifies data source
– example: rms
• id (string) - unique identifier for event within data source
– example: 1
• class (string) - class(es) for event separated by pipe
– example: agg|1|23
• pointx (numeric) – longitude
– example: -105.0255345
• pointy (numeric) – latitude
– example: 39.7287494
• address (string) - street address
– example: 340 N 12th Street
21. Required Data
• Event data (crimes, calls for service)
– Columns (continued)
• datetimefrom (ISO8601 datetime) - start time
– example: 2012-01-01T13:00:00Z
• datetimeto (ISO8601 datetime) - end time
– example: 2012-01-01T13:00:00Z
• report_time (ISO8601 datetime) - report time
– example: 2012-01-01T13:00:00Z
• last_updated (ISO8601 datetime) - record update time
– example: 2012-01-01T13:00:00Z
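The column list above can be turned into a file with a few lines of Python. This is a minimal sketch, not an official HunchLab tool: the column names, the pipe-separated class field, and the sample values come from the slides, while the function names (`iso8601_utc`, `events_to_csv`) and the choice to convert every timestamp to UTC are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Column names as listed on the slides, in the order they appear there.
COLUMNS = [
    "datasource", "id", "class", "pointx", "pointy", "address",
    "datetimefrom", "datetimeto", "report_time", "last_updated",
]
DATETIME_COLUMNS = ("datetimefrom", "datetimeto", "report_time", "last_updated")


def iso8601_utc(dt):
    """Format a timezone-aware datetime as ISO 8601 in UTC (assumed convention)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def events_to_csv(events):
    """Serialize event dicts to CSV text; the first row is the header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for event in events:
        row = dict(event)
        # Multiple classes are joined with a pipe, as in "agg|1|23".
        row["class"] = "|".join(event["class"])
        for key in DATETIME_COLUMNS:
            row[key] = iso8601_utc(event[key])
        writer.writerow(row)
    return buffer.getvalue()


# Sample record built from the example values on the slides.
when = datetime(2012, 1, 1, 13, 0, 0, tzinfo=timezone.utc)
sample = [{
    "datasource": "rms",
    "id": "1",
    "class": ["agg", "1", "23"],
    "pointx": -105.0255345,
    "pointy": 39.7287494,
    "address": "340 N 12th Street",
    "datetimefrom": when,
    "datetimeto": when,
    "report_time": when,
    "last_updated": when,
}]
csv_text = events_to_csv(sample)
print(csv_text)
```

Using `csv.DictWriter` keeps the header and the row values in the same order and quotes any address that happens to contain a comma.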
22. Required Data
• Event data (crimes, calls for service)
– Upload via API
• Allows automation of upload process
• Workflow
– Query your database for recent changes
– Transform into CSV format
– POST CSV to HunchLab URL
– Check for import to complete
– Example scripts
• https://github.com/azavea/azavea-hunchlab-examples
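The last two workflow steps (POST the CSV, then check for the import to complete) might be sketched as follows. Everything specific here is an assumption, not the HunchLab API: the URL, the `Authorization` header scheme, and the status strings `"pending"` and `"complete"` are placeholders, and the example scripts in the repository above remain the authoritative reference. The request is built but deliberately not sent.

```python
import time
import urllib.request

# ASSUMPTION: placeholder endpoint and token, not real HunchLab values.
UPLOAD_URL = "https://hunchlab.example.com/upload/"
API_TOKEN = "YOUR-TOKEN"


def build_upload_request(csv_text, url=UPLOAD_URL, token=API_TOKEN):
    """Build (but do not send) a POST request carrying the CSV as its body."""
    return urllib.request.Request(
        url,
        data=csv_text.encode("utf-8"),
        headers={
            "Content-Type": "text/csv",
            # ASSUMPTION: hypothetical token scheme for illustration only.
            "Authorization": "Token " + token,
        },
        method="POST",
    )


def wait_for_import(check_status, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Poll check_status() until it stops returning the hypothetical 'pending'."""
    for _ in range(attempts):
        status = check_status()
        if status != "pending":
            return status
        sleep(delay_seconds)
    return "timeout"


request = build_upload_request("datasource,id\nrms,1\n")
# urllib.request.urlopen(request) would send it; omitted because the URL is fake.

# Simulate an import that finishes on the third status check.
statuses = iter(["pending", "pending", "complete"])
result = wait_for_import(lambda: next(statuses), sleep=lambda seconds: None)
print(request.get_method(), result)
```

Passing the status check and the sleep function in as arguments keeps the polling loop testable without network access, which is useful when the upload runs as a scheduled job.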
25. Crime Models
• Generate predictions
– Automatically built on a regular basis
• Represents one or more crime classes
• Choices to make:
– Crime classes
– Color
– Severity weight
– Patrol Efficacy
27. Crime Models
• Which crimes to model?
– Start with serious events
• Part 1s, etc.
– Add ‘problem’ crime types for your department
• How many models?
– Aim for up to 10 models
• Single crime type vs. combination?
– Does the event happen often enough on its own?
• Example: Homicides as part of Violence
– Is the strategy the same as related crime types?
• Example: Homicides vs. Aggravated Assaults
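One way to approach the "does the event happen often enough on its own?" question is to count events per class in the uploaded data. The sketch below assumes rows shaped like the event CSV (a pipe-separated `class` field); the class labels and the `min_events` threshold are hypothetical, since the slides do not give a cutoff.

```python
from collections import Counter


def count_events_by_class(rows):
    """Count how many events carry each class label."""
    counts = Counter()
    for row in rows:
        # An event may list several classes; count it once per distinct class.
        for label in set(row["class"].split("|")):
            counts[label] += 1
    return counts


def too_rare(counts, min_events):
    """Return the class labels with fewer than min_events events, sorted."""
    return sorted(label for label, n in counts.items() if n < min_events)


# Hypothetical rows; real class labels depend on your records system.
rows = [
    {"class": "homicide|violent"},
    {"class": "assault|violent"},
    {"class": "assault|violent"},
]
counts = count_events_by_class(rows)
# min_events=2 is an arbitrary illustration, not a recommended threshold.
print(too_rare(counts, min_events=2))
```

Classes that show up as too rare are candidates for folding into a broader model, as in the homicides-within-violence example on the slide.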
30. Crime Models
• Severity weights
– How important is it to prevent these crimes?
– RAND cost of crime
• http://www.rand.org/content/dam/rand/pubs/occasional_papers/2010/RAND_OP279.pdf
– NIH publications
• http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2835847/table/T5/
32. Crime Models
• Patrol Efficacy
– What proportion of these events are preventable via patrol activities?
• Example: rape (stranger vs known assailant)
– How effective is patrol against the preventable events?
• Example: street crimes vs indoor crimes
– Expressed as percent (0-100%)
– Examples:
• Robbery: 50%
• Residential Burglary: 20%
• Rape: 5%
33. Crime Models
1. Define set of models via crime classes
2. Assign severity weights
3. Assign patrol efficacy values
4. Assign colors
• Overall Goal
– Craft a set of models that generate predictions for real opportunities for your officers to prevent crime.
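Before entering models in the application, it can help to write the four choices down in one place and sanity-check them. The sketch below is a planning aid under stated assumptions, not a HunchLab configuration format: the `CrimeModel` type, the class labels, the colors, and the severity weights are all hypothetical placeholders. Only the patrol efficacy percentages (50%, 20%, 5%), the 0-100% range, and the "up to 10 models" guideline come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrimeModel:
    name: str
    crime_classes: tuple
    severity_weight: float       # placeholder values below, not real cost figures
    patrol_efficacy_pct: float   # expressed as a percent, 0-100
    color: str

    def __post_init__(self):
        if not self.crime_classes:
            raise ValueError(f"{self.name}: needs at least one crime class")
        if self.severity_weight < 0:
            raise ValueError(f"{self.name}: severity weight must be non-negative")
        if not 0 <= self.patrol_efficacy_pct <= 100:
            raise ValueError(f"{self.name}: patrol efficacy must be 0-100")


def review_model_set(models, max_models=10):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    if len(models) > max_models:
        warnings.append(f"{len(models)} models defined; aim for up to {max_models}")
    names = [m.name for m in models]
    if len(set(names)) != len(names):
        warnings.append("duplicate model names")
    return warnings


# Efficacy values are the examples from the slides; all else is hypothetical.
MODELS = [
    CrimeModel("Robbery", ("robbery",), 10.0, 50, "#d62728"),
    CrimeModel("Residential Burglary", ("burglary_res",), 5.0, 20, "#1f77b4"),
    CrimeModel("Rape", ("rape",), 20.0, 5, "#9467bd"),
]
print(review_model_set(MODELS))
```

The model-count check returns a warning rather than raising an error because the slide frames 10 as a target to aim for, not a hard limit.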
36. Optional Data
• Geographic POIs
– Points, lines, polygons (Shapefile)
– Examples
• Schools
• Transit stops
• Parks
• Bars
• Temporal feeds
– Schedules (CSV)
– Examples
• School calendar
• Sporting events
37. Choosing Data Sets
• Usefulness vs. Complexity
– How strong do you believe the correlation is?
• Example: bars vs hospitals
– How big is the data set?
• Example: schools vs bus stops
– How often does the data change?
• Example: hospitals vs bars
• Availability
– Start with what you have
• Police stations, fire stations, public housing
– Layer in data from other city departments
• Schools, bus stops, liquor licenses
– Fill in gaps (once things are going)
38. Choosing Data Sets
• Risk Terrain Modeling
– Literature reviews
• http://www.rutgerscps.org/pubs.htm
– Factors in 5 or more reviews:
• Drug Activity
• Bars
• Nightclubs
• Schools
• Transportation Hubs
39. Agenda
• Technical Overview
– SaaS
– Authentication
– End-user Requirements
• Setup
– Required Data
– Uploading Crime Data
– Defining Crime Models
• Additional Data Sets
40. Jeremy Heffner
HunchLab Product Manager
jheffner@azavea.com
215.701.7712
Amelia Longo
Business Development Associate
alongo@azavea.com
215.701.7715
340 N 12th St, Suite 402
Philadelphia, PA 19107
215.925.2600
info@azavea.com
www.hunchlab.com