2. Why project by project
• Each project has different data requirements and issues
• Basic rules / considerations
– Size of dataset
– Confidentiality
– Accuracy: time & location
• Examples
– 20mph zones in London
– Use of satellite imagery in surveys
4. Effect of 20mph zones on road injury
• What effect has the implementation of 20mph zones in London had on road injury between 1986 and 2006?
• GIS used
– Locate 20mph zones
– Link road injury to roads
– Build dataset ready for analysis
5. Methods
• Controlled interrupted time series analysis
– Measures the change in the number of casualties on
each road in London from 1987-2006
– Control group is all “outside” roads in London
• 20 years of road casualty data (STATS19)
• Each road defined by year as
– Inside a 20mph zone
– Adjacent to a 20mph zone
– Outside 20mph zones
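The per-year road classification above can be sketched as follows. This is only an illustration, not the study's actual procedure: the function name, the `inside_from` / `adjacent_from` fields, and the rule that a road changes status from the implementation year onward are all assumptions.

```python
from typing import Optional


def road_status(year: int,
                inside_from: Optional[int],
                adjacent_from: Optional[int]) -> str:
    """Return 'inside', 'adjacent' or 'outside' for one road in one year.

    inside_from / adjacent_from are hypothetical fields: the first year the
    road lay inside (or next to) a 20mph zone, or None if it never did.
    """
    if inside_from is not None and year >= inside_from:
        return "inside"
    if adjacent_from is not None and year >= adjacent_from:
        return "adjacent"
    return "outside"


# Made-up road: adjacent to a zone from 1995, inside a zone from 2001.
statuses = {year: road_status(year, inside_from=2001, adjacent_from=1995)
            for year in range(1987, 2007)}
```

Each road then contributes one row per year, which is what makes the dataset large.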
8. Data considerations
• Large dataset (6 million rows)
– Each statistical run took 6 to 10 hours (20 runs)
• Accuracy
– Collision locations not always accurate; many checks needed
– Roads physically changed over the 20 years
• Confidentiality
– Full road injury dataset confidential
– Confidential server too slow to handle a dataset of this size
– Data anonymised for use elsewhere
10. GIS and mapping in surveys
• Surveys vital part of public health studies
• GIS widely used
– Planning logistics
– Random selection of households
– Population estimation methods
– Locating households for return visits
– Mapping results
11. Using imagery
• Satellite images increasingly available
– Google Earth
– Commercial images
– Commissioned images
• Structures visible
• On screen digitizing
13. Methods: quadrat survey
• Area split into grid
– 50 m² grid defined
– Existing city street grid
• 15 “quadrats” (blocks) per stratum
• Visit each structure
• Population = Population density x Area
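A minimal sketch of the quadrat calculation above, assuming equal-sized quadrats and a simple pooled density. The function name and all numbers are made up for illustration; in particular, the quadrat area used here is an assumption, not a value from the survey.

```python
def quadrat_population(quadrat_counts, quadrat_area_m2, stratum_area_m2):
    """Estimate a stratum's population as density x area.

    quadrat_counts: people counted in each visited quadrat.
    quadrat_area_m2: area of one quadrat (assumed equal for all quadrats).
    stratum_area_m2: total area of the stratum being estimated.
    """
    if not quadrat_counts:
        raise ValueError("at least one quadrat is required")
    sampled_area = quadrat_area_m2 * len(quadrat_counts)
    density = sum(quadrat_counts) / sampled_area  # people per square metre
    return density * stratum_area_m2


# Illustrative values only: 15 quadrats with 10 people each.
estimate = quadrat_population([10] * 15,
                              quadrat_area_m2=2500,
                              stratum_area_m2=1_000_000)
```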
14. Method: manual structure count
• Structures located by eye
• Type of structure determined by user
– Traditional hut
– Non residential building
• Grid used to ensure systematic counting
• Count checked
– Missed features / errors
15. Methods: Population estimation
• Using satellite images to estimate population:
Population = number of structures × number of people per structure
– Number of structures: from manual counts
– People per structure: from small structure occupancy survey
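As a rough sketch of the structure-based estimate above, assuming the occupancy survey is summarised by a simple mean. The function name and the numbers are illustrative only.

```python
def structure_population(n_structures, occupancy_sample):
    """Population = number of structures x mean people per structure.

    n_structures: structures counted on the satellite image.
    occupancy_sample: people found in each structure visited in the
    small occupancy survey.
    """
    if not occupancy_sample:
        raise ValueError("occupancy survey is empty")
    mean_occupancy = sum(occupancy_sample) / len(occupancy_sample)
    return n_structures * mean_occupancy


# Made-up numbers: 1200 counted structures, four surveyed structures.
estimate = structure_population(1200, [4, 5, 6, 5])
```

If structure types differ (for example traditional huts versus non-residential buildings), the same calculation can be run per type and summed.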
16. Methods: random survey
• Select & visit random structures
• Combine pre-located structures and GPS
• Coordinates allow structure to be revisited easily
[Figure: survey structure]
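One way to draw the random structures is sketched below, assuming the digitized structures are held as (id, longitude, latitude) tuples. The data layout, function name and coordinates are hypothetical; a fixed seed is used only so the selection can be reproduced.

```python
import random


def select_structures(structures, n, seed=None):
    """Pick n structures at random, without replacement.

    structures: list of (structure_id, lon, lat) tuples from digitizing.
    seed: optional seed so the same sample can be redrawn later.
    """
    rng = random.Random(seed)
    return rng.sample(structures, n)


# Made-up digitized structures on a small grid of coordinates.
structures = [(i, 30.0 + 0.001 * i, -1.0 - 0.001 * i) for i in range(100)]
to_visit = select_structures(structures, n=10, seed=1)
```

The selected coordinates can then be loaded onto a GPS unit to find each structure in the field and to return to it later.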
19. Data considerations
• Image data licensed
– Images are licensed and may not be shared
– Named users versus number of users
– Obtaining a suitable image within the required time period can be difficult
– Not all locations have identifiable structures
• Confidentiality
– Main dataset not confidential
– Confidential survey data stored separately
• Ethics / Population security
– Dangers of mapping at-risk populations
21. Don’t forget
• Time: How current is the data?
• Preparation: Planning is everything
• Disk space
• Confidentiality & ethics: storage & publication
• Share data wherever possible
22. Tip: Map your data
• Check data as it comes in
• Explore your data
• Use maps at every opportunity
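A simple check on incoming data, sketched here as an illustration, is to flag points that fall outside the expected study area before mapping them. The function name, record layout and bounding box values are assumptions for the example only.

```python
def outside_bbox(points, min_lon, min_lat, max_lon, max_lat):
    """Return the points that fall outside the study-area bounding box.

    points: iterable of (point_id, lon, lat) tuples.
    """
    return [(pid, lon, lat) for pid, lon, lat in points
            if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat)]


# Made-up records; the second has latitude and longitude swapped.
points = [("a", -0.13, 51.51), ("b", 51.51, -0.13), ("c", 0.05, 51.45)]

# Rough, illustrative box around London (not an official boundary).
suspect = outside_bbox(points, min_lon=-0.6, min_lat=51.2,
                       max_lon=0.4, max_lat=51.8)
```

Plotting the flagged points on a map usually shows at once whether they are swapped coordinates, typing errors or genuine outliers.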