Finding data BBC 15
1. @PaulBradshaw, Online Journalism Blog
Birmingham City University and City University London
BBC, January 2015
Data Mining
Search, scraping, FOI and feeds
Image by Evan Long
2. 1. Search tips and tools
2. Sources and feeds
3. Data requests
4. Scraping
3. 1. Search tips and tools
2. Sources and feeds
3. Data requests
4. Scraping
4. Don't ask for what you want:
describe what you expect to
find
Search operators
5. What text will it contain?
Where will that text be?
What text will it not contain?
Imagine the data: text
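The three questions above map onto common search operators: quoted phrases for text the page will contain, `site:` and `filetype:` for where that text will be, and a leading minus for text it will not contain. What follows is a minimal sketch, not a definitive tool; the helper name `build_query`, the domain `nhs.uk` and the excluded term `jobs` are illustrative assumptions.

```python
def build_query(contains, site=None, filetype=None, excludes=()):
    """Assemble a search query string from the three 'imagine the data' questions."""
    parts = []
    # What text will it contain? Quote multi-word phrases for an exact match.
    for phrase in contains:
        parts.append(f'"{phrase}"' if " " in phrase else phrase)
    # Where will that text be? Restrict by domain and by file type.
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    # What text will it not contain? Prefix each unwanted term with a minus.
    for term in excludes:
        parts.append(f'-"{term}"' if " " in term else f"-{term}")
    return " ".join(parts)


# Hypothetical example: disclosure logs on an assumed domain, excluding job adverts.
query = build_query(["disclosure log"], site="nhs.uk", excludes=["jobs"])
print(query)
```

Paste the printed string into a search engine that supports these operators; the same helper with `filetype="xls"` narrows results to spreadsheets.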
42. Do it now:
Search for a disclosure log
for a CCG
Search for spreadsheets
mentioning Andrew Mitchell
MP
43. 1. Search tips and tools
2. Sources and feeds
3. Data requests
4. Scraping
44. Audits and transparency data
Parliamentary questions
Reports, research, sources
FOI requests, disclosure logs
Press offices
Public data and databases -
scraping
45. Open data initiatives &
activism (TWFY)
Hackdays e.g. Rewired State
Public data and databases -
scraping
Crowdsourcing or surveys
Social networks
61. Do it now:
Draft an FOI request for a
local body's data dictionary
Use WhatDoTheyKnow (so
others googling codes can
find you)
62. 1. Search tips and tools
2. Sources and feeds
3. Data requests
4. Scraping
63. Automating the repetitive
gathering of data, e.g.
Multiple tables in one page
Webpage tables
Multiple spreadsheets
Multiple PDFs
What is scraping?
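As one illustration of the first case (multiple tables in one page), the sketch below uses only Python's standard library to collect every table in an HTML string and merge the rows, keeping the header row once. It assumes simple, non-nested tables that share the same header; `SAMPLE_PAGE` is invented data for demonstration, and `TableParser` and `combine_tables` are hypothetical names.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None   # row currently being read, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def combine_tables(html):
    """Merge all tables in the page, skipping repeats of the first header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    combined = []
    for table in parser.tables:
        for row in table:
            if combined and row == combined[0]:
                continue  # same header as the first table: keep it only once
            combined.append(row)
    return combined


# Invented sample page with two tables that share a header.
SAMPLE_PAGE = """
<table><tr><th>Year</th><th>Spend</th></tr><tr><td>2013</td><td>100</td></tr></table>
<table><tr><th>Year</th><th>Spend</th></tr><tr><td>2014</td><td>120</td></tr></table>
"""

rows = combine_tables(SAMPLE_PAGE)
print(rows)
```

For a live page the HTML would first need to be downloaded, and real-world tables often need extra cleaning; a dedicated scraping tool or library may be a better fit there.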
86. Do it now:
Identify a website which
has multiple pages or
documents containing data
you could combine
Where's the structure?
Table? URL? Links?
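Two of those structures, a repeating URL pattern and a page of links to documents, can be sketched with the standard library alone. This is an illustration under stated assumptions rather than a finished scraper: the `example.org` addresses, the `{page}` placeholder and the helper names `paged_urls` and `document_links` are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


def paged_urls(pattern, first, last):
    """Structure in the URL: fill a page number into a repeating pattern."""
    return [pattern.format(page=n) for n in range(first, last + 1)]


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def document_links(html, base_url, extensions=(".csv", ".xls", ".xlsx", ".pdf")):
    """Structure in the links: keep only links that point at data documents."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return [
        urljoin(base_url, href)  # turn relative links into absolute URLs
        for href in parser.links
        if href.lower().endswith(extensions)
    ]


# Hypothetical site with numbered pages and a mix of links.
urls = paged_urls("https://example.org/spending?page={page}", 1, 3)
docs = document_links(
    '<a href="/files/a.csv">A</a><a href="/about">About</a>',
    "https://example.org/data/",
)
print(urls)
print(docs)
```

Once the list of page or document addresses exists, the repetitive part of the job is fetching each one and applying the same extraction step.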
87. 1. Search: describe the data
2. Feeds: get regular
updates
3. FOI: request detail, in CSV
format
4. Scraping: look for
structure and repetition
88. Thank you.
Image by Evan Long
@PaulBradshaw, Online Journalism Blog, HelpMeInvestigate
Birmingham City University and City University London
BBC Future Day, September 2014