The document provides an overview of data journalism including what it is, sources for finding data, and tools for analyzing and visualizing data. It discusses scraping data from websites, using tools like Google searches, spreadsheets, and APIs to extract structured data. Ethical considerations around scraping are also mentioned. The document concludes with assigning students to group blogs and individual strategies focusing on different aspects of online journalism.
18. “The Tribune’s more than three dozen interactive databases, collectively have drawn three times as many page views as the site’s stories. [75% of traffic]” http://bit.ly/dj2dmz
24. Finding: Data.gov.uk; Guardian datastore; Openlylocal, Open Corporates, Open Charities, Who's Lobbying etc.; FOI requests (WDTK), disclosure logs; Books - British Political Facts
25. Finding – data communities: GetTheData.org; WDMMG forums; MySociety mailing lists; Open Data Cookbook; Wolfram Alpha forum
27. Government - national and local; 'Monitors' - regulators & other bodies; Charities, pressure groups; Institutions - academic, scientific, health; Business, finance; Media, entertainment, sport; Other secondary sources
28. Advanced search: site:gov.uk (etc); filetype:pdf (etc); Imagine the page you hope to find, including jargon etc.; Database contents are invisible; Google News alerts: report OR review
29. Advanced search: "quotes search for exact phrases", e.g. "disclosure logs" site:nhs.uk; + ensures page contains word, e.g. +logs; - omits results with word, e.g. -wooden; * wildcard, e.g. "deaths * custody"; ~ synonyms, e.g. ~deaths
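The search operators on the two slides above can be combined mechanically. The following is a minimal sketch in Python, not part of the original slides: the function name `build_query` and its parameters are hypothetical, and it only assembles a query string to paste into a search box.

```python
def build_query(terms=(), phrases=(), any_of=(), exclude=(),
                site=None, filetype=None):
    """Assemble a search query string from common search operators.

    All parameter names here are illustrative assumptions.
    """
    parts = list(terms)
    # Exact phrases are wrapped in double quotes.
    parts += [f'"{phrase}"' for phrase in phrases]
    # Alternatives are joined with an upper-case OR.
    if any_of:
        parts.append(" OR ".join(any_of))
    # A leading minus omits results that contain the word.
    parts += [f"-{word}" for word in exclude]
    # site: restricts results to a domain, filetype: to a file format.
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


query = build_query(phrases=["disclosure logs"], exclude=["wooden"],
                    site="nhs.uk", filetype="pdf")
print(query)  # "disclosure logs" -wooden site:nhs.uk filetype:pdf

alert = build_query(any_of=["report", "review"])
print(alert)  # report OR review
```

Building the string from parts makes it easy to reuse one search across several domains or file types by changing a single argument.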
32. Feeds and scrapers: RSS, XML, JSON, RDF - and APIs; Scraperwiki; Outwit Hub; Yahoo! Pipes; Spreadsheet formulae (look them up)
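To illustrate why structured formats such as RSS and JSON are easier to work with than a web page, here is a minimal Python sketch using only the standard library. The sample feed, the JSON payload, and the field names (`results`, `name`, `spend`) are invented for illustration and are inlined so the sketch runs without a network call; a real feed or API would have its own structure.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical RSS feed, inlined so no network access is needed.
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example disclosure log</title>
    <item><title>Request A</title><link>http://example.org/a</link></item>
    <item><title>Request B</title><link>http://example.org/b</link></item>
  </channel>
</rss>"""

# Hypothetical JSON response of the kind an API might return.
JSON_SAMPLE = ('{"results": [{"name": "Council A", "spend": 1200}, '
               '{"name": "Council B", "spend": 800}]}')


def rss_items(xml_text):
    """Return the title and link of every <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    return [{"title": item.findtext("title"), "link": item.findtext("link")}
            for item in root.iter("item")]


def json_rows(json_text):
    """Return (name, spend) pairs from the assumed 'results' list."""
    return [(row["name"], row["spend"])
            for row in json.loads(json_text)["results"]]


items = rss_items(RSS_SAMPLE)
rows = json_rows(JSON_SAMPLE)
print(items)
print(rows)
```

Either result can then be written out as rows for a spreadsheet, which is the step where the spreadsheet formulae mentioned on the slide take over.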
38. Ethics: "A problem for sites who want to provide privacy while allowing new users to join easily. Scraping services may constitute a violation of terms of service; tactics often resemble a denial-of-service attack or a security exploit."
42. Books: Darrell Huff - How To Lie With Statistics; Blastland & Dilnot - The Tiger That Isn't; Donna Wong - The WSJ Guide to Information Graphics; Brian Suda - A Practical Guide to Designing with Data
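One way to avoid the denial-of-service resemblance described in the quote is to honour a site's robots.txt and pause between requests. The sketch below is an assumption about good practice rather than anything prescribed by the slides: the robots.txt content, the agent name `example-student-bot`, the URLs, and the stub `fetch` function are all hypothetical, and no real requests are made. Checking a site's terms of service remains a separate, manual step.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_parser(robots_text):
    """Build a robots.txt parser from text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def fetch_politely(parser, urls, fetch, agent="example-student-bot",
                   default_delay=1.0, sleep=time.sleep):
    """Fetch only permitted URLs, pausing between consecutive requests."""
    # Use the site's requested delay if it gives one, else a default.
    delay = parser.crawl_delay(agent) or default_delay
    results = {}
    first = True
    for url in urls:
        if not parser.can_fetch(agent, url):
            continue  # robots.txt asks crawlers to stay out
        if not first:
            sleep(delay)
        first = False
        results[url] = fetch(url)
    return results


# Demo with stubs: record the pauses instead of actually waiting.
waits = []
urls = ["http://example.org/data/a.csv",
        "http://example.org/private/b.csv",
        "http://example.org/data/c.csv"]
results = fetch_politely(make_parser(ROBOTS_TXT), urls,
                         fetch=lambda url: f"fetched {url}",
                         sleep=waits.append)
print(list(results))  # the two /data/ URLs; /private/ is skipped
print(waits)          # [2]
```

Passing `fetch` and `sleep` in as arguments keeps the politeness rules separate from the download code, so the same loop can wrap whichever scraping tool is actually used.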
44. Enough time? 10 credits = 100 hours: Lectures = 15 hours; Group blog = 60 hours (75%); Strategy = 20 hours (25%); (Some in labs); + 5 hours on other issues
45. Enough time? Blog. Just an example: 10 posts ranging from simple links to interviews, analysis, experiment; 5.5 hours average per week x 10 weeks = 55 hours; + 5 hours to write evaluation
46. Enough time? Strategy. Just an example: 12.5 hours researching community; 30 mins per week x10 weeks with community (2.5 hours); 5 hours analysis & write up
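The time budget on the three slides above can be checked with a few lines of arithmetic. This Python sketch uses the hours as stated on the slides; reading the 75% and 25% figures as shares of the 80 assessed hours is an assumption, and the community time is taken as the stated 2.5 hours.

```python
# Hours as stated on the slides.
total_hours = 100  # 10 credits = 100 hours
module = {"lectures": 15, "group blog": 60, "strategy": 20, "other issues": 5}

# The four components should account for the whole module.
module_sum = sum(module.values())

# Assumed reading: percentages are shares of the assessed work only.
assessed = module["group blog"] + module["strategy"]
blog_share = module["group blog"] / assessed
strategy_share = module["strategy"] / assessed

# Blog example: 5.5 hours a week for 10 weeks, plus the evaluation.
blog_hours = 5.5 * 10 + 5

# Strategy example: research, community time (as stated), write-up.
strategy_hours = 12.5 + 2.5 + 5

print(module_sum == total_hours)      # True
print(blog_share, strategy_share)     # 0.75 0.25
print(blog_hours, strategy_hours)     # 60.0 20.0
```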