Data 2: Interrogating, visualising, mashing



   Online Journalism
   City University
   Paul Bradshaw
Monday, 7 March 2011
Themes


   5 things you need to know about each
   Data journalism in action
   Walkthrough



Interrogating data




5 things you need to know about interrogating data

   1. Data always needs cleaning up
   2. Treat the ‘source’ like a source
   3. Use the right ‘average’ and percentage
   4. Variation over time & space: context
   5. Spreadsheet tools are your friend - but always keep backup copies
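
Point 3 can be made concrete with a small sketch. The figures below are invented purely for illustration. It shows how a single outlier pulls the mean away from the median, and how a percentage-point change differs from a percentage change.

```python
from statistics import mean, median

# Hypothetical salaries (invented for illustration): one very high earner.
salaries = [18_000, 21_000, 22_000, 24_000, 25_000, 250_000]

mean_salary = mean(salaries)      # pulled upwards by the outlier
median_salary = median(salaries)  # the middle value, robust to the outlier


def percentage_change(old, new):
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100


# A rate rising from 4% to 5% is a rise of 1 percentage point,
# but a much larger increase relative to where it started.
point_change = 5 - 4
relative_change = percentage_change(4, 5)

print(f"mean: {mean_salary:.0f}, median: {median_salary:.0f}")
print(f"{point_change} percentage point, {relative_change:.0f}% increase")
```

Which 'average' is right depends on the story: the median is usually the safer description of a typical case when the data is skewed.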
“What the Independent have done
 is confuse the UK’s deficit with our
 debt [making] the debt problem
 look around eight times worse than
 it is. And it used the whole of its
 front page to do so.”

                        - James Ball
What is the data worth?


   Measurement doesn't answer anything if there's only one variable
   Statistical significance
   Sample size and selection
   Controls and the placebo effect
   Read up.
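
As a rough illustration of why sample size matters, the sketch below uses the standard normal-approximation margin of error for a proportion at roughly 95% confidence (z = 1.96). The poll sizes are invented, and the formula assumes a simple random sample, which is exactly what the 'selection' question asks you to check.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple
    random sample of size n (normal approximation, z=1.96 for ~95%)."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical polls all reporting 50% support, with different sample sizes.
for n in (100, 1_000, 10_000):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: 50% plus or minus {moe * 100:.1f} points")
```

The margin shrinks only with the square root of the sample size, so a tenfold bigger sample does not give a tenfold more precise answer.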
   1. Variance is interesting.
   2. Variance is different for different variables and in different populations.
   3. The amount of variance is easily quantified.
                       - Philip Meyer, Precision Journalism
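
Meyer's third point, that variance is easily quantified, can be sketched with Python's standard library. The two series below are invented for illustration; they share the same mean but vary very differently.

```python
from statistics import mean, pstdev, pvariance

# Two hypothetical sets of monthly figures with the same mean.
steady = [48, 50, 52, 50, 49, 51]
volatile = [10, 90, 35, 65, 20, 80]

for name, values in (("steady", steady), ("volatile", volatile)):
    print(
        name,
        "mean:", mean(values),
        "variance:", round(pvariance(values), 1),
        "std dev:", round(pstdev(values), 1),
    )
```

Here `pvariance` and `pstdev` treat the figures as the whole population; for a sample, `variance` and `stdev` from the same module would be the usual choice.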


Getting data in the right form


   Data > Text to columns
   Find & replace
   Conditional formulas:
   =IF(condition, if met, if not)
   =COUNTIF(range, test)
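
For readers who want to see what these spreadsheet operations do under the hood, here is a rough Python analogue. The rows and the delimiter are invented for illustration, and this is a sketch of the idea rather than of how any spreadsheet implements it.

```python
# Hypothetical raw rows, as they might arrive in a single spreadsheet column.
raw_rows = [
    "Camden; £1,200",
    "Hackney; £950",
    "Camden; £2,400",
    "Islington; n/a",
]

cleaned = []
for row in raw_rows:
    # "Text to columns": split one cell into two on the delimiter.
    borough, amount = [part.strip() for part in row.split(";")]
    # "Find & replace": strip the currency symbol and thousands separator.
    amount = amount.replace("£", "").replace(",", "")
    # =IF(condition, if met, if not): a number if the cell is numeric, else None.
    value = int(amount) if amount.isdigit() else None
    cleaned.append((borough, value))

# =COUNTIF(range, test): count the rows that pass a test.
camden_rows = sum(1 for borough, _ in cleaned if borough == "Camden")
over_1000 = sum(1 for _, value in cleaned if value is not None and value > 1000)

print(cleaned)
print("Camden rows:", camden_rows, "| over 1,000:", over_1000)
```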

Walkthrough: cleaning data in Google Refine

   Edit cells > common transforms
   Edit cells > split multi-valued cells
   Facet > text facet
   Export...
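
The menu steps above can be approximated in a few lines of Python. This is only an analogue to show what each step achieves, not Refine's own implementation; the cell values and the `|` separator are invented.

```python
from collections import Counter

# Hypothetical messy column: stray whitespace, inconsistent case,
# and several values packed into one cell.
cells = ["  Labour ", "labour", "Conservative|Lib Dem", "CONSERVATIVE ", "Lib Dem"]

# "Split multi-valued cells": one value per row.
split_cells = [value for cell in cells for value in cell.split("|")]

# "Common transforms": trim whitespace and normalise case.
normalised = [value.strip().title() for value in split_cells]

# "Text facet": list each distinct value with its count, so that
# near-duplicates and typos stand out.
facet = Counter(normalised)

for value, count in facet.most_common():
    print(f"{value}: {count}")
```

In practice you would often run the facet first to see the mess, then apply the transforms and facet again to confirm the values have merged.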


Visualising data




5 things you need to know about visualising data

   1. Choose the chart for the purpose
   2. It can be used to spot a lead
   3. Good design is when there’s nothing more to take away
   4. It should be self-contained & have references
   5. Be careful with scales and classes
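
To illustrate point 5, the sketch below sorts the same invented values into four classes in two ways: equal-width bands and quantiles. The helper functions are hypothetical names written for this example, not part of any mapping tool.

```python
from statistics import quantiles

# Hypothetical values for ten areas, with one extreme outlier.
values = [2, 3, 3, 4, 5, 6, 7, 8, 9, 95]


def equal_interval_class(value, low, high, n_classes):
    """Class index (0-based) when the range is cut into equal-width bands."""
    width = (high - low) / n_classes
    return min(int((value - low) / width), n_classes - 1)


def quantile_class(value, cut_points):
    """Class index (0-based) when each band holds a similar number of values."""
    return sum(value > cut for cut in cut_points)


cuts = quantiles(values, n=4)  # three cut points for four classes
equal = [equal_interval_class(v, min(values), max(values), 4) for v in values]
quant = [quantile_class(v, cuts) for v in values]

print("equal interval:", equal)
print("quantile:      ", quant)
```

With equal-width bands the outlier leaves nine of the ten areas in the lowest class, so a shaded map would look almost uniform; quantiles spread the areas across all four classes. Neither is 'correct', but the choice changes the picture the reader sees.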
or http://chartchooser.juiceanalytics.com/
What is wrong with this picture?

http://simplecomplexity.net/statistics-without-context/


http://junkcharts.typepad.com/junk_charts/trifecta-checkup/

Visualisation tools


   ManyEyes
   Tableau
   Wordle, Tagxedo
   BatchGeo
   Gephi
   Delicious.com/paulb/visualisation+tools
Walkthrough: visualising data with Google Gadgets

Walkthrough: visualising data in ManyEyes

Mashing data




5 things you need to know about mashing data

   1. It is what a journalist does best
   2. Look for a point of connection: place? person? company? date?
   3. What an API can do
   4. What APIs there are
   5. Mashups can be live, updated or static
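
The 'point of connection' in item 2 is what a programmer would call a join key. A minimal sketch, using two invented datasets that share a company name; the records and the `join_key` helper are hypothetical.

```python
# Two hypothetical datasets that share a point of connection: the company name.
donations = [
    {"company": "Acme Ltd", "donated_to": "Party A", "amount": 50_000},
    {"company": "Bolt plc", "donated_to": "Party B", "amount": 20_000},
]
contracts = [
    {"company": "ACME LTD ", "contract": "Road repairs", "value": 1_200_000},
    {"company": "Crate Co", "contract": "School meals", "value": 300_000},
]


def join_key(name):
    """Normalise the key so trivial differences don't block a match."""
    return name.strip().lower()


contracts_by_company = {join_key(row["company"]): row for row in contracts}

# Keep only donors that also appear in the contracts data.
mashed = []
for donation in donations:
    match = contracts_by_company.get(join_key(donation["company"]))
    if match is not None:
        mashed.append({
            "company": donation["company"],
            "donated_to": donation["donated_to"],
            "amount": donation["amount"],
            "contract": match["contract"],
            "contract_value": match["value"],
        })

print(mashed)
```

A match produced this way is a lead to check, not a finding: two records with the same name may still refer to different companies.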
Mashup tools


   Yahoo! Pipes
   OpenHeatMap
   Mapalist
   xFruits
   Scraperwiki
   Maptube
Walkthrough: making mashups with Yahoo! Pipes

   Inputs - Fetch Feed, CSV, Data, Page, YQL, Flickr, Form
   Operators - Filter, Sort, Unique, Union, Count, Split, Rename, Regex, Location extractor, URL Builder
   Outputs - Map, Gallery, List, XML, KML
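
The input, operator and output pattern can be sketched in plain Python. This is an analogue of the idea only, not of how Yahoo! Pipes itself works; the feed items are invented and stand in for what two Fetch Feed modules might return.

```python
# Hypothetical items, standing in for the results of two "Fetch Feed" inputs.
feed_a = [
    {"title": "Council cuts library hours", "link": "http://example.com/1", "date": "2011-03-01"},
    {"title": "Match report", "link": "http://example.com/2", "date": "2011-03-02"},
]
feed_b = [
    {"title": "Council budget vote delayed", "link": "http://example.com/3", "date": "2011-03-05"},
    {"title": "Council cuts library hours", "link": "http://example.com/1", "date": "2011-03-01"},
]

# Union: merge the two inputs into one stream.
items = feed_a + feed_b

# Filter: keep only items whose title mentions the keyword.
items = [item for item in items if "council" in item["title"].lower()]

# Unique: drop items that repeat an earlier link.
seen = set()
unique_items = []
for item in items:
    if item["link"] not in seen:
        seen.add(item["link"])
        unique_items.append(item)

# Sort: newest first (ISO dates sort correctly as strings).
unique_items.sort(key=lambda item: item["date"], reverse=True)

# Output: a plain list, where Pipes would offer a map, gallery, XML or KML.
for item in unique_items:
    print(item["date"], item["title"])
```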
Walkthrough: making mashups with OpenHeatMap

   Format the spreadsheet
   Publish it as CSV
   Copy link
   Paste it at OpenHeatMap
   Fix any problems
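
The first two steps amount to producing a tidy CSV with one row per place. A minimal sketch follows, assuming one column naming the place and one holding the number to be mapped. The header names and figures here are assumptions for illustration, so check OpenHeatMap's own guidance for the column names it recognises.

```python
import csv
import io

# Hypothetical figures. The header names are an assumption for illustration,
# not a documented OpenHeatMap requirement.
rows = [
    {"place": "Camden", "value": 12},
    {"place": "Hackney", "value": 30},
    {"place": "Islington", "value": 7},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["place", "value"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Most of the 'fix any problems' step tends to come from this stage: stray header rows, merged cells, or numbers stored as text with currency symbols.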

Walkthrough: grabbing geo data with Google Refine

   Edit column > Add column by fetching URLs
   Use GREL (Google Refine Expression Language)
   Search web for help & examples
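
Conceptually, 'add column by fetching URLs' builds one request URL per row from a cell value, then parses what comes back. The Python sketch below shows those two halves without touching the network. The endpoint, the query parameters and the response shape are all hypothetical; substitute the documented URL and fields of whichever geocoding service you use.

```python
import json
from urllib.parse import urlencode

# Hypothetical geocoding endpoint, for illustration only.
BASE_URL = "https://geocoder.example.com/search"


def build_url(address):
    """Build the request URL for one cell value, escaping spaces and commas."""
    return BASE_URL + "?" + urlencode({"q": address, "format": "json"})


def extract_lat_lng(response_text):
    """Pull latitude and longitude out of the (assumed) JSON response."""
    data = json.loads(response_text)
    return data["lat"], data["lng"]


addresses = ["Northampton Square, London", "10 Downing Street, London"]
urls = [build_url(address) for address in addresses]

# A canned response standing in for what the fetch would return.
sample_response = '{"lat": 51.5, "lng": -0.1}'

print(urls[0])
print(extract_lat_lng(sample_response))
```

In Refine the URL-building half is written as a GREL expression on the source column, and the parsing half as a second expression on the fetched column.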

Questions?




Links


   OnlineJournalismClasses.tumblr.com
   Delicious.com/paulb/cityoj09
   Delicious.com/paulb/datajournalism
   Delicious.com/paulb/visualisation
   Delicious.com/paulb/statistics
   Delicious.com/paulb/mashups
Lab


  Before the lab: play with these techniques yourself, have problems, find solutions, raise questions.
  Install Google Refine and Tableau on your laptop so you can use them in the lab.
  - Visualise, interrogate or mash data
Books


   Kaiser Fung - Numbers Rule Your World
   Ben Goldacre - Bad Science
   Donna Wong - The WSJ Guide to Information Graphics
   Brian Suda - A Practical Guide to Designing with Data
