David Herzog
Missouri School of Journalism and NICAR
   Locating the data

   Obtaining the data

   Evaluating the data

   Working with the data

   Visualizing the data
 “Database state of mind”


 Data has to exist. Where?
  Online
  Offline
 Government websites
  Data.gov
  U.S. Census Bureau
  FDIC
  Missouri Data Portal
  Missouri Accountability Portal
 U.S. agency FOIA pages
  Drug Enforcement Administration


 NGO sites
  Right-to-Know Network
  OpenMissouri.org
  NICAR database library
  ALA state agency databases wiki
 Commercial services
  Socrata
  Infochimps
  Geocommons
  Foreclosure Radar
  Oil Price Information Service
  Search Systems
  Junar
 Academic data catalogs
  ICPSR


 Forms
  Forms.gov
  Web forms
   ▪ Columbia parade permits
 Records retention schedules


 Reports
  State auditor
  U.S. Government Accountability Office
  U.S. Inspectors General
 Google advanced search
  Look for data files
  Look for key words
  Look only on government sites
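
The three search tips above can be combined into a single query string. The sketch below is only an illustration: the helper name `build_query` and the sample keywords are made up, and it relies on Google's `filetype:` and `site:` search operators to stand in for "data files" and "government sites".

```python
def build_query(keywords, filetype=None, site=None):
    """Assemble a search string from keywords plus optional operators."""
    # Quote multi-word phrases so they are matched as a unit.
    parts = [f'"{k}"' if " " in k else k for k in keywords]
    if filetype:
        parts.append(f"filetype:{filetype}")  # look for data files
    if site:
        parts.append(f"site:{site}")          # look only on certain sites
    return " ".join(parts)


# Hypothetical example: spreadsheets about parade permits on .gov sites.
query = build_query(["parade permits", "Columbia"], filetype="xls", site=".gov")
print(query)
```

Swapping `xls` for `csv`, or `.gov` for a specific agency domain, narrows the search in the same way.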
 Data entry
   In the field
   At the office


 Printouts/reports


 Inspection forms
 Download it


 Write or request a scraper with ScraperWiki


 Convert a PDF with
   CometDocs
   Zamzar


 Just ask for it
 U.S. Freedom of Information Act
  Passed in 1966
  Amended in 1996 to include electronic records


 State open-records statutes
  Missouri Sunshine Law
 Get the roadmap!
  Record layout
  File layout
  Data dictionary
  Code sheet


 Metadata
  Data about the data
 Look at it immediately when you get it
  Is it what you asked for/expected?
  How many rows/records of data?
  Is the file format OK?
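
A first look at a newly received file can be partly scripted. This is a minimal sketch, assuming the data arrives as a comma-delimited text file with a header row; the sample content and the function name `first_look` are invented for illustration.

```python
import csv
import io

# Made-up sample standing in for a file you just received.
SAMPLE = """permit_id,applicant,date
1,Smith,2012-03-01
2,Jones,
3,Lee,2012-03-09
"""


def first_look(handle):
    """Return the header, the record count, and the number of rows with a blank field."""
    reader = csv.reader(handle)
    header = next(reader)   # first line holds the column names
    rows = list(reader)     # every remaining line is one record
    blanks = sum(1 for row in rows if any(cell.strip() == "" for cell in row))
    return header, len(rows), blanks


header, n_rows, n_blank = first_look(io.StringIO(SAMPLE))
print("columns:", header)
print("records:", n_rows)
print("rows with blank fields:", n_blank)
```

Comparing the record count and column names against what the agency said it would send is a quick way to catch a truncated or mislabeled file.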
 Does it look too good to be true?
 Beware of missing information
 Who collected the information?
 How? What are their methods?
 Why?
 What is their agenda?
 Who supports them financially or otherwise?
 Notepad++ for PCs
 TextMate for Mac
 Always keep original file


 Never overwrite data columns


 Tools
   Spreadsheets
   Database managers
   Google Refine
   Programming languages
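
The two rules on this slide (keep the original file, never overwrite data columns) might look like this in a programming language. It is a sketch under stated assumptions: the file name, column names, and cleaning step are hypothetical, and the data is a small made-up CSV.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def make_working_copy(original: Path) -> Path:
    """Copy the original file and return the path of the copy to edit."""
    working = original.with_name(original.stem + "_working" + original.suffix)
    shutil.copy(original, working)
    return working


def add_clean_column(path: Path, source: str, target: str) -> None:
    """Write a cleaned version of `source` into a NEW column `target`."""
    with path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        # The source column is left untouched; the cleaned value goes alongside it.
        row[target] = row[source].strip().title()
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Demo with a throwaway file; the names and values are made up.
tmp = Path(tempfile.mkdtemp())
original = tmp / "inspections.csv"
original.write_text("name,score\n  MAIN STREET GRILL ,91\ncafe roma,78\n")

working = make_working_copy(original)
add_clean_column(working, "name", "name_clean")
print(working.read_text())
```

Because the cleaned values sit in their own column of a separate copy, any cleaning mistake can be checked against, and recovered from, the untouched original.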
 Raw numbers, without context, are rarely interesting.

 Ask: Compared to what?
 Raw (amount) change
   New - Original


 Percent change
   Change/Original (x 100)


 Per capita rates
   Per person
   Per x people
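
As a worked illustration of the three calculations above, here is a short sketch. The function names and all figures are made up for the example.

```python
def raw_change(original, new):
    """Raw (amount) change: new minus original."""
    return new - original


def percent_change(original, new):
    """Percent change: change divided by original, expressed as a percentage."""
    return (new - original) * 100 / original


def rate_per(count, population, per=1):
    """Per capita rate: events per `per` people (per=1 means per person)."""
    return count * per / population


# Made-up figures: burglaries in two years for a city of 100,000 people.
burglaries_before, burglaries_after = 400, 460
population = 100_000

print(raw_change(burglaries_before, burglaries_after))
print(percent_change(burglaries_before, burglaries_after))
print(rate_per(burglaries_after, population, per=1_000))
```

With these numbers the raw change is 60 burglaries, the percent change is 15 percent, and the rate is 4.6 burglaries per 1,000 residents, which is the figure that allows comparison with cities of a different size.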
 Percent of total
   Individual/Total (x 100)


 Ratio
   Apples/oranges


 Averages
   Mean
   Median
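
The measures on this slide can be sketched the same way. The salary list and the student and teacher counts below are invented; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median


def percent_of_total(individual, total):
    """Share of the total held by one item, as a percentage."""
    return individual * 100 / total


def ratio(a, b):
    """How many of `a` there are for each one of `b`."""
    return a / b


# Made-up salaries with one very large value.
salaries = [30_000, 32_000, 35_000, 38_000, 250_000]

print(percent_of_total(30, 120))   # e.g. 30 of 120 inspections failed
print(ratio(450, 30))              # e.g. 450 students to 30 teachers
print(mean(salaries))
print(median(salaries))
```

In this made-up list the mean is 77,000 while the median is 35,000: a single outlier pulls the mean well above what a typical person earns, which is why it helps to check both.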
 Be curious!
 Cut out small slices
 Spreadsheets for simple math and comparisons
 Spreadsheets for pivot tables
 Database managers for more robust analysis
 Always ask: Is this correct?
 Online software platforms


 Desktop software
 Contact David Herzog at


  herzogd@missouri.edu
  Twitter: @davidherzog

A crash course in data for information graphics
