Open data for journalists
How it’s useful, why it matters

Chris Taggart, OpenCorporates, NICAR, Feb 2012
About OpenCorporates

Now over 36 million companies in 47 jurisdictions
Including 22 US states
A simple (but huge) goal: an
entry for every corporate
legal entity in the world
Based on the company number and jurisdiction
(no monopoly id)
The simple search

Not to be underestimated
Massively reduces friction
(how long will it take you to find and search multiple jurisdictions?)
Allows what-if questions
Potentially generates
stories in its own right
Source for additional info

Addresses, filings, status, websites...
Intl trademarks, UK govt spending, official notices, health & safety...
Other IDs: SEC, CAGE, charity...
Coming soon: lobbying registers
Reconciliation
(matching names to legal entities)


Cleans up messy company names (and previous names), resolving each to a legal entity, and from there to other data
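
As a rough illustration of the reconciliation idea, the sketch below normalizes messy company names and looks them up in a small index of current and previous names. Everything in it is hypothetical: the `normalize` rules, the `REGISTRY` entries, and the jurisdiction codes and company numbers are invented for the example and are not OpenCorporates' actual matching logic or data.

```python
import re

# Hypothetical legal-form words to strip; a real reconciler would need many more.
SUFFIXES = {"ltd", "limited", "inc", "incorporated", "llc", "plc", "corp", "corporation"}

def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and strip trailing legal-form words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

# Invented registry: (jurisdiction, company number) -> current and previous names.
REGISTRY = {
    ("gb", "00000001"): ["Acme Widgets Limited", "Acme Trading Ltd"],
    ("us_de", "0000002"): ["Example Holdings, Inc."],
}

# Index every known name (current or previous) by its normalized form.
INDEX = {}
for entity_id, names in REGISTRY.items():
    for known_name in names:
        INDEX.setdefault(normalize(known_name), set()).add(entity_id)

def reconcile(messy_name: str) -> set:
    """Return the candidate legal entities whose known names match a messy name."""
    return INDEX.get(normalize(messy_name), set())
```

Keying each entity on the pair of jurisdiction and company number, rather than on the name, means a match on a previous name still lands on the same legal entity, which can then be linked to other datasets.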
The database/platform




API: allows all information to be retrieved as data, even searches
Why care about open data?
the freedom argument
Information is the currency of democracy
Thomas Jefferson*

* This quote has also been attributed to Ralph Nader
DATA (replacing "Information") is the currency of democracy
Thomas Jefferson*

We live in a big data world. Our lives are not just governed by data, they are data
The biggest databases in the world are private, not governmental: social networks, search engines, finance companies, supermarkets...
These are enriched by public data that are only available to purchase (getting worse in the USA)

* This quote has also been attributed to Ralph Nader
the journalism argument
Good journalism = data journalism
But the data is complex
Split across multiple datasets
The links are not clear and often redacted
And without both access and the right to reuse the data, you can’t even begin to make sense of it
Even then, it’s HARD
But that’s why it’s worth doing
In short, this only works:

    SELECT companies.*
    FROM companies
    INNER JOIN government_suppliers
      ON companies.id = government_suppliers.company_id
    INNER JOIN directors
      ON companies.id = directors.company_id
    WHERE government_suppliers.total_received > 5000000
      AND directors.convicted_tax_evader = true

if you can get the data
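
To make the query concrete, here is a minimal sketch that runs it against an in-memory SQLite database using Python's standard `sqlite3` module. The table and column names follow the query on the slide; the rows themselves are invented purely for illustration, and the boolean is stored as `1`/`0` (an assumption made so the example also runs on older SQLite versions that lack the `true` keyword).

```python
import sqlite3

# In-memory database with the three tables the slide's query assumes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE government_suppliers (company_id INTEGER, total_received INTEGER);
CREATE TABLE directors (company_id INTEGER, name TEXT, convicted_tax_evader INTEGER);

-- Invented rows, purely for illustration.
INSERT INTO companies VALUES (1, 'Alpha Ltd'), (2, 'Beta Inc'), (3, 'Gamma LLC');
INSERT INTO government_suppliers VALUES (1, 9000000), (2, 7500000), (3, 100000);
INSERT INTO directors VALUES (1, 'Director A', 1), (2, 'Director B', 0), (3, 'Director C', 1);
""")

# The slide's query, with the boolean written as 1 for SQLite.
QUERY = """
SELECT companies.*
FROM companies
INNER JOIN government_suppliers ON companies.id = government_suppliers.company_id
INNER JOIN directors ON companies.id = directors.company_id
WHERE government_suppliers.total_received > 5000000
  AND directors.convicted_tax_evader = 1
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Only a company that appears in all three tables and satisfies both conditions comes back, which is the point of the slide: the join is trivial once the datasets exist and share a common company identifier.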
