Open Data
Not Just Good. Better
Open Data is Good!
http://www.flickr.com/photos/stolidsoul/433129708/sizes/o/in/photostream/
But we’re not the ones
we need to convince
http://okfestival.org/open-government-data-camp/
Most people don’t
care about ‘open’
http://www.flickr.com/photos/erlin1/9312646298/sizes/l/in/photostream/
Even though open
data is better
(than closed/proprietary)
• Better for innovation
• Better for competition
• Better for efficiency
• Better for sharing (esp. cross-organisation or cross-border)
But open has a secret
weapon
http://www.flickr.com/photos/x-ray_delta_one/8493335701/sizes/l/in/photostream/
It’s better quality too
http://www.flickr.com/photos/infusionsoft/4484373179/sizes/l/in/photostream/
Common proprietary data quality issues

Problem | Cause
Data accuracy | Data is re-keyed. Few eyeballs. Often little downside to lying
Gaps in data | High (& often duplicated) cost of data entry. Limited to payers
Lack of granularity | Legacy systems/data models hard to reengineer in closed world
Errors go uncorrected | Few feedback mechanisms
Black box/No provenance | Can’t reveal (sometimes dubious) sources. Limits usefulness/trust
Isolated | Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality
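
The "Isolated" problem in the table above can be sketched in a few lines. Everything here is hypothetical: the two vendors, their ID formats, the field names and the example company are invented for illustration, and the "open" key is assumed to be a public jurisdiction code plus the registry's company number.

```python
# Hypothetical records from two data vendors describing the same company.
# Each vendor uses its own internal ID; both also happen to carry the
# public registry details (jurisdiction + company number).
vendor_a = [
    {"id": "A-000123", "name": "Example Widgets Ltd",
     "jurisdiction": "gb", "company_number": "01234567"},
]
vendor_b = [
    {"id": "ZX9-77", "name": "EXAMPLE WIDGETS LIMITED",
     "jurisdiction": "gb", "company_number": "01234567"},
]


def join_on(key_fn, left, right):
    """Return (left, right) pairs of records whose keys match."""
    index = {}
    for rec in right:
        index.setdefault(key_fn(rec), []).append(rec)
    return [(l, r) for l in left for r in index.get(key_fn(l), [])]


def proprietary_key(rec):
    # Internal identifier: only meaningful inside one vendor's system.
    return rec["id"]


def open_key(rec):
    # Assumed open identifier: public jurisdiction code + company number.
    return (rec["jurisdiction"], rec["company_number"])


by_proprietary = join_on(proprietary_key, vendor_a, vendor_b)
by_open = join_on(open_key, vendor_a, vendor_b)
print(len(by_proprietary), len(by_open))
```

With the internal IDs the two records never meet; with a shared public identifier they match despite the different spelling of the name.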
A concrete example:
corporate networks
Hugely important
(and valuable)
• The dataset we need to understand
the corporate world
• Who we (or the government) are really
doing business with
• Political influence/donations/lobbying
• Tax/resource extraction
• Corporate Governance
• Credit risk
But proprietary datasets
on this are problematic
• Expensive, so relatively few users
• Huge gaps in data
• Uses proprietary IDs (so not clear
what they refer to)
• Restrictive licences
• Opaque – no info re calculations,
provenance or confidence
Result: low-quality data
The open data
alternative
Enabled by a grant from the
Alfred P Sloan Foundation
Data from disparate
public sources
finding new insights
...and errors too
no such company
What a modern financial
company looks like (highly simplified
& truncated views)
private unlimited company
Crowd-sourcing?
Ninja-sourcing!
http://www.flickr.com/photos/danielygo/5531024732/sizes/l/in/photostream/
The company that wants to know
your network... every friend...
every interaction
http://www.flickr.com/photos/jeffmcneill/5260815552/sizes/l/
why bother?
Facebook, Inc
This is what we got from their SEC filings as text
(and turned into data)
• Pinnacle Sweden AB
• Vitesse LLC
• Facebook Operations LLC
• Facebook Ireland Limited
• Edge Network Services Limited
• Andale Acquisition Corp
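
A minimal sketch of the "text turned into data" step, using only the six names listed above. The one-name-per-line input, the record fields and the legal-form mapping are all assumptions made for illustration; this is not the pipeline actually used, and real filings need far more cleanup.

```python
import re

# The six entity names from the slide, as if copied out of a filing as text.
FILING_TEXT = """\
Pinnacle Sweden AB
Vitesse LLC
Facebook Operations LLC
Facebook Ireland Limited
Edge Network Services Limited
Andale Acquisition Corp
"""

# Assumed mapping from the trailing token of a name to a normalised legal form.
LEGAL_FORMS = {
    "ab": "AB",
    "llc": "LLC",
    "limited": "Limited",
    "ltd": "Limited",
    "corp": "Corp",
    "inc": "Inc",
}


def parse_entities(text, filer):
    """Turn one-name-per-line text into a list of structured records."""
    records = []
    for line in text.splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        # Last word, lower-cased, with punctuation stripped.
        last = re.sub(r"[^a-z]", "", name.split()[-1].lower())
        records.append({
            "name": name,
            "legal_form": LEGAL_FORMS.get(last),  # None if unrecognised
            "source_filer": filer,                # whose filing listed it
        })
    return records


entities = parse_entities(FILING_TEXT, "Facebook, Inc")
for rec in entities:
    print(rec["name"], "|", rec["legal_form"])
```

Once the names are records rather than text, they can be matched against company registers and linked to other entities.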
Then we started
investigating
Facebook, Inc
• Facebook Ireland Limited
• Edge Network Services Limited
• Facebook Cayman Holdings Unlimited I
• Facebook Cayman Holdings Unlimited II
• Facebook Cayman Holdings Unlimited III
• Facebook Cayman Holdings Unlimited IV
• Facebook Ireland Holdings
• Randomus Investments Limited
• Facebook International Holdings I Ltd
• Facebook International Holdings II Ltd
Want to help?
jobs@opencorporates.com
investigators@opencorporates.com

Open Corporate Data: not just good, better
