Data All the Way Down
Jeni Tennison
@JeniT
http://www.jenitennison.com/blog/
Data All the Way Down
• challenges of complex open data
• layered approach to data publishing
• essential steps
• benefits
Complex Datasets
• too much for a single spreadsheet
• need to navigate
 • browse through data
 • look at slices of larger dataset
 • view summary statistics

• need to explain
 • definitions of terms, provisos & disclaimers
User Challenge
• complex datasets have a range of users
 • different hardware / platforms
 • different tasks / goals
 • different ability / understanding

• no one interface satisfies everyone
• data owners cannot satisfy everyone
• create ecosystem around open data
visualisation / data gap: end user vs reuser
Visualisations
• approachable for real people
• necessary for stakeholder buy-in
• beauty is in what's left out
 • advertisement or taster of rich datasets
 • often not possible in official data

• leaves questions unanswered
 • what if we looked at the data in a different way?
Raw Data
• importable into own data store
 • often only interested in particular slice
 • dataset may be massive / changing

• run whatever analysis you want
 • requires at least some programming skills
 • analysis might not be appropriate for the data

• documentation probably lacking
bridging the gap: layered data access

Photo by Nikita Kravchuk http://www.flickr.com/photos/mi55er/3845619153/
Layered Architecture
• user interface
 • navigation and global understanding

• API
 • curated, targeted, programmable access

• query
 • free-form programmable access

• raw data
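
The four layers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `RAW_POSTS`, the field names and every function name are hypothetical, not taken from any real service. The point it tries to show is that each layer is built only on the layer beneath it, so a reuser can step in at whichever level suits them.

```python
from collections import Counter

# Raw data layer: hypothetical rows standing in for a published data dump.
RAW_POSTS = [
    {"id": "p1", "unit": "Finance", "grade": "SCS1", "reports_to": None},
    {"id": "p2", "unit": "Finance", "grade": "G7", "reports_to": "p1"},
    {"id": "p3", "unit": "Policy", "grade": "G7", "reports_to": "p1"},
]


def query(predicate):
    """Query layer: free-form access, the caller supplies any filter."""
    return [row for row in RAW_POSTS if predicate(row)]


def api_posts_in_unit(unit):
    """API layer: a curated, targeted question answered via the query layer."""
    return query(lambda row: row["unit"] == unit)


def api_unit_summary():
    """API layer: summary statistics, again built on the query layer."""
    return dict(Counter(row["unit"] for row in query(lambda row: True)))


def ui_unit_page(unit):
    """User interface layer: a plain-text page built only from API calls."""
    posts = api_posts_in_unit(unit)
    lines = [f"{unit}: {len(posts)} post(s)"]
    lines += [f"  {post['id']} ({post['grade']})" for post in posts]
    return "\n".join(lines)


if __name__ == "__main__":
    print(ui_unit_page("Finance"))
    print(api_unit_summary())
```

Someone who dislikes `ui_unit_page` can write their own against the API functions, and someone who finds the API too narrow can drop down to `query` or to the raw rows.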
legislation.gov.uk: lists as Atom feeds
legislation.gov.uk: content as XML
legislation.gov.uk: layer other views
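
As a rough sketch of what "lists as Atom feeds" and "content as XML" make possible for a reuser, the Python below reads an Atom feed and collects, for each entry, the links to its alternate representations. The feed text is made up for illustration and the `example.org` URLs are placeholders; it is not real legislation.gov.uk output, and the real feeds may be structured differently. Only the Atom namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, mapped to a prefix for ElementTree lookups.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up feed for illustration only; titles and URLs are placeholders.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example list of documents</title>
  <entry>
    <title>Example Act 2010</title>
    <id>http://example.org/id/act/2010/1</id>
    <link rel="alternate" type="application/xml"
          href="http://example.org/act/2010/1/data.xml"/>
    <link rel="alternate" type="text/html"
          href="http://example.org/act/2010/1"/>
  </entry>
  <entry>
    <title>Example Order 2011</title>
    <id>http://example.org/id/order/2011/7</id>
    <link rel="alternate" type="application/xml"
          href="http://example.org/order/2011/7/data.xml"/>
  </entry>
</feed>"""


def list_entries(feed_xml):
    """Return one dict per Atom entry: its title, id and links keyed by media type."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        links = {
            link.get("type"): link.get("href")
            for link in entry.findall("atom:link", ATOM_NS)
        }
        entries.append({
            "title": entry.findtext("atom:title", namespaces=ATOM_NS),
            "id": entry.findtext("atom:id", namespaces=ATOM_NS),
            "links": links,
        })
    return entries


if __name__ == "__main__":
    for item in list_entries(SAMPLE_FEED):
        print(item["title"], "->", item["links"].get("application/xml"))
```

Because the list itself is data, other views can be layered over it without scraping the HTML pages.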
organograms: navigable visualisation
organograms: JSON data
organograms: RDF / XML / HTML
organograms: SPARQL query
organograms: raw data
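
One way to picture the same organogram record being offered as JSON, XML and HTML is a small table of serialisers over a single underlying record. The sketch below assumes a hypothetical `POST` record and hypothetical element names. It leaves out RDF and SPARQL, which would normally need a third-party library, and it does not reflect how the actual organogram service is implemented.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical record; the field names are assumptions for illustration.
POST = {"id": "p1", "title": "Director of Finance", "unit": "Finance"}


def to_json(post):
    """JSON view, with sorted keys so the output is stable."""
    return json.dumps(post, sort_keys=True)


def to_xml(post):
    """XML view: the id becomes an attribute, other fields become child elements."""
    root = ET.Element("post", id=post["id"])
    for key in ("title", "unit"):
        ET.SubElement(root, key).text = post[key]
    return ET.tostring(root, encoding="unicode")


def to_html(post):
    """HTML view for people, escaped so data cannot inject markup."""
    return f"<h1>{escape(post['title'])}</h1><p>{escape(post['unit'])}</p>"


SERIALISERS = {"json": to_json, "xml": to_xml, "html": to_html}


def represent(post, fmt):
    """Return the requested representation of one record."""
    try:
        serialise = SERIALISERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return serialise(post)


if __name__ == "__main__":
    for fmt in SERIALISERS:
        print(fmt, "->", represent(POST, fmt))
```

Every view is derived from the same record, so the visualisation and the data formats cannot drift apart.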
Key Techniques
• resource-driven design (good URIs)
• every page built based on API calls
• explicit links to API access
 • for bonus points, link to your transformation code

• consistent terminology
 • clear mapping from UI to API

• caching & access control at each level
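
Several of these techniques fit in one small sketch: predictable URIs, a page built from an API call, an explicit link from the page to its data, terminology shared between UI and API, and a cache at the API level. Everything named here is an assumption for illustration (`BASE`, the URI pattern, the `_POSTS` store and the function names). Access control is omitted.

```python
from functools import lru_cache

BASE = "http://example.org"  # placeholder host, not a real service

# Hypothetical backing store standing in for the lower layers.
_POSTS = {
    "p1": {"title": "Director of Finance", "unit": "Finance"},
}


def resource_uri(kind, ident, fmt=None):
    """Predictable URI: the data view is the page URI plus a format suffix."""
    uri = f"{BASE}/{kind}/{ident}"
    return f"{uri}.{fmt}" if fmt else uri


@lru_cache(maxsize=128)
def api_get_post(ident):
    """API call with a simple in-process cache; callers must not mutate the result."""
    return _POSTS[ident]


def render_post_page(ident):
    """Page built from the API call, with an explicit link to the API data."""
    post = api_get_post(ident)
    return "\n".join([
        post["title"],
        f"Unit: {post['unit']}",  # UI label matches the API key "unit"
        f"Data for this page: {resource_uri('post', ident, 'json')}",
    ])


if __name__ == "__main__":
    print(render_post_page("p1"))
    print(api_get_post.cache_info())
```

Using the same word for a concept in the page and in the API ("unit" here) is what lets a reuser move from what they see on screen to the field they need in the data.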
Benefits
• fork at any point
 • don't like the visualisation / API? create your own!

• everyone is human
 • reusers gain understanding from user interface

• visualisation benefits the stack
 • API oriented towards achieving a goal
 • visual validation of data improves quality
Questions?
