Tabular Data on the Web
Intro to W3C CSV on the Web Specifications
Gregg Kellogg
gregg@greggkellogg.net
@gkellogg
Impact of Tabular Data
• Tabular data makes up a large share of the
data published on the Web
• According to the Open Data Institute, the vast
majority of published open data is tabular
• “Over 90% of the data on data.gov.uk is
tabular data.”
• data.gov lists 158,631 datasets; largely in CSV
Sources of Tabular Data
• Easiest way to publish data
• Spreadsheet Dumps
• Database Dumps
• SPARQL results
CSV data is dumb
• It’s a simple text format; the data has no
inherent meaning.
• Cells may be data-typed or have a regular
format: what does “08/09/2015” mean?
• Cells may be related to data in other tables/
columns: Foreign Keys
• Cells may be associated with different entities:
Join results
Web CSV
• 5-star Linked Data
• CSV URLs
• CSVs link to other CSVs
• CSVs link to other
Resources
• RDF and JSON
conversion
W3C CSV on the Web
• Working Group chartered to provide higher interoperability for
applications working with CSV or similar formats.
• Use Cases: http://www.w3.org/TR/csvw-ucr/
• Model for Tabular Data and Metadata on the Web: http://www.w3.org/TR/tabular-data-model/
• Metadata Vocabulary for Tabular Data: http://www.w3.org/TR/tabular-metadata/
• Generating JSON from Tabular Data on the Web: http://www.w3.org/TR/csv2json/
• Generating RDF from Tabular Data on the Web: http://www.w3.org/TR/csv2rdf/
Examples
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
countries.csv

countryRef  year  population
AF          1960  9,616,353
AF          1961  9,799,379
AF          1962  9,989,846
country_slice.csv
Model for Tabular Data
• Table Group: id, notes, tables, other annotations
• Table: id, url, columns, rows, table direction, foreign keys, transformations, notes, other annotations
• Column: about URL, cells, datatype, default, lang, name, number, ordered, property URL, required, separator, table, text direction, titles, value URL, virtual
• Row: cells, number, primary key, titles, referenced rows, source number, table
• Cell: about URL, column, errors, ordered, property URL, row, string value, table, text direction, value, value URL
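As a rough illustration of how these annotations relate, the sketch below models a small subset of the tabular data model as Python dataclasses. The class names, field names, and the `build_table` helper are illustrative assumptions that mirror the annotation names on the slide; they are not an API from any CSVW library.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Cell:
    string_value: str                      # raw text from the CSV file
    value: Any = None                      # parsed value after datatype handling
    errors: List[str] = field(default_factory=list)


@dataclass
class Column:
    number: int                            # 1-based position in the table
    name: Optional[str] = None
    titles: List[str] = field(default_factory=list)
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Row:
    number: int                            # 1-based position among data rows
    source_number: int                     # 1-based position in the source file
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Table:
    url: str
    columns: List[Column] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)


def build_table(url, header, records):
    """Build a Table in which each cell is shared by its row and its column."""
    table = Table(url=url)
    table.columns = [
        Column(number=i + 1, name=title, titles=[title])
        for i, title in enumerate(header)
    ]
    for i, record in enumerate(records):
        # Assumes exactly one header row, so data row 1 is source row 2.
        row = Row(number=i + 1, source_number=i + 2)
        for column, text in zip(table.columns, record):
            cell = Cell(string_value=text, value=text)
            row.cells.append(cell)
            column.cells.append(cell)
        table.rows.append(row)
    return table


table = build_table(
    "http://example.org/countries.csv",
    ["countryCode", "latitude", "longitude", "name"],
    [["AD", "42.5", "1.6", "Andorra"],
     ["AE", "23.4", "53.8", "United Arab Emirates"]],
)
```

Sharing one `Cell` object between its row and its column reflects that a cell sits at the intersection of both in the model.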
Mapping CSV to Model
• Parse CSV: RFC4180 + dialect metadata.
• delimiter, doubleQuote, headerRowCount,
lineTerminators, quoteChar, …
• Dialect Description comes from Metadata Document.
• Match Headers to Columns.
• Parse Cells using Column metadata/datatype.
• Abstract data model used for viewing, validation, and
conversions.
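The steps above can be sketched with Python's standard `csv` module. This is a minimal sketch, not a conforming CSVW parser: the `dialect` dictionary reuses property names from the slide, while the `columns` list (a name plus a parsing function per column), the semicolon-delimited sample, and the `parse_table` helper are assumptions made for illustration.

```python
import csv
import io

# Hypothetical dialect description, using property names from the slide.
dialect = {"delimiter": ";", "quoteChar": '"', "doubleQuote": True, "headerRowCount": 1}

# Hypothetical column metadata: a column name plus a parser for its datatype.
columns = [("countryCode", str), ("latitude", float), ("longitude", float), ("name", str)]

SAMPLE = (
    "countryCode;latitude;longitude;name\n"
    "AD;42.5;1.6;Andorra\n"
    'AE;23.4;53.8;"United Arab Emirates"\n'
)


def parse_table(text, dialect, columns):
    # Step 1: parse the CSV using the dialect description.
    reader = csv.reader(
        io.StringIO(text),
        delimiter=dialect["delimiter"],
        quotechar=dialect["quoteChar"],
        doublequote=dialect["doubleQuote"],
    )
    lines = list(reader)
    header_rows = lines[: dialect["headerRowCount"]]
    data_rows = lines[dialect["headerRowCount"]:]

    # Step 2: match headers to columns; titles in the file must agree with the metadata.
    if header_rows and header_rows[0] != [name for name, _ in columns]:
        raise ValueError("CSV header does not match column metadata")

    # Step 3: parse each cell using the datatype of its column.
    return [
        {name: parse(cell) for (name, parse), cell in zip(columns, row)}
        for row in data_rows
    ]


rows = parse_table(SAMPLE, dialect, columns)
```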
Metadata
• Finding Metadata from a CSV
• User-specified, Link Header, well-known
locations
• Matching Metadata to a CSV
• CSV must be compatible with metadata (titles/
names)
• Metadata must reference CSV URL
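A sketch of the lookup order might look like the following. The function name `candidate_metadata_urls` is hypothetical, the Link header parsing is deliberately simplified, and the two default file-name patterns (`<csv-url>-metadata.json` and `csv-metadata.json` next to the CSV) are the defaults as I understand them from the published specification; treat them as an assumption relative to the draft described in these slides, where the well-known mechanism was still marked at risk.

```python
import re
from urllib.parse import urljoin


def candidate_metadata_urls(csv_url, user_metadata=None, link_header=None):
    """Return metadata locations to try, in priority order."""
    candidates = []

    # 1. Metadata supplied directly by the user takes priority.
    if user_metadata:
        candidates.append(user_metadata)

    # 2. Link header entries of the form <target>; rel="describedby".
    if link_header:
        for target, params in re.findall(r"<([^>]*)>([^,]*)", link_header):
            if re.search(r'rel\s*=\s*"?describedby"?', params):
                candidates.append(urljoin(csv_url, target))

    # 3. Default locations relative to the CSV file (assumed patterns).
    candidates.append(csv_url + "-metadata.json")
    candidates.append(urljoin(csv_url, "csv-metadata.json"))
    return candidates


urls = candidate_metadata_urls(
    "http://example.org/data/countries.csv",
    link_header='<meta.json>; rel="describedby"; type="application/csvm+json"',
)
```

Whichever document is found still has to pass the matching step: it must reference the CSV URL and be compatible with the file's column titles.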
[Diagram: Metadata Vocabulary for Tabular Data, properties grouped by description object]
• Legend: array property, link property, URI template property, column reference property, object property, natural language property, atomic property; reference to an array of values of a specific category; reference to a value of a specific category
• Top-Level Properties: @context (@language, @base)
• Table Group: @context, @id, @type, tables, transformations, tableDirection, tableSchema, dialect, notes
• Table: @context, @id, @type, url, transformations, tableDirection, tableSchema, dialect, notes, suppressOutput
• Schema: @id, @type, columns, foreignKeys, primaryKey, rowTitles
• Column Description: @id, @type, name, titles, required, suppressOutput, virtual
• Inherited Properties: aboutUrl, propertyUrl, valueUrl, datatype, default, lang, null, ordered, required, separator, textDirection
• Foreign Key Definition: columnReference, reference
• Foreign Key Reference: resource, schemaReference, columnReference
• Transformation Definition: @id, @type, url, targetFormat, scriptFormat, titles, source
• Datatype Description: @id, @type, base, format, length, minLength, maxLength, minimum, maximum, minInclusive, maxInclusive, minExclusive, maxExclusive
• Number Format: decimalChar, groupChar, pattern
• Dialect Description: @id, commentPrefix, delimiter, doubleQuote, encoding, header, headerRowCount, lineTerminators, quoteChar, skipBlankRows, skipColumns, skipInitialSpace, skipRows, trim
Schema
• Column Descriptions
• Names/Titles
• Datatype
• Primary Keys
• Foreign Key Relationships
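To make these pieces concrete, here is a sketch of a metadata document for the two example files, built as a Python dictionary and serialized to JSON. The property names come from the vocabulary in these slides; the specific datatype choices (`number`, `gYear`, and an integer with a `groupChar` number format for the comma-grouped population values) are my assumptions about a reasonable description, not something the slides state.

```python
import json

metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "tables": [
        {
            "url": "countries.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryCode", "titles": "countryCode", "datatype": "string"},
                    {"name": "latitude", "titles": "latitude", "datatype": "number"},
                    {"name": "longitude", "titles": "longitude", "datatype": "number"},
                    {"name": "name", "titles": "name", "datatype": "string"},
                ],
                # Each country code identifies exactly one row.
                "primaryKey": "countryCode",
            },
        },
        {
            "url": "country_slice.csv",
            "tableSchema": {
                "columns": [
                    {"name": "countryRef", "titles": "countryRef", "datatype": "string"},
                    {"name": "year", "titles": "year", "datatype": "gYear"},
                    {
                        "name": "population",
                        "titles": "population",
                        # Assumed number format for values such as 9,616,353.
                        "datatype": {"base": "integer", "format": {"groupChar": ","}},
                    },
                ],
                # countryRef must match a countryCode in countries.csv.
                "foreignKeys": [
                    {
                        "columnReference": "countryRef",
                        "reference": {
                            "resource": "countries.csv",
                            "columnReference": "countryCode",
                        },
                    }
                ],
            },
        },
    ],
}

document = json.dumps(metadata, indent=2)
```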
Embedded Metadata
• Generally Column Titles.
• Formats may define CSV conventions for
embedded metadata.
• Principally used to determine metadata
compatibility.
• Also serves as default metadata if no metadata
file is located.
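A minimal sketch of both uses, assuming a single header row: `embedded_metadata` derives a default table description from the column titles, and `is_compatible` applies a simplified compatibility check (same number of columns, and each column's embedded title matches a name or title in the provided metadata). Both helper names are hypothetical, and the check is looser than the rules in the specification.

```python
import csv
import io


def embedded_metadata(csv_text, url):
    """Derive a minimal table description from the header row of a CSV."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return {
        "url": url,
        "tableSchema": {"columns": [{"titles": [title]} for title in header]},
    }


def titles_of(column):
    """Return a column's titles as a set, accepting a string or a list."""
    titles = column.get("titles", [])
    return {titles} if isinstance(titles, str) else set(titles)


def is_compatible(embedded, provided):
    """Simplified check that provided metadata fits the embedded column titles."""
    emb_cols = embedded["tableSchema"]["columns"]
    prov_cols = provided["tableSchema"]["columns"]
    if len(emb_cols) != len(prov_cols):
        return False
    return all(
        titles_of(e) & (titles_of(p) | {p.get("name")})
        for e, p in zip(emb_cols, prov_cols)
    )


embedded = embedded_metadata("countryCode,latitude\nAD,42.5\n", "countries.csv")
provided = {"tableSchema": {"columns": [{"name": "countryCode"}, {"titles": "latitude"}]}}
compatible = is_compatible(embedded, provided)
```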
Datatypes
• Basic XSD datatypes
• maximum/minimum facets
• minLength/maxLength facets
• format/pattern
• RegExp, Boolean, UAX35 date/time picture
string, UAX35 number picture string
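The following sketch shows how a datatype description could drive cell parsing and validation. It covers only a few base types and facets, and it makes one loud simplification: date formats are given as Python `strptime` patterns standing in for the date/time picture strings named on the slide. The `check_cell` function is hypothetical.

```python
import re
from datetime import datetime


def check_cell(text, datatype):
    """Return (value, errors) for one cell against a simplified datatype description."""
    base = datatype.get("base", "string")
    errors = []
    value = text
    try:
        if base == "integer":
            value = int(text)
        elif base == "number":
            value = float(text)
        elif base == "date":
            # Assumption: 'format' is a strptime pattern, not a picture string.
            value = datetime.strptime(text, datatype.get("format", "%Y-%m-%d")).date()
    except ValueError:
        return text, ["not a valid " + base]

    # For strings, 'format' is treated as a regular expression.
    if base == "string" and "format" in datatype:
        if re.fullmatch(datatype["format"], text) is None:
            errors.append("does not match format")
    if "minLength" in datatype and len(text) < datatype["minLength"]:
        errors.append("shorter than minLength")
    if "maxLength" in datatype and len(text) > datatype["maxLength"]:
        errors.append("longer than maxLength")
    if "minimum" in datatype and value < datatype["minimum"]:
        errors.append("below minimum")
    if "maximum" in datatype and value > datatype["maximum"]:
        errors.append("above maximum")
    return value, errors


# The same string reads as 8 September or 9 August depending on the declared format.
day_first, _ = check_cell("08/09/2015", {"base": "date", "format": "%d/%m/%Y"})
month_first, _ = check_cell("08/09/2015", {"base": "date", "format": "%m/%d/%Y"})
```

Declaring the format in metadata is what resolves the "08/09/2015" ambiguity raised earlier in the deck.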
Other Features
• Split cells into multiple items
• Validate Primary Keys and Foreign Key
references (single and multiple columns)
• Define URL properties for columns
• Multiple subjects per column (may be URLs)
• Values as URLs
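Two of these features can be sketched briefly: splitting a cell on a separator, and building a subject URL from a template. The `languages` column, the vocabulary URL, and both helper functions are hypothetical, and `expand` substitutes plain `{name}` variables only; it does not implement full URI Template expansion or percent-encoding.

```python
import re


def split_cell(text, separator):
    """Split one cell into a list of items using the column's separator."""
    return [item.strip() for item in text.split(separator)] if text else []


def expand(template, row):
    """Expand simple {name} variables; full URI Template syntax is not handled."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)


# Hypothetical row with a multi-valued 'languages' cell.
row = {"countryCode": "AD", "name": "Andorra", "languages": "ca;es;fr"}

about_url = expand("http://example.org/countries/{countryCode}", row)
property_url = "http://example.org/vocab#language"  # hypothetical vocabulary term

# One (subject, property, value) statement per item in the split cell.
statements = [
    (about_url, property_url, lang)
    for lang in split_cell(row["languages"], ";")
]
```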
Conversions: JSON
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
countries.csv
{
  "tables": [{
    "url": "http://example.org/countries.csv",
    "row": [{
      "url": "http://example.org/countries.csv#row=2",
      "rownum": 1,
      "describes": [{
        "countryCode": "AD",
        "latitude": "42.5",
        "longitude": "1.6",
        "name": "Andorra"
      }]
    }, {
      "url": "http://example.org/countries.csv#row=3",
      "rownum": 2,
      "describes": [{
        "countryCode": "AE",
        "latitude": "23.4",
        "longitude": "53.8",
        "name": "United Arab Emirates"
      }]
    }, {
      "url": "http://example.org/countries.csv#row=4",
      "rownum": 3,
      "describes": [{
        "countryCode": "AF",
        "latitude": "33.9",
        "longitude": "67.7",
        "name": "Afghanistan"
      }]
    }]
  }]
}
countries.json
countries-standard.json
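For a CSV with no extra metadata, the shape of this standard-mode output can be reproduced with a few lines of Python. This is a sketch of the structure shown on the slide, not an implementation of the conversion specification: the `to_standard_json` helper is hypothetical, it assumes one header row and comma-delimited input, and every value stays a string.

```python
import csv
import io
import json

CSV_TEXT = """countryCode,latitude,longitude,name
AD,42.5,1.6,Andorra
AE,23.4,53.8,United Arab Emirates
AF,33.9,67.7,Afghanistan
"""


def to_standard_json(csv_text, url):
    """Wrap each data row with its source URL and row number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for rownum, record in enumerate(reader, start=1):
        rows.append({
            # The fragment counts source rows, and the header is row 1.
            "url": f"{url}#row={rownum + 1}",
            "rownum": rownum,
            "describes": [dict(record)],
        })
    return {"tables": [{"url": url, "row": rows}]}


result = to_standard_json(CSV_TEXT, "http://example.org/countries.csv")
serialized = json.dumps(result, indent=2)
```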
Conversions: JSON (min)
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
[{
  "countryCode": "AD",
  "latitude": "42.5",
  "longitude": "1.6",
  "name": "Andorra"
}, {
  "countryCode": "AE",
  "latitude": "23.4",
  "longitude": "53.8",
  "name": "United Arab Emirates"
}, {
  "countryCode": "AF",
  "latitude": "33.9",
  "longitude": "67.7",
  "name": "Afghanistan"
}]
countries.csv
countries.json
countries-minimal.json
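Minimal mode drops the table and row wrappers and keeps only the objects each row describes. For a plain CSV without metadata, that is roughly what `csv.DictReader` yields, as this sketch shows; the `to_minimal_json` helper is hypothetical and leaves all values as strings.

```python
import csv
import io
import json

CSV_TEXT = """countryCode,latitude,longitude,name
AD,42.5,1.6,Andorra
AE,23.4,53.8,United Arab Emirates
AF,33.9,67.7,Afghanistan
"""


def to_minimal_json(csv_text):
    """Return one plain object per data row, keyed by column title."""
    return [dict(record) for record in csv.DictReader(io.StringIO(csv_text))]


minimal = to_minimal_json(CSV_TEXT)
serialized = json.dumps(minimal, indent=2)
```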
Conversions: RDF
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
@base <http://example.org/countries.csv> .
@prefix csvw: <http://www.w3.org/ns/csvw#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

_:tg a csvw:TableGroup ;
  csvw:table [ a csvw:Table ;
    csvw:url <http://example.org/countries.csv> ;
    csvw:row [ a csvw:Row ;
      csvw:rownum "1"^^xsd:integer ;
      csvw:url <#row=2> ;
      csvw:describes _:t1r1
    ], [ a csvw:Row ;
      csvw:rownum "2"^^xsd:integer ;
      csvw:url <#row=3> ;
      csvw:describes _:t1r2
    ], [ a csvw:Row ;
      csvw:rownum "3"^^xsd:integer ;
      csvw:url <#row=4> ;
      csvw:describes _:t1r3
    ]
  ] .

_:t1r1
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:t1r2
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:t1r3
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .
countries.csv
countries.json
countries-standard.ttl
Conversions: RDF (min)
countryCode  latitude  longitude  name
AD           42.5      1.6        Andorra
AE           23.4      53.8       United Arab Emirates
AF           33.9      67.7       Afghanistan
@base <http://example.org/countries.csv> .

_:t1r1
  <#countryCode> "AD" ;
  <#latitude> "42.5" ;
  <#longitude> "1.6" ;
  <#name> "Andorra" .

_:t1r2
  <#countryCode> "AE" ;
  <#latitude> "23.4" ;
  <#longitude> "53.8" ;
  <#name> "United Arab Emirates" .

_:t1r3
  <#countryCode> "AF" ;
  <#latitude> "33.9" ;
  <#longitude> "67.7" ;
  <#name> "Afghanistan" .
countries.csv
countries.json
countries-minimal.ttl
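The minimal RDF output follows a simple pattern: one blank node per row, one predicate per column (the column name as a fragment on the CSV URL), and the cell text as a plain literal. The sketch below generates Turtle in that shape with string formatting. The helper names are hypothetical, column names are assumed to be safe to use as URL fragments, and only backslashes and double quotes are escaped in literals, so cells containing line breaks are not handled.

```python
import csv
import io

CSV_TEXT = """countryCode,latitude,longitude,name
AD,42.5,1.6,Andorra
AE,23.4,53.8,United Arab Emirates
AF,33.9,67.7,Afghanistan
"""


def turtle_literal(text):
    """Quote a string as a Turtle literal (escapes backslash and quote only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_minimal_turtle(csv_text, url):
    """Emit one blank-node subject per row, with one statement per cell."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = [f"@base <{url}> ."]
    for rownum, record in enumerate(reader, start=1):
        pairs = [
            f"  <#{name}> {turtle_literal(value)}"
            for name, value in record.items()
        ]
        lines.append(f"_:t1r{rownum}")
        lines.append(" ;\n".join(pairs) + " .")
    return "\n".join(lines)


turtle = to_minimal_turtle(CSV_TEXT, "http://example.org/countries.csv")
```

A production converter would normally build the graph with an RDF library and let it handle serialization; plain string formatting is used here only to keep the sketch dependency-free.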
Other examples
• Rich Annotations: JSON RDF
• Virtual Columns/Multiple Subjects: JSON RDF
• For more see Specifications and Test Suite
Tools
• CSVLint
• CKAN – open source data portal platform
• Socrata – cloud-based open data
• Google Fusion Tables – data visualization
• Ruby rdf-tabular – CSVW reference implementation
• RDF Distiller
• Structured Data Linter
Next Steps
• At-Risk – /.well-known/csvm
• More datatype formats
• Metadata in HTML (embedded JSON-LD)
• Tabular Data in HTML
• More implementations!
• Timeline
• Candidate Recommendation – July 2015
• Proposed Recommendation – Oct 2015
• W3C Recommendation – Dec 2015
More Information
GitHub
w3c
Gregg Kellogg
@gkellogg
gregg@greggkellogg.net
http://greggkellogg.net/
http://www.slideshare.net/gkellogg1/tabular-data-on-the-web
distiller
linter
