LSD Dimensions: Use and Reuse of Linked Statistical Data as RDF Data Cube

Governments, public agencies and institutions, and companies produce a great amount of statistical data every year. Much of this data is released as Open Data and published on the Web, although usually as documents rather than as Linked Data. In this talk I'll introduce RDF Data Cube (QB), a W3C standard for publishing multidimensional data, such as statistics, on the Web in such a way that it can be linked to other datasets and concepts. However, QB leaves open how users should model dimensions and codes (QB jargon for variables and values), which hampers the reuse of existing ones. To address this, I'll show you LSD Dimensions, a web-based application that monitors the usage of dimensions and codes in over five hundred public SPARQL endpoints.

LSD Dimensions
Use and Reuse of Linked
Statistical Data as RDF Data Cube
Albert Meroño-Peñuela
@albertmeronyo
WAI meeting 06-10-2014

Statistics!

Data integration – 220 years ago

Data integration – nowadays

Towards 5-star Linked Statistical Data
DFT
Eurostat TSV

RDF Data Cube
• 4-star LSD: use URIs to denote (statistical) things
• 5-star LSD: link your own (statistical) things to other (statistical) things
“There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that they can be linked to related data sets and concepts.”

RDF Data Cube vocabulary (QB)
• SDMX compatible
• Defines cubes as a set of observations that consist of dimensions, measures and attributes
• Dimensions: time period, region, sex (qb:DimensionProperty)
• Measure: population life expectancy (qb:MeasureProperty)
• Attribute: unit of measure = years, metadata status = measured (qb:AttributeProperty)
Observation: “the measured life expectancy of males in Newport in the period 2004-2006 is 76.7 years”
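
The observation above can be written down as a handful of subject-predicate-object statements. The sketch below is illustrative only: the `eg:` namespace, the observation identifier `o1`, and the property and value names (`eg:refPeriod`, `eg:refArea`, `eg:sex`, `eg:lifeExpectancy`, `eg:unitMeasure`, `eg:status`, and so on) are assumptions invented for this example. Only the `qb:` terms come from the QB vocabulary, and the output is a simplified Turtle rendering rather than a validated, complete data cube.

```python
# Namespaces: qb: is the real QB namespace; eg: is hypothetical.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "eg": "http://example.org/ns#",  # assumption, for illustration only
}


def observation_triples(obs_id, dimensions, measure, attributes):
    """Return (subject, predicate, object) triples for one qb:Observation."""
    subject = f"eg:{obs_id}"
    triples = [(subject, "a", "qb:Observation")]
    for prop, value in {**dimensions, **measure, **attributes}.items():
        triples.append((subject, prop, value))
    return triples


def component_declarations(dimensions, measure, attributes):
    """Declare every property used above with its QB component type."""
    kinds = [
        (dimensions, "qb:DimensionProperty"),
        (measure, "qb:MeasureProperty"),
        (attributes, "qb:AttributeProperty"),
    ]
    return [(prop, "a", kind) for props, kind in kinds for prop in props]


def to_turtle(triples):
    """Render the triples as simple Turtle, one statement per line."""
    header = [f"@prefix {p}: <{uri}> ." for p, uri in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + body)


# The life-expectancy observation from the slide (all eg: names are made up).
dimensions = {
    "eg:refPeriod": '"2004-2006"',
    "eg:refArea": "eg:Newport",
    "eg:sex": "eg:male",
}
measure = {"eg:lifeExpectancy": "76.7"}
attributes = {"eg:unitMeasure": "eg:years", "eg:status": "eg:measured"}

triples = observation_triples("o1", dimensions, measure, attributes)
declarations = component_declarations(dimensions, measure, attributes)
print(to_turtle(declarations + triples))
```

Keeping dimensions, measures and attributes in separate mappings mirrors the three component types that QB distinguishes, which is why the declarations can be derived mechanically from them.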

5-star LSD: 270a.info
Sarven Capadisli, Sören Auer, Reinhard Riedl. “Linked Statistical Data Analysis”. 1st Int.
Workshop on Semantic Statistics (SemStats) ISWC 2013.

Are we done?
• P1: Comparability? Can we arbitrarily combine any pair of these datasets/dimensions?
• P2: Reusability? How often are dimensions reused? Can we reuse dimensions created by others?
• P3: Discoverability? How to discover dimensions created by others?
• P4: Relevance? What’s the size of LSD?

P1: Comparability of LSD: SSCLSDA
Sarven Capadisli, Albert Meroño-Peñuela, Sören Auer, Reinhard Riedl. “Semantic Similarity
and Correlation of Linked Statistical Data Analysis”. 2nd Int. Workshop on Semantic Statistics
(SemStats) ISWC 2014.

P2+P3+P4: LSD Dimensions
We need an intelligent system that helps us (1) discover, (2) reuse and (3) analyze dimensions in LSD
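
As a rough illustration of the discovery step, a monitor could ask each SPARQL endpoint which resources are declared as `qb:DimensionProperty`. The sketch below only builds the query and a GET request URL with the `query` parameter defined by the SPARQL 1.1 Protocol; it doesn't contact any server. The query text, the function name `build_request_url` and the endpoint URL `http://example.org/sparql` are assumptions made for this example and aren't taken from the LSD Dimensions code base.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical discovery query: list declared dimensions and their labels.
DIMENSIONS_QUERY = """
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?dimension ?label
WHERE {
  ?dimension a qb:DimensionProperty .
  OPTIONAL { ?dimension rdfs:label ?label }
}
""".strip()


def build_request_url(endpoint, query=DIMENSIONS_QUERY):
    """Build a SPARQL-protocol GET URL for the given endpoint and query."""
    # Append with "&" if the endpoint URL already carries a query string.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})


# Placeholder endpoint; a real monitor would loop over a list of endpoints.
url = build_request_url("http://example.org/sparql")
print(url)
```

Running the same query against every endpoint and storing the results per endpoint would give the raw material for both discovery and reuse analysis.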

Are we done?
• P1: Comparability? Can we arbitrarily combine any pair of these datasets/dimensions? Unclear.
• P2: Reusability? How often are dimensions reused? Logarithmic law. Can we reuse dimensions created by others? Probably yes.
• P3: Discoverability? How to discover dimensions created by others? With LSD Dimensions.
• P4: Relevance? What’s the size of LSD? ~8.5% of the LOD cloud.
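
The reuse question (P2) comes down to counting how many endpoints use each dimension. The sketch below shows one way to do that count. The endpoint names and the `crawl` mapping are made-up toy data, not results from LSD Dimensions, and `dimension_reuse` is a hypothetical helper; only the three SDMX dimension URIs are real identifiers.

```python
from collections import Counter

SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"

# Toy crawl result (invented): endpoint -> dimensions found there.
crawl = {
    "http://example.org/a/sparql": [SDMX_DIM + "refPeriod", SDMX_DIM + "refArea"],
    "http://example.org/b/sparql": [SDMX_DIM + "refPeriod", SDMX_DIM + "refArea",
                                    SDMX_DIM + "sex"],
    "http://example.org/c/sparql": [SDMX_DIM + "refPeriod"],
}


def dimension_reuse(crawl):
    """Rank dimensions by the number of endpoints that use them."""
    counts = Counter()
    for dimensions in crawl.values():
        # A set ensures each endpoint counts at most once per dimension.
        counts.update(set(dimensions))
    return counts.most_common()


ranking = dimension_reuse(crawl)
for dimension, endpoints in ranking:
    print(endpoints, dimension)
```

Plotting such a ranking for a real crawl is one way to check whether reuse follows the kind of skewed distribution reported on the slide.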

Future Work
• Monitor additional metadata (rdfs:subPropertyOf, rdfs:range)
• Generate PROV during crawling
• Modeling of formulas in RDF Data Cube
• Plug into LOD Laundromat
• Crawl dimensions and codes from qb:Observation
• SPARQL endpoint and API
– Suggest dimensions and codes to users

Thank you
Questions, suggestions, comments most
welcome
@albertmeronyo
http://lsd-dimensions.org/
https://github.com/albertmeronyo/LSD-Dimensions
https://github.com/csarven/sense-of-lsd-analysis
http://www.cedar-project.nl
