SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Downloaden Sie, um offline zu lesen
Converting WHO’s Global
Health Observatory Data to
          RDF

   Amrapali Zaveri, PhD student

         August	
  27,	
  2012


                                  1
Outline
•   Background
•   What is the RDF Data Cube Vocabulary?
•   Semi-automated approach
•   OntoWiki's CSVImport plug-in
•   RDFized GHO data
•   Limitations and Future Work




                                            2
Background
• Biomedical statistical data
     • Published as Excel sheets
• Advantage
     • Readable by humans
• Disadvantages
     • Cannot be queried efficiently
     • Difficult to integrate with other data (in different formats)
• Our approach
     • Converting data into a single data model - RDF
     • Using the RDF Data Cube Vocabulary*
        • designed particularly to represent multidimensional
          statistical data using RDF.

 *http://www.w3.org/TR/vocab-data-cube/


                                                                  3
What is the RDF Data Cube
Vocabulary?




                            4
What is the RDF Data Cube
Vocabulary?

 • Dimensions
 • Attributes
 • Measures
 • Observations




                            5
Semi-automated approach
• Transforming CSV to RDF in a fully automated way is not
  feasible.
        • Dimensions may often be encoded in heading or
           label of a sheet
• Our semi-automatic approach:
        • As a plug-in in OntoWiki#
            • a semantic collaboration platform developed by
              the AKSW research group.
        • A CSV file is converted into RDF using the RDF
           Data Cube Vocabulary: http://aksw.org/Projects/
           Stats2RDF



 # Sören Auer, Sebastian Tramp (geb. Dietzold), Jens Lehmann, and Thomas Riechert:
 OntoWiki: A Tool for Social Semantic Collaboration In: Proceedings of the Workshop on
 Social and Collaborative Construction of Structured Knowledge CKC 2007 at the 16th
 International World Wide Web Conference WWW2007 Banff, Canada, May 8, 2007              6
1. Create Knowledge Base




                           7
2. Import a CSV file




                       8
3. Define dimensions




                       9
4. Define data range




                       10
5. Save template, extract triples




                                    11
6. Re-use template for similar files




                                       12
7. View resources




                    13
RDFized GHO data
gho:Country rdfs:subClassOf qb:DimensionProperty;
          rdf:type rdfs:Class;
          rdfs:label "Country" .

gho:Disease rdfs:subClassOf qb:DimensionProperty;
          rdf:type rdfs:Class;
          rdfs:label "Disease" .

gho: Afghanistan rdf:type ex:Country;
             rdfs:label "Afghanistan" .

gho:Tuberculosis rdf:type ex:Disease;
             rdfs:label "Tuberculosis" .


gho:c1-r6 rdf:type qb:Observation;
         rdf:value "127"^^xsd:integer;
         qb:dimension gho:Afghanistan;
         qb:dimension gho:Tuberculosis .
                                                    14
RDFized GHO data

• Available at http://gho.aksw.org
• 50 datasets
• ~ 8 million triples
• Paper published at SWJ Call for Dataset descriptions:
http://www.semantic-web-journal.net/content/publishing-and-
interlinking-global-health-observatory-dataset




                                                              15
Limitations and Future Work
 • Conversion

 • Coherence

 • Temporal Comparability

 • Exploring GHO




                              16
Thank You!

    Questions?

http://aksw.org/AmrapaliZaveri

zaveri@informatik.uni-leipzig.de



                                   17

Weitere ähnliche Inhalte

Was ist angesagt?

RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
National Institute of Informatics
 
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Alasdair Gray
 
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
ASIS&T
 
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
The HCLS Community Profile: Describing Datasets, Versions, and DistributionsThe HCLS Community Profile: Describing Datasets, Versions, and Distributions
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
Alasdair Gray
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland Project
Alasdair Gray
 
2010 06 rdf_next
2010 06 rdf_next2010 06 rdf_next
2010 06 rdf_next
Jun Zhao
 

Was ist angesagt? (20)

Discovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDDiscovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCID
 
Deriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataDeriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF Data
 
ALEC (A List of Everything Cool)
ALEC (A List of Everything Cool)ALEC (A List of Everything Cool)
ALEC (A List of Everything Cool)
 
Organising principles
Organising principlesOrganising principles
Organising principles
 
4-Managing CrossRef DOIs
4-Managing CrossRef DOIs4-Managing CrossRef DOIs
4-Managing CrossRef DOIs
 
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as KnowledgeRDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
RDF4U: RDF Graph Visualization by Interpreting Linked Data as Knowledge
 
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
Tutorial: Describing Datasets with the Health Care and Life Sciences Communit...
 
Supporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life SciencesSupporting Dataset Descriptions in the Life Sciences
Supporting Dataset Descriptions in the Life Sciences
 
Freedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations ariseFreedom for bibliographic references: OpenCitations arise
Freedom for bibliographic references: OpenCitations arise
 
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
Lightning Talk, Ransom: Making the Case for Interactive Data Transformation T...
 
Semantic Web Technology
Semantic Web TechnologySemantic Web Technology
Semantic Web Technology
 
1 bioline & t space or2013 final
1 bioline & t space or2013 final1 bioline & t space or2013 final
1 bioline & t space or2013 final
 
Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMine...
Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMine...Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMine...
Getting the best of Linked Data and Property Graphs: rdf2neo and the KnetMine...
 
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
The HCLS Community Profile: Describing Datasets, Versions, and DistributionsThe HCLS Community Profile: Describing Datasets, Versions, and Distributions
The HCLS Community Profile: Describing Datasets, Versions, and Distributions
 
Towards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and WikidataTowards a Unified PageRank for DBpedia and Wikidata
Towards a Unified PageRank for DBpedia and Wikidata
 
Rapid Digitization of Latin American Ephemera with Hydra
Rapid Digitization of Latin American Ephemera with HydraRapid Digitization of Latin American Ephemera with Hydra
Rapid Digitization of Latin American Ephemera with Hydra
 
Linked Data, Ontologies and Inference
Linked Data, Ontologies and InferenceLinked Data, Ontologies and Inference
Linked Data, Ontologies and Inference
 
An Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland ProjectAn Identifier Scheme for the Digitising Scotland Project
An Identifier Scheme for the Digitising Scotland Project
 
2010 06 rdf_next
2010 06 rdf_next2010 06 rdf_next
2010 06 rdf_next
 
Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web
 

Ähnlich wie Converting GHO to RDF

The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge stripped
Sören Auer
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
National Information Standards Organization (NISO)
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with Triples
Dr.-Ing. Thomas Hartmann
 

Ähnlich wie Converting GHO to RDF (20)

The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge stripped
 
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
WEBINAR: The Yosemite Project: An RDF Roadmap for Healthcare Information Inte...
 
Semantic Web use cases in outcomes research
Semantic Web use cases in outcomes researchSemantic Web use cases in outcomes research
Semantic Web use cases in outcomes research
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Linked Data efforts for data standards in biopharma and healthcare
Linked Data efforts for data standards in biopharma and healthcareLinked Data efforts for data standards in biopharma and healthcare
Linked Data efforts for data standards in biopharma and healthcare
 
What is New in W3C land?
What is New in W3C land?What is New in W3C land?
What is New in W3C land?
 
Linked Data
Linked DataLinked Data
Linked Data
 
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
NISO/DCMI September 25 Webinar: Implementing Linked Data in Developing Countr...
 
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
SmartData Webinar Slides: The Yosemite Project for Healthcare Information int...
 
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
 
W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2
 
LOD技術解説
LOD技術解説LOD技術解説
LOD技術解説
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2ISWC GoodRelations Tutorial Part 2
ISWC GoodRelations Tutorial Part 2
 
GoodRelations Tutorial Part 2
GoodRelations Tutorial Part 2GoodRelations Tutorial Part 2
GoodRelations Tutorial Part 2
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the CloudFirst Steps in Semantic Data Modelling and Search & Analytics in the Cloud
First Steps in Semantic Data Modelling and Search & Analytics in the Cloud
 
Timbuctoo 2 EASY
Timbuctoo 2 EASYTimbuctoo 2 EASY
Timbuctoo 2 EASY
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with Triples
 

Mehr von Amrapali Zaveri, PhD

Mehr von Amrapali Zaveri, PhD (16)

Data Quality and the FAIR principles
Data Quality and the FAIR principlesData Quality and the FAIR principles
Data Quality and the FAIR principles
 
Workshop on Data Quality Management in Wikidata
Workshop on Data Quality Management in WikidataWorkshop on Data Quality Management in Wikidata
Workshop on Data Quality Management in Wikidata
 
ESOF Panel 2018
ESOF Panel 2018ESOF Panel 2018
ESOF Panel 2018
 
CrowdED: Guideline for optimal Crowdsourcing Experimental Design
CrowdED: Guideline for optimal Crowdsourcing Experimental DesignCrowdED: Guideline for optimal Crowdsourcing Experimental Design
CrowdED: Guideline for optimal Crowdsourcing Experimental Design
 
MetaCrowd: Crowdsourcing Gene Expression Metadata Quality Assessment
MetaCrowd: Crowdsourcing Gene Expression Metadata Quality AssessmentMetaCrowd: Crowdsourcing Gene Expression Metadata Quality Assessment
MetaCrowd: Crowdsourcing Gene Expression Metadata Quality Assessment
 
smartAPI: Towards a more intelligent network of Web APIs
smartAPI: Towards a more intelligent network of Web APIssmartAPI: Towards a more intelligent network of Web APIs
smartAPI: Towards a more intelligent network of Web APIs
 
Introduction to Bio SPARQL
Introduction to Bio SPARQL Introduction to Bio SPARQL
Introduction to Bio SPARQL
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality Assessment
 
Linked Data Quality Assessment: A Survey
Linked Data Quality Assessment: A SurveyLinked Data Quality Assessment: A Survey
Linked Data Quality Assessment: A Survey
 
Amrapali Zaveri Defense
Amrapali Zaveri DefenseAmrapali Zaveri Defense
Amrapali Zaveri Defense
 
LDQ 2014 DQ Methodology
LDQ 2014 DQ MethodologyLDQ 2014 DQ Methodology
LDQ 2014 DQ Methodology
 
LOD-SEM
LOD-SEMLOD-SEM
LOD-SEM
 
TripleCheckMate
TripleCheckMateTripleCheckMate
TripleCheckMate
 
Towards Biomedical Data Integration for Analyzing the Evolution of Cognition
Towards Biomedical Data Integration for Analyzing the Evolution of CognitionTowards Biomedical Data Integration for Analyzing the Evolution of Cognition
Towards Biomedical Data Integration for Analyzing the Evolution of Cognition
 
User-driven Quality Evaluation of DBpedia
User-driven Quality Evaluation of DBpediaUser-driven Quality Evaluation of DBpedia
User-driven Quality Evaluation of DBpedia
 
ReDD-Observatory
ReDD-ObservatoryReDD-Observatory
ReDD-Observatory
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
Enterprise Knowledge
 

Kürzlich hochgeladen (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Converting GHO to RDF

  • 1. Converting WHO’s Global Health Observatory Data to RDF Amrapali Zaveri, PhD student August  27,  2012 1
  • 2. Outline • Background • What is the RDF Data Cube Vocabulary? • Semi-automated approach • OntoWiki's CSVImport plug-in • RDFized GHO data • Limitations and Future Work 2
  • 3. Background • Biomedical statistical data • Published as Excel sheets • Advantage • Readable by humans • Disadvantages • Cannot be queried efficiently • Difficult to integrate with other data (in different formats) • Our approach • Converting data into a single data model - RDF • Using the RDF Data Cube Vocabulary* • designed particularly to represent multidimensional statistical data using RDF. *http://www.w3.org/TR/vocab-data-cube/ 3
  • 4. What is the RDF Data Cube Vocabulary? 4
  • 5. What is the RDF Data Cube Vocabulary? • Dimensions • Attributes • Measures • Observations 5
  • 6. Semi-automated approach • Transforming CSV to RDF in a fully automated way is not feasible. • Dimensions may often be encoded in heading or label of a sheet • Our semi-automatic approach: • As a plug-in in OntoWiki# • a semantic collaboration platform developed by the AKSW research group. • A CSV file is converted into RDF using the RDF Data Cube Vocabulary: http://aksw.org/Projects/ Stats2RDF # Sören Auer, Sebastian Tramp (geb. Dietzold), Jens Lehmann, and Thomas Riechert: OntoWiki: A Tool for Social Semantic Collaboration In: Proceedings of the Workshop on Social and Collaborative Construction of Structured Knowledge CKC 2007 at the 16th International World Wide Web Conference WWW2007 Banff, Canada, May 8, 2007 6
  • 8. 2. Import a CSV file 8
  • 10. 4. Define data range 10
  • 11. 5. Save template, extract triples 11
  • 12. 6. Re-use template for similar files 12
  • 14. RDFized GHO data gho:Country rdfs:subClassOf qb:DimensionProperty; rdf:type rdfs:Class; rdfs:label "Country" . gho:Disease rdfs:subClassOf qb:DimensionProperty; rdf:type rdfs:Class; rdfs:label "Disease" . gho: Afghanistan rdf:type ex:Country; rdfs:label "Afghanistan" . gho:Tuberculosis rdf:type ex:Disease; rdfs:label "Tuberculosis" . gho:c1-r6 rdf:type qb:Observation; rdf:value "127"^^xsd:integer; qb:dimension gho:Afghanistan; qb:dimension gho:Tuberculosis . 14
  • 15. RDFized GHO data • Available at http://gho.aksw.org • 50 datasets • ~ 8 million triples • Paper published at SWJ Call for Dataset descriptions: http://www.semantic-web-journal.net/content/publishing-and- interlinking-global-health-observatory-dataset 15
  • 16. Limitations and Future Work • Conversion • Coherence • Temporal Comparability • Exploring GHO 16
  • 17. Thank You! Questions? http://aksw.org/AmrapaliZaveri zaveri@informatik.uni-leipzig.de 17