
I-Know presentation: CODE - Commercially empowered Linked Open Data Ecosystems in Research


Invited talk I gave at I-Know on our recently started FP7 project CODE (http://code-research.eu/)

Published in: Technology, Education


  1. Commercially empowered Linked Open Data Ecosystems in Research: towards unfolding today's and tomorrow's scientific treasures
     Michael Granitzer, University of Passau
     FP7 STREP No. 296150
  2. nani gigantum humeris insidentes: standing on the shoulders of giants
     - Research builds on the past
     - We pass on knowledge to create new knowledge
     Root of (Western) society
  3. Lying under a pile of text documents
     - with varying quality
     - with contradicting facts
     - with missing data
     - labour-intensive to compare results
     Some examples
     - "Improvements That Don't Add Up", Armstrong et al., 2009
     - "Why Most Published Research Findings Are False", Ioannidis, 2005
     Can we do better?
  4. Yes, we (think) we can...
     Make facts and figures explicit, discoverable and comparable.
     Given textually enCODED scientific knowledge, we can
     - Extract facts from research papers
     - Integrate those facts with existing knowledge
     - Make them available for (visual) analysis
     - Crowdsource
     Focus on
     - Empirical observations/facts
     - Linked Open Data
     - Computer science and biomedical domains
  5. That's nice, but how?
     - Extract & Integrate: text, Linked Data & experiments
     - Aggregate: Linked Scientific Fact Data Warehouse
     - Analyse & Organise: visual analytics & collaborative mind-mapping
     - Share & Commercialise: crowdsourcing & marketplace
     [Figures: dependency and frequency analysis graph (Machine Learning, Algorithm, CRF, SVM, Biomedical, Data Set 1); bar chart over algorithms, domains, data sets and experiments]
  6. Extract & Integrate: Approach and Challenges
     Extracting structural elements
     - Tables
     - Figures
     - Sections and sub-sections
     Extracting facts from structural elements
     - Entity extraction (e.g. algorithms, data sets, genes, significance levels)
     - Fact extraction: <Entity, Relation, Measure>
     - Table triplification
     Crowdsourcing extraction
     - Extraction quality and domain knowledge remain key issues
       - Empower users to maintain their own extraction models
       - Allow users to semantically annotate research papers (e.g. entities, facts)
     Result: semantically annotated scientific data as a LOD endpoint
  7. Extract & Integrate: Example
     [Figure labels: Numerical Facts, Dimension/Entity, In-Document Context, Ranking Facts]
  8. Extract & Integrate: Current Status
     - TeamBeam: PDF structure extraction
       - Structural elements
       - Focusing now on tables
     - Entity extraction in progress
     - First prototypes for Table2RDFDataCube
     Reference: "TeamBeam - Meta-Data Extraction from Scientific Literature" by Roman Kern (Graz University of Technology), Kris Jack and Maya Hristakeva (Mendeley Ltd.), and Michael Granitzer (University of Passau)
  9. Aggregate: Approach and Challenges
     Representation and storage
     - Representation using the RDF Data Cube Vocabulary (http://www.w3.org/TR/vocab-data-cube/#ref_qb_measureType)
       - Dimensions (e.g. algorithms, genes)
       - Measures (e.g. 0.3, 37) and attributes (e.g. %, °)
     - Challenge 1: Ensure independence of dimensions
     - Challenge 2: Decentralized querying and aggregation
     SPARQL Data Warehousing Wizard
     - Provide a simple and intuitive wizard for creating aggregation queries
       - Google-like starting point
       - Pivot table creation similar to spreadsheets
     - Store results using the RDF Data Cube Vocabulary
     Linked Scientific Fact Data Warehouse for non-IT experts
  10. Aggregate: Current Status
      Representation and storage
      - Data model implemented
      - Triplification of benchmarking data (e.g. CLEF, TPC-H)
      We are looking for data.
      SPARQL Data Warehousing Wizard
  11. Analyse: Approach and Challenges
      Visual analytics for linked scientific facts
      - RDF-based description of visualisations
        - Glue between data and single visualisations
        - Make visualisation state explicit
        - Share visualisation state
      - HTML5-based visualisations and visualisation wizard
  12. Share: Approach and Challenges
      Provenance
      - Who published data?
      - Who modified data?
      Share aggregated data sets and annotation models
      - Build on insights created by others
      - Re-use text annotation models
      Share visual analytics applications
      - Simple visualisations might be misleading
      - Sharing whole states of a visual analysis will reveal more details on certain decisions
  13. Why should YOU do it?
      Marketplace concept for research data
      - Users (= researchers) will be enabled to "sell" their analysis results (or give them away for free)
      - Several concepts to be investigated: revenue chains, roles, models (donations, paid subscriptions for data feeds, purchases etc.)
      Increased opportunities for researchers and research data
  14. extract & integrate, organise, visualise, crowdsource
      Find us, join us, ask us, help us
      - http://code-research.eu/
      - http://www.facebook.com/CODEresearchEU
      - #CODEresearchEU
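
The table triplification on slide 6 and the pivot-style aggregation on slide 9 can be illustrated with a small sketch. It is not the CODE project's implementation: the `http://example.org/code/` namespace, the function names `triplify` and `pivot`, and the scores in the toy table are all assumptions made up for illustration. Only `qb:DataSet`, `qb:Observation` and `qb:dataSet` are taken from the RDF Data Cube Vocabulary, and a full cube would also declare its dimensions and measures in a data structure definition, which is omitted here.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"   # RDF Data Cube Vocabulary namespace
EX = "http://example.org/code/"            # hypothetical namespace, not from the talk

# Toy results table; the scores are invented for illustration only.
ROWS = [
    {"algorithm": "CRF", "dataset": "DataSet1", "score": "0.5"},
    {"algorithm": "CRF", "dataset": "DataSet1", "score": "0.75"},
    {"algorithm": "SVM", "dataset": "DataSet1", "score": "0.25"},
    {"algorithm": "CRF", "dataset": "DataSet2", "score": "0.5"},
]


def triplify(rows, dimensions, measure, cube_id="benchmark"):
    """Emit one qb:Observation per table row as (subject, predicate, object) triples."""
    cube = f"{EX}cube/{cube_id}"
    triples = [(cube, RDF_TYPE, QB + "DataSet")]
    for i, row in enumerate(rows):
        obs = f"{cube}/obs/{i}"
        triples.append((obs, RDF_TYPE, QB + "Observation"))
        triples.append((obs, QB + "dataSet", cube))
        for dim in dimensions:
            # Dimension values become URIs so they can be linked to other data.
            triples.append((obs, f"{EX}dimension/{dim}", f"{EX}{dim}/{row[dim]}"))
        # The measure stays a literal number.
        triples.append((obs, f"{EX}measure/{measure}", float(row[measure])))
    return triples


def pivot(triples, row_dim, col_dim, measure):
    """Average the measure per (row_dim, col_dim) cell, like a spreadsheet pivot table."""
    by_subject = defaultdict(dict)
    for s, p, o in triples:
        by_subject[s][p] = o
    cells = defaultdict(list)
    for props in by_subject.values():
        if props.get(RDF_TYPE) != QB + "Observation":
            continue  # skip the qb:DataSet node itself
        key = tuple(
            props[f"{EX}dimension/{d}"].rsplit("/", 1)[-1] for d in (row_dim, col_dim)
        )
        cells[key].append(props[f"{EX}measure/{measure}"])
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


triples = triplify(ROWS, ["algorithm", "dataset"], "score")
table = pivot(triples, "algorithm", "dataset", "score")
for (algorithm, dataset), mean in sorted(table.items()):
    print(algorithm, dataset, mean)
```

In a deployed setting the triples would live in a triple store and the aggregation would more likely be a SPARQL `GROUP BY` query generated by the wizard; the in-memory `pivot` function only mirrors that idea.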
