Building a structured catalog for educational datasets
Stefan Dietze
04/07/13

Linked Open (educational) Data
• LOD: 300+ datasets, 32 billion distinct RDF statements
• DataHub: 6000+ open datasets
• LinkedUp: FP7-ICT-2012-8, CSA (http://linkedup-project.eu)
• Goal: enabling large-scale take-up of (Linked) Open Data (education as application context)

Using/exploiting Linked Data in Education?
Example dataset description: http://datahub.io/dataset/bbc (60,000,000 triples)
• Lack of reliable dataset metadata about
  • Resource types
  • Topics & disciplines
  • Quality, currentness & availability
  • Provenance
• Lack of links and cross-dataset references
• Lack of scalable query methods

Linked Data "Observatory": Processing Chain
Processing stages:
• Endpoint Retrieval & Graph Extraction
• Schema Extraction and Mapping
• Sample Graph Extraction (per dataset)
• NER & NED (named entity recognition and disambiguation, per resource)
• Interlinking & Co-Resolution (cross-dataset)
• Category Mapping, Normalisation, Filtering
Outputs:
• Dataset Catalog/Index
• Links/Cross-references
NED example: rdfs:label "…ECB…" → dbpedia:European_Central_Bank (dbpedia:Finance) or dbpedia:England-Wales-Cricket-Board (dbpedia:Sports)?
Dataset metadata (RDF/VoID):
• Schema mappings (types, properties)
• Entities & categories
• Topic relevance scores
• Availability, currentness data (tbc)
Goals:
• RDF catalog of datasets, i.e. a dataset of datasets (classification of datasets according to, e.g., represented types, disciplines/topics, data quality, accessibility)
• Links and coreferences => unified view on data => Linked Education Graph
• Infrastructure & APIs for federated queries
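
To make the "Dataset metadata (RDF/VoID)" item concrete, the following is a minimal sketch of how one catalog entry might be serialised as Turtle. The VoID and Dublin Core terms (void:Dataset, void:sparqlEndpoint, void:triples, void:classPartition, void:class, dcterms:subject) are standard; everything else is an assumption made for illustration: the DatasetProfile record, the ex: namespace and its topicScore/category/score properties, the endpoint URL, the class URI and the score value are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetProfile:
    """Hypothetical in-memory record for one catalog entry."""
    uri: str
    sparql_endpoint: str
    triples: int
    classes: list = field(default_factory=list)        # schema types found in the dataset
    topic_scores: dict = field(default_factory=dict)   # DBpedia category URI -> relevance score


def to_turtle(p: DatasetProfile) -> str:
    """Serialise a profile as a VoID description in Turtle."""
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix ex: <http://example.org/catalog#> .  # hypothetical namespace",
        "",
        f"<{p.uri}> a void:Dataset ;",
        f"    void:sparqlEndpoint <{p.sparql_endpoint}> ;",
        f"    void:triples {p.triples} ;",
    ]
    for cls in p.classes:
        lines.append(f"    void:classPartition [ void:class <{cls}> ] ;")
    for category, score in sorted(p.topic_scores.items()):
        lines.append(f"    dcterms:subject <{category}> ;")
        # ex:topicScore is an assumed property; VoID itself defines no relevance score.
        lines.append(f"    ex:topicScore [ ex:category <{category}> ; ex:score {score:.2f} ] ;")
    # Close the statement: the last predicate ends with '.' instead of ';'.
    lines[-1] = lines[-1].rstrip(" ;") + " ."
    return "\n".join(lines)


profile = DatasetProfile(
    uri="http://datahub.io/dataset/bbc",
    sparql_endpoint="http://example.org/sparql",                 # placeholder endpoint
    triples=60_000_000,
    classes=["http://example.org/schema#Programme"],             # placeholder class URI
    topic_scores={"http://dbpedia.org/resource/Category:Astronomy": 0.32},  # illustrative value
)
print(to_turtle(profile))
```

Keeping the relevance score in a separate blank node next to a plain dcterms:subject link is one possible design: clients that only understand VoID and Dublin Core can still read the topic, while score-aware clients can rank datasets.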

Related publications:
[WEBSCI13] Assessing the Educational Linked Data Landscape, d'Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
[DEXA13] Complex Matching of RDF Datatype Properties, Nunes, B. P., Mera, A., Casanova, M. A., Fetahu, B., Paes Leme, L., Dietze, S., 24th International Conference on Database and Expert Systems Applications (DEXA 2013), Prague, Czech Republic, August 2013.
[ESWC13] Combining a co-occurrence-based and a semantic measure for entity linking, Nunes, B. P., Dietze, S., Casanova, M. A., Kawase, R., Fetahu, B., Nejdl, W., 10th Extended Semantic Web Conference (ESWC 2013), May 2013.
[ISWC13?] Indexing of Linked Data, What's all the data about, Fetahu, B., Adamou, A., Dietze, S., d'Aquin, M., Nunes, B. P., 12th International Semantic Web Conference (ISWC 2013), under review.
[TKDE12] A Probabilistic Scheme for Keyword-Based Incremental Query Construction, Demidova, E., Zhou, X., Nejdl, W., IEEE Transactions on Knowledge and Data Engineering, 24(3):426-439, 2012.

Challenge: data heterogeneity

Online Lecture:
<yov:Lecture8748720>
  <yov:title>Pluto &amp; the Dwarf Planets</yov:title>
  …
</yov:Lecture8748720>

Lecture Slideset:
<ss:SlideSet-2139393292>
  <title>Planetary motion &amp; gravity</title>
  …
</ss:SlideSet-2139393292>

Video Documentary:
<po:Programme519215>
  <po:Series>Wonders of the Solar System</po:Series>
  <po:Episode>Emp. of the Sun</po:Episode>
  <po:Actor>Brian Cox</po:Actor>
</po:Programme519215>

• Relatedness of resources/entities? (types, semantics)
• Metadata about datasets?
Sources: d'Aquin et al., WebSci 2013; Nunes et al., ESWC 2013.

Data disambiguation, linking & annotation
Entity mentions to resolve in the Online Lecture and Video Documentary descriptions: Brian Cox? Sun? Pluto?
Linked DBpedia entities and categories:
• db:Pluto (Dwarf Planet)
• db:Sun
• db:Astronomical Objects
Source: Nunes et al., ESWC 2013.

Data disambiguation, linking & annotation
Data linking:
• Computation of connectivity scores between resources/entities
• Method: combination of
  • (i) a semantic (graph-based) connectivity score (SCS) with
  • (ii) a Web co-occurrence-based measure (CBM), similar to NGD (Normalized Google Distance)
• For (i): adaptation of the Katz index from social network analysis (SNA) to (linked) data graphs, considering the number and the lengths of paths over transversal properties
• Example scores shown for db:Sun (resources annotated with db:Astronomy): SCS = 0.32, CBM = 0.24
Dataset categorisation:
• Computation of normalised (DBpedia) category relevance scores for datasets
Source: Nunes et al., ESWC 2013.
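
The two ingredients of the linking method can be illustrated with a small sketch. It is not the formula from the ESWC 2013 paper: katz_score below is a plain truncated Katz index (paths of length l weighted by beta^l), and ngd is the standard Normalized Google Distance that the slides name only as "similar to" the CBM. The toy graph, the damping factor beta, the path-length cut-off and all counts are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def count_paths(graph, source, target, max_len):
    """Return {length: number of simple paths source -> target} for lengths 1..max_len."""
    counts = defaultdict(int)

    def walk(node, visited, length):
        if length > max_len:
            return
        if node == target and length > 0:
            counts[length] += 1
            return
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, length + 1)

    walk(source, {source}, 0)
    return dict(counts)


def katz_score(graph, source, target, beta=0.5, max_len=4):
    """Truncated Katz index: many short paths give a high score, long paths are damped."""
    paths = count_paths(graph, source, target, max_len)
    return sum((beta ** length) * n for length, n in paths.items())


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from hit counts fx, fy, joint count fxy, index size n."""
    if fxy == 0:
        return math.inf  # terms never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Hypothetical, undirected toy graph; node names are illustrative only.
toy_graph = {
    "db:Pluto": ["db:Dwarf_planet", "db:Solar_System"],
    "db:Dwarf_planet": ["db:Pluto", "db:Astronomical_Objects"],
    "db:Astronomical_Objects": ["db:Dwarf_planet", "db:Sun"],
    "db:Solar_System": ["db:Pluto", "db:Sun"],
    "db:Sun": ["db:Solar_System", "db:Astronomical_Objects"],
}

# One path of length 2 and one of length 3: 0.5**2 + 0.5**3 = 0.375
print(katz_score(toy_graph, "db:Pluto", "db:Sun"))
# Made-up hit counts; lower NGD means the terms co-occur more often.
print(ngd(fx=1000, fy=100, fxy=50, n=10**6))
```

The Katz score grows with relatedness while NGD is a distance, so a combined measure would have to invert or rescale one of them first; how the two are actually combined is left to the cited paper.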

Data disambiguation, linking & annotation
• Evaluation based on USA Today news items (80,000 entity pairs)
• Manually created gold standard (1,000 entity pairs)
• Baseline: Explicit Semantic Analysis (ESA)
=> CBM/SCS measure "relatedness"; ESA measures "similarity"
[Figure: Precision/Recall/F1 for SCS, CBM, ESA]
Source: Nunes et al., ESWC 2013.
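
For readers unfamiliar with the reported metrics, here is a minimal sketch of how precision, recall and F1 could be computed for a set of predicted entity links against a gold standard. The entity pairs are invented for illustration and are not taken from the USA Today evaluation; pairs are treated as ordered tuples, which is an assumption.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted entity pairs with gold-standard pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold standard and system output.
gold_pairs = {
    ("db:Pluto", "db:Sun"),
    ("db:Sun", "db:Astronomy"),
    ("db:Pluto", "db:Astronomy"),
    ("db:Brian_Cox", "db:Astronomy"),
}
predicted_pairs = {
    ("db:Pluto", "db:Sun"),
    ("db:Sun", "db:Astronomy"),
    ("db:Pluto", "db:Brian_Cox"),
}

p, r, f1 = precision_recall_f1(predicted_pairs, gold_pairs)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Running the same function once per measure (SCS, CBM, ESA) on the same gold standard would give comparable rows of the kind the figure summarises.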

Dataset classification: expanded dataset catalog & graph
• Enhanced dataset descriptions on the DataHub
• Dataset RDF graph: correlations based on semantic annotations (categories)
http://linkedup-project.eu
http://data.linkededucation.org/linkedup/catalog/
Source: d'Aquin et al., WebSci 2013.
Thank you!
http://purl.org/dietze

A structured catalog of open educational datasets

  • 1. Building a structured catalog for educational datasets Stefan Dietze 04/07/13 1Stefan Dietze
  • 2. Linked Open (educational) Data  LOD: 300+ datasets, 32 billion distinct RDF statements  DataHub: 6000+ open datasets 2  LinkedUp: FP7-ICT-2012-8, CSA (http://linkedup-project.eu)  Goal: enabling large-scale take-up of (Linked) Open Data (education as application context)
  • 3. Linked Open (educational) Data  LOD: 300+ datasets, 32 billion distinct RDF statements  DataHub: 6000+ open datasets http://datahub.io/dataset/bbc 60.000.000 triples Using/exploiting Linked Data in Education ?  Lack of reliable dataset metadata about  Resource types  Topics & disciplines  Quality, currentness & availability  Provenance  Lack of links and cross-dataset references  Lack of scalable query methods Example dataset description 3
  • 4. Linked Data "Observatory" – Processing Chain
      Stages: Endpoint Retrieval & Graph Extraction; Schema Extraction and Mapping; Sample Graph Extraction (per dataset); NER & NED (per resource); Interlinking & Co-Resolution (cross-dataset); Category Mapping, Normalisation, Filtering
      Outputs: Dataset Catalog/Index; Links/Cross-references
      Disambiguation example: rdfs:label "…ECB…." ? dbpedia:European_Central_Bank (dbpedia:Finance) or dbpedia:England-Wales-Cricket-Board (dbpedia:Sports)
      Dataset metadata (RDF/VoID):
      - Schema mappings (types, properties)
      - Entities & categories
      - Topic relevance scores
      - Availability, currentness data (tbc)
      Goals:
      - RDF catalog of datasets, i.e. a "dataset of datasets" (classification of datasets according to, e.g., represented types, disciplines/topics, data quality, accessibility)
      - Links and coreferences => unified view on data => Linked Education Graph
      - Infrastructure & APIs for federated queries
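As a rough illustration of what one RDF/VoID dataset description in such a catalog could look like, the sketch below renders a minimal Turtle record from plain Python. The VoID and Dublin Core terms used (`void:Dataset`, `void:triples`, `void:sparqlEndpoint`, `dcterms:title`, `dcterms:subject`) are standard vocabulary terms; the helper name `void_description`, the `example.org` endpoint, and the choice of subject category are assumptions made for the example, not part of the LinkedUp catalog.

```python
def void_description(dataset_uri, title, triples, endpoint, subjects):
    """Render a minimal VoID description of one dataset as Turtle text.

    Assumes `title` contains no double quotes or backslashes (no escaping is done).
    """
    properties = [
        "a void:Dataset",
        f'dcterms:title "{title}"',
        f"void:triples {int(triples)}",
        f"void:sparqlEndpoint <{endpoint}>",
    ]
    if subjects:
        # One dcterms:subject statement with a comma-separated object list.
        properties.append(
            "dcterms:subject " + " , ".join(f"<{s}>" for s in subjects)
        )
    body = " ;\n    ".join(properties)
    return (
        "@prefix void: <http://rdfs.org/ns/void#> .\n"
        "@prefix dcterms: <http://purl.org/dc/terms/> .\n\n"
        f"<{dataset_uri}>\n    {body} .\n"
    )


# Hypothetical values: only the dataset URL and the triple count appear on the slides.
record = void_description(
    dataset_uri="http://datahub.io/dataset/bbc",
    title="BBC",
    triples=60_000_000,
    endpoint="http://example.org/sparql",          # placeholder endpoint
    subjects=["http://dbpedia.org/resource/Finance"],  # placeholder category
)
print(record)
```

Topic relevance scores and availability data would need additional properties; which vocabulary the catalog uses for those is not stated on the slides.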
  • 5. Linked Data "Observatory" – Processing Chain (stages and dataset metadata as on slide 4), annotated with related publications:
      - [WEBSCI13] Assessing the Educational Linked Data Landscape, D'Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
      - [DEXA13] Complex Matching of RDF Datatype Properties, Nunes, B. P., Mera, A., Casanova, M. A., Fetahu, B., Paes Leme, L., Dietze, S., 24th International Conference on Database and Expert Systems Applications (DEXA 2013), Prague, CR, August 2013.
      - [ESWC13] Combining a co-occurrence-based and a semantic measure for entity linking, Nunes, B. P., Dietze, S., Casanova, M. A., Kawase, R., Fetahu, B., Nejdl, W., ESWC 2013, 10th Extended Semantic Web Conference, May 2013.
      - [ISWC13?] Indexing of Linked Data, What's all the data about, Fetahu, B., Adamou, A., Dietze, S., d'Aquin, M., Nunes, B. P., ISWC2013, 12th International Semantic Web Conference; under review.
      - [TKDE12] A Probabilistic Scheme for Keyword-Based Incremental Query Construction, Demidova, E., Zhou, X., Nejdl, W., IEEE Transactions on Knowledge and Data Engineering, 24(3):426-439, 2012.
  • 6. Challenge: data heterogeneity
      - Online Lecture: <yov:Lecture8748720> <yov:title>Pluto & the Dwarf Planets</yov:title> … </yov:Lecture8748720>
      - Lecture Slideset: <ss:SlideSet-2139393292> <title>Planetary motion & gravity</title> … </ss:SlideSet-2139393292>
      - Video Documentary: <po:Programme519215> <po:Series>Wonders of the Solar System</po:Series> <po:Episode>Emp. of the Sun</po:Episode> <po:Actor>Brian Cox</po:Actor> </po:Programme519215>
      Open questions: Relatedness of resources/entities (types, semantics)? Metadata about datasets?
      Sources: Assessing the Educational Linked Data Landscape, D'Aquin et al., WebSci2013; Combining a co-occurrence-based and a semantic measure for entity linking, Nunes et al., ESWC 2013.
  • 7. Data disambiguation, linking & annotation
      - Online Lecture: <yov:Lecture8748720> <yov:title>Pluto & the Dwarf Planets</yov:title> … </yov:Lecture8748720>
      - Video Documentary: <po:Programme519215> <po:Series>Wonders of the Solar System</po:Series> <po:Episode>Emp. of the Sun</po:Episode> <po:Actor>Brian Cox</po:Actor> </po:Programme519215>
      Ambiguous mentions: Brian Cox? Sun? Pluto?
      Source: Combining a co-occurrence-based and a semantic measure for entity linking, Nunes et al., ESWC 2013.
  • 8. Data disambiguation, linking & annotation
      - Online Lecture: <yov:Lecture8748720> <yov:title>Pluto & the Dwarf Planets</yov:title> … </yov:Lecture8748720>
      - Lecture Slideset: <ss:SlideSet-2139393292> <title>Planetary motion & gravity</title> … </ss:SlideSet-2139393292>
      - Video Documentary: <po:Programme519215> <po:Series>Wonders of the Solar System</po:Series> <po:Episode>Emp. of the Sun</po:Episode> <po:Actor>Brian Cox</po:Actor> </po:Programme519215>
      Linked entities and categories: db:Pluto (Dwarf Planet), db:Sun, db:Astronomical Objects, db:Astronomy
      Source: Combining a co-occurrence-based and a semantic measure for entity linking, Nunes et al., ESWC 2013.
  • 9. Data disambiguation, linking & annotation
      - Online Lecture: <yov:Lecture8748720> <title>Pluto & the Dwarf Planets</title> … </yov:Lecture8748720>
      - Lecture Slideset: <ss:SlideSet-2139393292> <title>Planetary motion & gravity</title> … </ss:SlideSet-2139393292>
      - Video Documentary: <po:Programme519215> <po:Series>Wonders of the Solar System</po:Series> <po:Episode>Emp. of the Sun</po:Episode> <po:Actor>Brian Cox</po:Actor> </po:Programme519215>
      Linked entities and categories: db:Pluto (Dwarf Planet), db:Sun, db:Astronomical Objects, db:Astronomy
      Example scores shown on the slide (next to db:Sun): SCS = 0.32, CBM = 0.24
      Data linking:
      - Computation of connectivity scores between resources/entities
      - Method: combination of (i) a semantic (graph-based) connectivity score (SCS) with (ii) a Web co-occurrence-based measure (CBM), similar to the Normalised Google Distance (NGD)
      - For (i): adaptation of the Katz index from social network analysis (SNA) to (linked) data graphs, considering the number and lengths of paths over transversal properties
      Dataset categorisation: computation of normalised (DBpedia) category relevance scores for datasets
      Source: Combining a co-occurrence-based and a semantic measure for entity linking, Nunes et al., ESWC 2013.
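The two kinds of measure named on this slide can be sketched as follows. This is a minimal illustration, not the formulation from the ESWC 2013 paper: `katz_score` is a plain truncated Katz index (damped count of walks up to a maximum length, links treated as undirected), and `ngd` is the standard Normalised Google Distance standing in for the co-occurrence-based measure, which the slide only describes as "similar to NGD". The toy graph, the damping factor `beta`, the length cut-off, and the hit counts are all assumptions.

```python
import math
from collections import defaultdict


def katz_score(edges, source, target, beta=0.5, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * (#walks of length l)."""
    neighbours = defaultdict(set)
    for a, b in edges:  # assumption: links are treated as undirected
        neighbours[a].add(b)
        neighbours[b].add(a)

    walks = {source: 1}  # walks[n] = number of walks of the current length source -> n
    score = 0.0
    for length in range(1, max_len + 1):
        extended = defaultdict(int)
        for node, count in walks.items():
            for nb in neighbours[node]:
                extended[nb] += count
        walks = extended
        score += (beta ** length) * walks.get(target, 0)
    return score


def ngd(hits_x, hits_y, hits_xy, total_pages):
    """Normalised Google Distance from page counts (0 = always co-occurring)."""
    if hits_xy == 0:
        return math.inf  # the terms never co-occur
    log_x, log_y = math.log(hits_x), math.log(hits_y)
    return (max(log_x, log_y) - math.log(hits_xy)) / (
        math.log(total_pages) - min(log_x, log_y)
    )


# Toy graph (hypothetical links between the entities named on the slide).
toy_edges = [
    ("db:Pluto", "db:Astronomical_Objects"),
    ("db:Sun", "db:Astronomical_Objects"),
    ("db:Pluto", "db:Astronomy"),
    ("db:Sun", "db:Astronomy"),
]
print(katz_score(toy_edges, "db:Pluto", "db:Sun", beta=0.5, max_len=2))
print(ngd(hits_x=1000, hits_y=1000, hits_xy=1000, total_pages=10**6))
```

Shorter and more numerous connections raise the Katz score, while a smaller NGD indicates stronger co-occurrence; how the two are normalised and combined is left to the cited paper.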
  • 10. Data disambiguation, linking & annotation
      - Evaluation based on USA Today News items (80,000 entity pairs)
      - Manually created gold standard (1000 entity pairs)
      - Baseline: Explicit Semantic Analysis (ESA) => CBM/SCS: "relatedness"; ESA: "similarity"
      [Figure: Precision/Recall/F1 for SCS, CBM, ESA]
      Source: Combining a co-occurrence-based and a semantic measure for entity linking, Nunes et al., ESWC 2013.
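For reference, precision, recall and F1 against a gold standard of entity pairs can be computed as in the sketch below. The pair data is made up for illustration, and treating pairs as unordered is an assumption; the slides do not give the evaluation procedure in this detail.

```python
def precision_recall_f1(predicted_pairs, gold_pairs):
    """Compare predicted related-entity pairs with a gold standard.

    Pairs are treated as unordered, so ("a", "b") and ("b", "a") count as the same pair.
    """
    predicted = {frozenset(p) for p in predicted_pairs}
    gold = {frozenset(p) for p in gold_pairs}
    true_positives = len(predicted & gold)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data, not taken from the evaluation on the slide.
predicted = [("Pluto", "Sun"), ("Sun", "Brian Cox"), ("Pluto", "ECB")]
gold = [("Sun", "Pluto"), ("Brian Cox", "Sun"), ("Pluto", "Astronomy")]
p, r, f1 = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```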
  • 11. Enhanced dataset descriptions on the DataHub
      - Dataset RDF graph: correlations based on semantic annotations (categories)
      - Dataset classification: expanded dataset catalog & graph
      - http://linkedup-project.eu
      - http://data.linkededucation.org/linkedup/catalog/
      Source: Assessing the Educational Linked Data Landscape, D'Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
  • 12. Thank you! http://purl.org/dietze