Using Linked Data Traversal to Label Academic Communities - SAVE-SD2015


Presentation of my work "Using Linked Data Traversal to Label Academic Communities" at the SAVE-SD workshop, co-located with the 24th International World Wide Web Conference at Florence, Italy


Using Linked Data Traversal to
Label Academic Communities
Ilaria Tiddi, Mathieu d’Aquin, Enrico Motta
Knowledge Media Institute, The Open University

Motivation
We
• Explain data patterns automatically
• Using Linked Data background knowledge
Scholarly data
• Growing interest and techniques
• Mine and visualise data
• Reveal hidden knowledge
• Forecast
Data interpretation still manual

Use-case: Community Detection
Aim
• Detecting communities of research topics
• The Open University papers (ORO1)
Usual text-mining methods
• Groups of similar documents
• Probabilistically extracted topics
• Based on word co-occurrence
1 http://oro.open.ac.uk/

Use-case: Community Detection
Problem
Labeling requires human interpretation

Linked Data can help!
• Scholarly data: big portion within Linked Data
• RDF structure (machine understandable)
• Linked datasets
• Across disciplines
• Easier discovery of unrevealed knowledge
• Easier result interpretation

Proposition
• Automatic topic detection (labels)
• With Linked Data background knowledge
• Machine Learning approach
• A* search over the Linked Data graph
• Link traversal (vs. literature based on SPARQL)

Approach
Document clustering
• text pre-processing (normalise, stem, filter)
• Latent Semantic Analysis space of word vectors
• clustering according to LSA distance
• community: a group of similar words
Communities networking
• connecting clusters’ centroids (the closest one)
• network graph of communities
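
The communities-networking step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each word is already a vector in the LSA space, uses plain Euclidean distance as a stand-in for "LSA distance", and the cluster names and vectors are made up.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def nearest_centroid_edges(clusters):
    """Link every cluster to the cluster whose centroid is closest to its own."""
    centroids = {name: centroid(vecs) for name, vecs in clusters.items()}
    edges = set()
    for name, c in centroids.items():
        candidates = [(math.dist(c, other), other_name)
                      for other_name, other in centroids.items()
                      if other_name != name]
        if candidates:
            _, closest = min(candidates)
            # Store edges as sorted pairs so A-B and B-A are the same edge.
            edges.add(tuple(sorted((name, closest))))
    return edges

# Hypothetical toy clusters (2-dimensional vectors for readability).
clusters = {
    "A": [[0.0, 0.0], [0.0, 2.0]],
    "B": [[1.0, 1.0], [3.0, 1.0]],
    "C": [[10.0, 10.0]],
}
edges = nearest_centroid_edges(clusters)
```

The resulting edge set is the network graph of communities: one node per cluster, one edge from each cluster to its nearest neighbour.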

Approach
Initial dataset
• Word URIs
• connected to DBpedia
Machine Learning/Logic Programming approach
• Given
• Positive examples E+: Cluster (words) to label
• Negative examples E-: Words not in E+
• Background Knowledge from Linked Data
• Derive
• Explanations of the grouping for E+ (topic)

Linked Data Background Knowledge
Explanation
• RDF property chains
• Leading to the same entity
• shared by a subset of initial words
Topic: many words of the cluster that share the same explanation
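
To make the notion of an explanation concrete, here is a small sketch that follows a property chain from each word and collects the words that end at the same entity. The in-memory graph, the `w:` and `cat:` prefixes and all triples are hypothetical toy data; only the property names and `db:Creativity` come from the slides.

```python
# Toy graph: (subject, property) -> list of objects. All triples are invented.
GRAPH = {
    ("w:painting", "skos:relatedMatch"): ["db:Painting"],
    ("w:poetry", "skos:relatedMatch"): ["db:Poetry"],
    ("w:granite", "skos:relatedMatch"): ["db:Granite"],
    ("db:Painting", "dc:subject"): ["cat:Visual_arts"],
    ("db:Poetry", "dc:subject"): ["cat:Literature"],
    ("db:Granite", "dc:subject"): ["cat:Rocks"],
    ("cat:Visual_arts", "skos:broader"): ["db:Creativity"],
    ("cat:Literature", "skos:broader"): ["db:Creativity"],
    ("cat:Rocks", "skos:broader"): ["db:Geology"],
}

def follow(start, chain):
    """Return the set of entities reached from `start` by following `chain`."""
    frontier = {start}
    for prop in chain:
        frontier = {obj for subj in frontier
                    for obj in GRAPH.get((subj, prop), [])}
    return frontier

def words_sharing(words, chain, entity):
    """Words for which the property chain leads to the given entity."""
    return {w for w in words if entity in follow(w, chain)}

chain = ("skos:relatedMatch", "dc:subject", "skos:broader")
words = ["w:painting", "w:poetry", "w:granite"]
shared = words_sharing(words, chain, "db:Creativity")
```

Here the pair (chain, `db:Creativity`) plays the role of one explanation, and `shared` is the subset of initial words it applies to.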

Linked Data Traversal
Aim: find the explanation shared by the biggest number of words in the cluster
e.g. <skos:relatedMatch-dc:subject-skos:broader.db:Creativity>

Linked Data Traversal
How: A* search to iteratively explore new parts of the graph and improve the explanation
<skos:relatedMatch-dc:subject-skos:broader-skos:broader.db:Aesthetics>
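
The iterative exploration can be sketched as below. The slides name A* but do not state the cost function or heuristic, so this sketch swaps in a plain greedy best-first search: property chains are kept in a priority queue and the chain whose best explanation has the highest F-measure is extended first. The graph is an in-memory toy (real link traversal would dereference URIs), and every triple, the `w:`/`cat:` prefixes and the `max_expansions` parameter are assumptions.

```python
import heapq

def f_measure(covered, positives):
    """Harmonic mean of precision and recall of `covered` against `positives`."""
    tp = len(covered & positives)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(covered), tp / len(positives)
    return 2 * precision * recall / (precision + recall)

def reach(graph, word, chain):
    """Entities reached from `word` by following the property chain."""
    frontier = {word}
    for prop in chain:
        frontier = {o for s in frontier for (p, o) in graph.get(s, []) if p == prop}
    return frontier

def best_explanation(graph, positives, negatives, max_expansions=50):
    words = positives | negatives
    best = (0.0, None)       # (score, (chain, entity))
    queue = [(0.0, ())]      # min-heap on negated score, starting from the empty chain
    seen = {()}
    for _ in range(max_expansions):
        if not queue:
            break
        _, chain = heapq.heappop(queue)
        # Properties available one hop beyond the current chain.
        next_props = {p for w in words for node in reach(graph, w, chain)
                      for (p, _obj) in graph.get(node, [])}
        for prop in sorted(next_props):
            new_chain = chain + (prop,)
            if new_chain in seen:
                continue
            seen.add(new_chain)
            reached = {w: reach(graph, w, new_chain) for w in words}
            chain_best = 0.0
            for entity in sorted(set().union(*reached.values())):
                covered = {w for w in words if entity in reached[w]}
                score = f_measure(covered, positives)
                chain_best = max(chain_best, score)
                if score > best[0]:
                    best = (score, (new_chain, entity))
            heapq.heappush(queue, (-chain_best, new_chain))
    return best

# Hypothetical toy graph: node -> list of (property, object).
GRAPH = {
    "w:painting": [("skos:relatedMatch", "db:Painting")],
    "w:poetry": [("skos:relatedMatch", "db:Poetry")],
    "w:granite": [("skos:relatedMatch", "db:Granite")],
    "db:Painting": [("dc:subject", "cat:Visual_arts")],
    "db:Poetry": [("dc:subject", "cat:Literature")],
    "db:Granite": [("dc:subject", "cat:Rocks")],
    "cat:Visual_arts": [("skos:broader", "db:Creativity")],
    "cat:Literature": [("skos:broader", "db:Creativity")],
    "cat:Rocks": [("skos:broader", "db:Geology")],
}
score, (chain, entity) = best_explanation(
    GRAPH, {"w:painting", "w:poetry"}, {"w:granite"})
```

On this toy data the one-hop and two-hop chains each cover a single word of the cluster, and only the three-hop chain reaches an entity shared by both, which mirrors how longer chains on the slides lead to broader topics.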

Explanation Evaluation
Ranking explanations according to F-Measure
Take the best explanation and label the cluster
[Figure: the cluster (E+), the words sharing the explanation, and a word outside E+]
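
As a worked example of the ranking step, the sketch below scores candidate explanations with the standard F-measure, assuming precision is the share of words covered by the explanation that belong to E+ and recall is the share of E+ that the explanation covers. The explanation names and word sets are invented for illustration.

```python
def f_measure(covered, positives):
    """F-measure of an explanation covering `covered`, for the cluster `positives`."""
    tp = len(covered & positives)
    if tp == 0:
        return 0.0
    precision = tp / len(covered)     # covered words that are in E+
    recall = tp / len(positives)      # words of E+ that are covered
    return 2 * precision * recall / (precision + recall)

def rank(explanations, positives):
    """Sort explanation names from best to worst F-measure."""
    return sorted(explanations,
                  key=lambda name: f_measure(explanations[name], positives),
                  reverse=True)

e_plus = {"a", "b", "c", "d"}                      # the cluster to label
explanations = {
    "X": {"a", "b", "c", "e"},                      # precision 0.75, recall 0.75
    "Y": {"a"},                                     # precision 1.0,  recall 0.25
    "Z": {"a", "b", "c", "d", "e", "f", "g", "h"},  # precision 0.5,  recall 1.0
}
ranking = rank(explanations, e_plus)
label = ranking[0]   # the best explanation becomes the cluster label
```

A very specific explanation (Y) and a very general one (Z) both lose to the balanced one (X), which is the reason for ranking on F-measure rather than on coverage alone.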

Community Labeling
Examples of topics:
<skos:relatedMatch-dc:subject-skos:broader-skos:broader-skos:broader.db:Geology>
<skos:relatedMatch-dc:subject-skos:broader-skos:broader-skos:broader.db:Chemistry>
<skos:relatedMatch-dc:subject-skos:broader-skos:broader-skos:broader.db:Mathematics>

Conclusion and future work
Facilitating data interpretation by combining
• scholarly data
• Machine Learning
• Linked Data graph search
Future work
• improve the graph exploration to discover
more knowledge
• focus on the definition of “explanation”

Thank you! Questions?
Many thanks to him and him



Editor's Notes

facilitate the process of understanding
That is advertisement
now that we have the words…use word encoded!
with this in mind: explanations can be many… / not all of the words are in C+
and outside the cluster. Second problem: what if there are better explanations further on that better represent the cluster?
the number of items in the cluster to which the explanation applies, as well as the number of items outside the cluster: one gives precision, the other recall
say about the iteration improvements
