Entitifying Europeana: Building an
ecosystem of networked references
for Cultural Objects
Hugo Manguinhas, Valentine Charles, Antoine Isaac, Tim Hill
Europeana Foundation
What is Europeana?
CC BY-SA
We aggregate metadata:
• From all EU countries
• ~3,500 galleries, libraries,
archives and museums
• More than 53M objects
• In about 50 languages
• Huge amount of references to
places, agents, concepts, time
Europeana aggregation infrastructure
Europeana | CC BY-SA
The Platform for Europe’s Digital Cultural Heritage
SWIB16 - Entitifying Europeana: Building an ecosystem of networked references for Cultural Objects
Europeana Linked Data Strategy
Our efforts and lines of work
• Europeana Data Model (EDM) offers a base for linking data
• We apply automatic enrichment to link source data to reference
data
• We encourage data providers to contribute their own
vocabularies so that we can benefit from data links made at
data providers’ level
• We encourage alignment activities between domain
vocabularies
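The automatic enrichment step mentioned above can be pictured with a minimal sketch. It assumes a naive exact match on normalized labels, which is an illustration only and not a description of Europeana's actual enrichment pipeline; the URIs, labels and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def normalize(label: str) -> str:
    """Case-fold and collapse whitespace so trivially different spellings match."""
    return " ".join(label.casefold().split())


def build_index(vocabulary: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each normalized label to the set of reference URIs that carry it."""
    index: Dict[str, Set[str]] = {}
    for uri, labels in vocabulary.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(uri)
    return index


def enrich(values: Iterable[str], index: Dict[str, Set[str]]) -> Dict[str, str]:
    """Link a source value to a reference URI only when the match is unambiguous."""
    links: Dict[str, str] = {}
    for value in values:
        matches = index.get(normalize(value), set())
        if len(matches) == 1:  # skip unknown and polysemous labels
            links[value] = next(iter(matches))
    return links


# Hypothetical reference data: "Paris" is deliberately ambiguous (place vs. agent).
vocabulary = {
    "http://example.org/agent/mozart": ["Wolfgang Amadeus Mozart"],
    "http://example.org/place/paris": ["Paris"],
    "http://example.org/agent/paris": ["Paris"],
}
index = build_index(vocabulary)
links = enrich(["  wolfgang amadeus MOZART ", "Paris", "Unknown"], index)
print(links)
```

Leaving ambiguous labels unlinked is a deliberately conservative choice in this sketch: a wrong link adds noise to the record, whereas a missing link can still be supplied later by a data provider or a curator.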
Significant progress has been made, most of it
presented at past SWIB conferences!
Europeana Linked Data Strategy
A strategy for Entities
As a cornerstone of our strategy we are building an
"Entity Collection"
• A service that acts as a centralized point of reference and
access to data about contextual entities
• Caching and curating data from the wider Linked Open Data
cloud
• A sort of Europeana "knowledge graph"
Europeana Linked Data Strategy
Motivation
• Improve user experience
• Support better ways of searching and navigating through the
collections, eliminating ambiguity and clarifying the meaning of
descriptions
• Adapt better to the language of the user
• by improving the interlinking of data
• Brings more context to the objects
• Alleviates polysemy issues
• Expands language coverage
• Contributes to building a web of data ('knowledge graph') that
third parties can use to improve their users' experience
The Entity Collection
Use Cases
Europeana Collections Portal
● Findability: users can look for entities, not
only records (Entity-Based Search)
● Understandability: Entity Pages group and
present all assertions about an entity
● Exploration: Navigation along relationships
becomes possible
Crowdsourcing
● Objects can be annotated with references to
entities
● A controlled vocabulary for client applications
Enrichment of Provider’s Data
● A controlled vocabulary to help identify
named references to entities
Republication for Re-use
● Entities can be republished as an open
source to the community
Entity Collection
The Entity Collection
What can it enable?
Semantic auto-completion
Semantic and metadata annotations
Entity Pages
Entity-based facets
Google Knowledge Card, Pundit Annotation Client, Food & Drink Project
The Entity Collection
How do we choose our target vocabularies?
As defined in the recent Europeana Tech Task Force on enrichment and
evaluation (presented last year), we consider the following criteria when
selecting a vocabulary:
• Properly documented and supported by a community
• Technically available on the web according to the Linked Data best
practices and recipes
• Available under an open licence
• Multilingual
• Abide by a minimal ontological commitment principle
• Apply the best practices and standards for the representation, structure
and description of vocabularies
• Well-connected internally and externally to other vocabularies (preferably
spine vocabularies)
The Entity Collection
Which target vocabularies are we using?
For historical reasons, the target vocabularies correspond to the ones being
used for Semantic Enrichment (as of November 2016):
• Places
a subset of Geonames, corresponding to places which are part of
European countries and of some specific feature classes.
• Agents
a subset of DBpedia corresponding to most of the instances of dbp:Artist
with some exceptions, and integrated from 49 DBpedia language editions.
• Concepts
a subset of DBpedia corresponding to a handful of concepts matching
the needs from Europeana Collections.
• Time Spans
The chronological periods from SemiumTime.
214,307 resources, 274 resources, 165,008 resources, 2,566 resources
The Entity Collection
Contribution to multilingual coverage
[Chart: language coverage of entities effectively used to enrich Europeana Objects versus entities present in the Entity Collection]
The Entity Collection
Are these target vocabularies enough?
• Not enough coreferencing information to other vocabularies
• particularly to the ones we receive from data providers (e.g.
musical instruments, MIMO)
• Labels and values are not always accurate and normalized
• need for better reference data (e.g. VIAF)
• Missing relevant information
• e.g. roles and professions
• Need to expand coverage to other types of entities
• namely Works and Events
The Entity Collection
Challenges
Investigate and design strategies for:
• Integrating new vocabularies that can further improve
• entity descriptions and multilingual coverage (e.g. VIAF)
• linking between entities (e.g. Wikidata)
• Integrating alignments, in particular:
• links between local/domain vocabularies to pivot vocabularies
• Supporting manual curation of existing and new entities
• Keeping the information collected from external sources up to date
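The alignment challenge listed above (links from local or domain vocabularies to pivot vocabularies) can be sketched as a lookup that follows alignment links until no further link exists. This is only an illustration under simple assumptions: one outgoing link per URI, hypothetical `example.org` URIs, and a hypothetical `resolve` helper that is not part of any Europeana service.

```python
from typing import Dict


def resolve(uri: str, alignments: Dict[str, str], max_hops: int = 10) -> str:
    """Follow alignment links from a local URI towards a pivot URI.

    Stops when no further link exists, when a cycle is detected,
    or after max_hops steps, and returns the last URI reached.
    """
    seen = {uri}
    current = uri
    for _ in range(max_hops):
        target = alignments.get(current)
        if target is None or target in seen:
            return current
        seen.add(target)
        current = target
    return current


# Hypothetical alignments: provider term -> domain vocabulary -> pivot vocabulary.
alignments = {
    "http://example.org/local/violin": "http://example.org/domain/violin",
    "http://example.org/domain/violin": "http://example.org/pivot/violin",
}
print(resolve("http://example.org/local/violin", alignments))
print(resolve("http://example.org/local/unknown", alignments))
```

The cycle guard matters in practice because alignments contributed by different parties can point back at each other; without it, a pair of mutually aligned terms would loop until the hop limit.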
The Entity Collection
Our roadmap for the coming years
• Mint Europeana URIs for Entities and update internal
references
• Make entity services and data available via an API
• Make use of the API in the Collections Portal
• Implement support for new vocabularies and entity types
✔
✔
The Entity Collection
Alpha release of our new Entity API
More methods will come, for:
Creation, Update and Delete; URI resolution to Europeana Entities
The Entity Collection
DBpedia resource for “Mozart” in our data
Coreference links to 6 other datasets (e.g. Freebase, Wikidata)
Inter-linking information… still need to switch references to link to Europeana Entities
Preferred labels for 48 languages
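To make the callouts above concrete, here is a minimal sketch of how an entity with multilingual preferred labels and coreference links might be held in memory. The `Entity` class, its field names and the sample values are assumptions for illustration; they do not reproduce the Entity Collection's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    """Hypothetical in-memory view of one contextual entity."""
    uri: str
    pref_labels: Dict[str, str] = field(default_factory=dict)  # language tag -> label
    same_as: List[str] = field(default_factory=list)           # coreference links

    def label_for(self, lang: str, fallback: str = "en") -> Optional[str]:
        """Return the label in the user's language, else the fallback language."""
        if lang in self.pref_labels:
            return self.pref_labels[lang]
        return self.pref_labels.get(fallback)


# Illustrative values only; the real record carries labels for 48 languages.
mozart = Entity(
    uri="http://dbpedia.org/resource/Wolfgang_Amadeus_Mozart",
    pref_labels={
        "en": "Wolfgang Amadeus Mozart",
        "ru": "Вольфганг Амадей Моцарт",
    },
    same_as=["http://www.wikidata.org/entity/Q254"],
)
print(mozart.label_for("ru"))  # label in the requested language
print(mozart.label_for("fi"))  # no Finnish label here, falls back to English
```

Keying labels by language tag is what lets a portal adapt to the language of the user: the same entity URI is displayed with whichever label matches the interface language, with a fallback when none exists.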
The Entity Collection
Entity API - suggest method
/entity/suggest.json?text=neo&lang=en&rows=6
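The request shown above can be assembled programmatically. The sketch below only builds the URL; the host in `BASE_URL` is a placeholder, since the slide gives just the path and query string, and `build_suggest_url` is a hypothetical helper. Any authentication the alpha API may require is not covered here.

```python
from urllib.parse import urlencode

# Placeholder host (assumption): the slide shows only the path and parameters.
BASE_URL = "https://api.example.org"


def build_suggest_url(text: str, lang: str = "en", rows: int = 6) -> str:
    """Build a suggest request from the three parameters shown on the slide."""
    query = urlencode({"text": text, "lang": lang, "rows": rows})
    return f"{BASE_URL}/entity/suggest.json?{query}"


print(build_suggest_url("neo"))
```

Using `urlencode` rather than string concatenation keeps the query valid when the typed prefix contains spaces or non-ASCII characters, which is likely for a multilingual auto-completion field.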
Conclusion
• A Strategy for Entities is a “must” for Europeana
• There is no “one-size-fits-all” vocabulary
• We have a long way to go… but we are making progress
Thank you!
hugo.manguinhas@europeana.eu
