SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
Evaluation of metadata
enrichment practices in digital
libraries: steps towards better
data enrichments
Valentine Charles and Juliane Stiller | SWIB 2015
Semantic enrichment, a solution for better
quality data?
Automatic and manual enrichment are more and more commonly used in digital
libraries to:
• “standardize data” by linking it to authority resources
• improve multilingual coverage in datasets
• contextualise resource
• And more
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
CC BY-SA
Semantic enrichment, a solution for better
quality data?
Automatic and manual enrichment are more and more commonly used in digital
libraries to:
• “standardize data” by linking it to authority resources
• improve multilingual coverage in datasets
• contextualise resource
• And more
This growth is also accompanied by the adoption of semantic web technology
in the cultural heritage!
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
CC BY-SA
What do we mean by semantic enrichment ?
(Semantic) enrichment can be described:
• As a process that improves metadata about an object by adding statements
about the object that this metadata describes
• As a process e.g. the application of an enrichment tool
• As a result e.g. the new metadata created at the end of the process
Enrichment strategy refers to all workflows components and the processes which
determined these components.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
What do we mean by semantic enrichment ?
(Semantic) enrichment can be described:
• As a process that improves metadata about an object by adding statements
about the object that this metadata describes
• As a process e.g. the application of an enrichment tool
• As a result e.g. the new metadata created at the end of the process
Enrichment strategy refers to all workflows components and the processes which
determined these components.
Co-referencing, alignment, contextualisation, annotation are types of
enrichment !
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Europeana Semantic Enrichment - Example
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
An incorrect
enrichment is worse
than no enrichment
at all !
France, Public Domain
1921, National Library of France
Agence de presse Meurisse
Colombes : championnats de France d’Athlétisme :

rivière, le speaker
Our starting point
Incorrect enrichments lead to
• Devaluation of curated metadata
• Loss of trust from providers
• Propagation of errors to different languages
• Irrelevant search results
• Bad user experiences
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Our starting point
Incorrect enrichments lead to
• Devaluation of curated metadata
• Loss of trust from providers
• Propagation of errors to different languages
• Irrelevant search results
• Bad user experiences
CC BY-SA
Better understanding of impact of enrichments is needed !
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Comparative evaluation
A group of 23 experts in the Cultural Heritage domain run a quantitative and
qualitative evaluation:
• An evaluation dataset contributed by The European Library
• Enriched by 7 different tools or tool settings
• 1757 enrichments were sampled and manually annotated
• Goal was to determine the quality of the enrichments
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Comparative evaluation
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Frequency of enriched records for
each enrichment tool.
Number of enrichments by
enrichment tool.
10 recommendations
for better semantic
enrichment services
Netherlands, Public Domain
1644, Rijksmuseum
Adriaen van Utrecht
Banquet Still Life
Define your enrichment goals
What problems are you trying to solve?
What type of enrichment do you want to achieve?
On which criteria will you evaluate the success of the enrichment ?
What type of data do you want as input or output?
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
1
Define your enrichment goals
What problems are you trying to solve?
What type of enrichment do you want to achieve?
On which criteria will you evaluate the success of the enrichment ?
What type of data do you want as input or output?
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Clear goals will help you to define your gold standard and guidelines for
evaluating your enrichment
1
Choose the right method…
Choose enrichment methods and techniques that are adapted to the goals
• For instance a service providing an interface might be better if the
enrichment is manual
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
2
http://
cultuurlink.beeldengeluid.nl/
app/#/
…and the right service
Select a service that takes into account the semantics of your input data
• enrichment service or tool recognizes a specific metadata schema
(schema/semantics-aware).
Select a service that provides output data in the desired representations
• Enrichment services should provide appropriate representations of
enrichments following similar patterns and standard.
• They should also publish metadata for enrichment (e.g. provenance,
confidence)
The selection of the target dataset that the enrichment service will use is also
an important criteria for obtaining good quality enrichment.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
2
Define the enrichment workflow
The workflow(s) supported by an enrichment service should be
clearly identifiable so that an enricher:
• Can act upon the different components of the enrichment
(source, target, and application scenario)
• Knows also the order in which to activate the steps.
Rules for the enrichment need to be defined and documented.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
3
Define the enrichment workflow
The workflow(s) supported by an enrichment service should be
clearly identifiable so that an enricher:
• Can act upon the different components of the enrichment
(source, target, and application scenario)
• Knows also the order in which to activate the steps.
Rules for the enrichment need to be defined and documented.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Don’ t forget user-initiated enrichment workflows !
3
Train the enricher
The enricher should have sufficient knowledge of the enrichment
tool and on the parts of the data affected by the enrichment service
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
4
Train the enricher
The enricher should have sufficient knowledge of the enrichment
tool and on the parts of the data affected by the enrichment service
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
This knowledge is crucial for the assessments of the enrichments
results and the potential enrichment errors which might occur !
4
Test your enrichment workflow
Your enrichment service should be flexible enough to have
adjustments made.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
5
Assess the quality of your enrichment
The quality needs to be assessed against the goals set for the
enrichment.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
6
Assess the quality of your enrichment
The quality needs to be assessed against the goals set for the
enrichment.
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Only if you know what the best possible outcome of the process is,
you can evaluate how well the process worked !
6
Have an evaluation strategy and choose
the right evaluation method
Assessing the quality of your enrichment will require an evaluation strategy
taking into account:
• the correctness of the enrichments (intrinsic evaluation)
• Factors such as the search performance, user-experience and relevance
(extrinsic evaluation)
Creation of Gold standards or annotated samples can be a good strategy but
• Make sure the diversity of the dataset represent the diversity of the data
• Make clear guidelines for raters and train them
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
7
Have an evaluation strategy and choose
the right evaluation method
Assessing the quality of your enrichment will require an evaluation strategy
taking into account:
• the correctness of the enrichments (intrinsic evaluation)
• Factors such as the search performance, user-experience and relevance
(extrinsic evaluation)
Creation of Gold standards or annotated samples can be a good strategy but
• Make sure the diversity of the dataset represent the diversity of the data
• Make clear guidelines for raters and train them
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Using a collaborative tool or computing inter-rater agreements can
help the validation process.
7
Apply user-initiated enrichment workflows
Don’ t forget user-initiated enrichment workflows !
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
8
Apply user-initiated enrichment workflows
Don’ t forget user-initiated enrichment workflows !
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
8
Monitor your enrichment process and re-
assess
An enrichment process needs constant re-evaluation
• Source data might have expanded with more information
• The target dataset might be richer (new terms, new languages, more
granular)
• The workflow might have been improved
CC BY-SA
Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
Document your enrichment process and
learnings

Documentation will allow the improvements of future enrichment
strategies and the maintenance of an enrichment process
9
10
All the details in the report at
http://bit.ly/Evaluation-Enrichment
Have a good read !
valentine.charles@europeana.eu
Netherlands, CC BY-SA
Anonymous
Troupe Banola leur dernière création
1911, Circus Museum

Weitere ähnliche Inhalte

Ähnlich wie Evaluation of metadata enrichment practices in digital libraries:steps towards better data enrichments

Transforming-UX-with-Content-Strategy
Transforming-UX-with-Content-StrategyTransforming-UX-with-Content-Strategy
Transforming-UX-with-Content-Strategy
Brenda Lee Intengan
 
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
Talis
 

Ähnlich wie Evaluation of metadata enrichment practices in digital libraries:steps towards better data enrichments (20)

PPM Presentation- Enterprise SharePoint Program - Innovate Vancouver.pdf
PPM Presentation- Enterprise SharePoint Program - Innovate Vancouver.pdfPPM Presentation- Enterprise SharePoint Program - Innovate Vancouver.pdf
PPM Presentation- Enterprise SharePoint Program - Innovate Vancouver.pdf
 
Level Up Web: Modern Web Development and Management Practices for Libraries
Level Up Web: Modern Web Development and Management Practices for LibrariesLevel Up Web: Modern Web Development and Management Practices for Libraries
Level Up Web: Modern Web Development and Management Practices for Libraries
 
A Business-first Approach to Building Data Governance Program
A Business-first Approach to Building Data Governance ProgramA Business-first Approach to Building Data Governance Program
A Business-first Approach to Building Data Governance Program
 
Edge Benchmarks
Edge BenchmarksEdge Benchmarks
Edge Benchmarks
 
More Content, More Channels, More Chill: Your 2022 Self-Care Guide to Content...
More Content, More Channels, More Chill: Your 2022 Self-Care Guide to Content...More Content, More Channels, More Chill: Your 2022 Self-Care Guide to Content...
More Content, More Channels, More Chill: Your 2022 Self-Care Guide to Content...
 
Content Strategy and Product Management (in science education)
Content Strategy and Product Management (in science education)Content Strategy and Product Management (in science education)
Content Strategy and Product Management (in science education)
 
Getting started with Content Strategy / Michele-Ann Jenkins
Getting started with Content Strategy / Michele-Ann JenkinsGetting started with Content Strategy / Michele-Ann Jenkins
Getting started with Content Strategy / Michele-Ann Jenkins
 
Collections Trust Seminar - October 2014
Collections Trust Seminar - October 2014Collections Trust Seminar - October 2014
Collections Trust Seminar - October 2014
 
Digital-Warriors-Marketing Roadmap with Big Data Analytics
Digital-Warriors-Marketing Roadmap with Big Data AnalyticsDigital-Warriors-Marketing Roadmap with Big Data Analytics
Digital-Warriors-Marketing Roadmap with Big Data Analytics
 
Conquering Your Content Management Challenges - DX Summit 2018
Conquering Your Content Management Challenges - DX Summit 2018Conquering Your Content Management Challenges - DX Summit 2018
Conquering Your Content Management Challenges - DX Summit 2018
 
Improving the Supply Chain Cycle Through E-commerce
Improving the Supply Chain Cycle Through E-commerceImproving the Supply Chain Cycle Through E-commerce
Improving the Supply Chain Cycle Through E-commerce
 
Transforming-UX-with-Content-Strategy
Transforming-UX-with-Content-StrategyTransforming-UX-with-Content-Strategy
Transforming-UX-with-Content-Strategy
 
Insight Hub
Insight HubInsight Hub
Insight Hub
 
Context Matters - Shifting from Reporting to Analysis
Context Matters - Shifting from Reporting to AnalysisContext Matters - Shifting from Reporting to Analysis
Context Matters - Shifting from Reporting to Analysis
 
Whitepaper - An Introductory Guide to Maximizing Channel Automation by Cnet ...
Whitepaper  - An Introductory Guide to Maximizing Channel Automation by Cnet ...Whitepaper  - An Introductory Guide to Maximizing Channel Automation by Cnet ...
Whitepaper - An Introductory Guide to Maximizing Channel Automation by Cnet ...
 
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
Talis Aspire Update - Keji Adedeji | Talis Insight Europe 2016
 
Introduction to Content Inventories and Audits
Introduction to Content Inventories and AuditsIntroduction to Content Inventories and Audits
Introduction to Content Inventories and Audits
 
All presentation SharePoint O365 and everything else
All presentation SharePoint O365 and everything else All presentation SharePoint O365 and everything else
All presentation SharePoint O365 and everything else
 
NISO Open Discovery Initiative January 2019
NISO Open Discovery Initiative January 2019NISO Open Discovery Initiative January 2019
NISO Open Discovery Initiative January 2019
 
Using analytics in ux design my view
Using analytics in ux design   my viewUsing analytics in ux design   my view
Using analytics in ux design my view
 

Mehr von Valentine Charles

Mehr von Valentine Charles (12)

Data quality in cultural heritage (meta)data
Data quality in cultural heritage (meta)dataData quality in cultural heritage (meta)data
Data quality in cultural heritage (meta)data
 
Building a Framework for Semantic Cultural Heritage Data
Building a Framework for Semantic Cultural Heritage DataBuilding a Framework for Semantic Cultural Heritage Data
Building a Framework for Semantic Cultural Heritage Data
 
When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...When Semantics support Multilingual Access to Digital Cultural Heritage - the...
When Semantics support Multilingual Access to Digital Cultural Heritage - the...
 
Linked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approachLinked Data for EuropeanaCultural Heritage: the Europeana approach
Linked Data for EuropeanaCultural Heritage: the Europeana approach
 
Europeana 1914-1918, User-Generated Content and Linked Open Data
Europeana 1914-1918, User-Generated Content and Linked Open DataEuropeana 1914-1918, User-Generated Content and Linked Open Data
Europeana 1914-1918, User-Generated Content and Linked Open Data
 
Achieving interoperability between CARARE schema for monuments and sites and ...
Achieving interoperability between CARARE schema for monuments and sites and ...Achieving interoperability between CARARE schema for monuments and sites and ...
Achieving interoperability between CARARE schema for monuments and sites and ...
 
Defining collections and creating their descriptions
Defining collections and creating their descriptionsDefining collections and creating their descriptions
Defining collections and creating their descriptions
 
Fondly Collisions: Archival hierarchy and the Europeana Data Model
Fondly Collisions: Archival hierarchy and the Europeana Data Model   Fondly Collisions: Archival hierarchy and the Europeana Data Model
Fondly Collisions: Archival hierarchy and the Europeana Data Model
 
Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...Links, languages and semantics: linked data approaches in The European Libra...
Links, languages and semantics: linked data approaches in The European Libra...
 
Discovering libraries's gold through collection-level descriptions
Discovering libraries's gold through collection-level descriptionsDiscovering libraries's gold through collection-level descriptions
Discovering libraries's gold through collection-level descriptions
 
Mapping cross-­domain metadata to the Europeana Data Model (EDM) - EDM introd...
Mapping cross-­domain metadata to the Europeana Data Model (EDM) - EDM introd...Mapping cross-­domain metadata to the Europeana Data Model (EDM) - EDM introd...
Mapping cross-­domain metadata to the Europeana Data Model (EDM) - EDM introd...
 
The GLAMwiki toolset project
The GLAMwiki toolset projectThe GLAMwiki toolset project
The GLAMwiki toolset project
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Kürzlich hochgeladen (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Evaluation of metadata enrichment practices in digital libraries:steps towards better data enrichments

  • 1. Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Valentine Charles and Juliane Stiller | SWIB 2015
  • 2. Semantic enrichment, a solution for better quality data? Automatic and manual enrichment are more and more commonly used in digital libraries to: • “standardize data” by linking it to authority resources • improve multilingual coverage in datasets • contextualise resource • And more Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments CC BY-SA
  • 3. Semantic enrichment, a solution for better quality data? Automatic and manual enrichment are more and more commonly used in digital libraries to: • “standardize data” by linking it to authority resources • improve multilingual coverage in datasets • contextualise resource • And more This growth is also accompanied by the adoption of semantic web technology in the cultural heritage! Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments CC BY-SA
  • 4. What do we mean by semantic enrichment ? (Semantic) enrichment can be described: • As a process that improves metadata about an object by adding statements about the object that this metadata describes • As a process e.g. the application of an enrichment tool • As a result e.g. the new metadata created at the end of the process Enrichment strategy refers to all workflows components and the processes which determined these components. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 5. What do we mean by semantic enrichment ? (Semantic) enrichment can be described: • As a process that improves metadata about an object by adding statements about the object that this metadata describes • As a process e.g. the application of an enrichment tool • As a result e.g. the new metadata created at the end of the process Enrichment strategy refers to all workflows components and the processes which determined these components. Co-referencing, alignment, contextualisation, annotation are types of enrichment ! CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 6. Europeana Semantic Enrichment - Example CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 7. An incorrect enrichment is worse than no enrichment at all ! France, Public Domain 1921, National Library of France Agence de presse Meurisse Colombes : championnats de France d’Athlétisme :
 rivière, le speaker
  • 8. Our starting point Incorrect enrichments lead to • Devaluation of curated metadata • Loss of trust from providers • Propagation of errors to different languages • Irrelevant search results • Bad user experiences CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 9. Our starting point Incorrect enrichments lead to • Devaluation of curated metadata • Loss of trust from providers • Propagation of errors to different languages • Irrelevant search results • Bad user experiences CC BY-SA Better understanding of impact of enrichments is needed ! Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 10. Comparative evaluation A group of 23 experts in the Cultural Heritage domain run a quantitative and qualitative evaluation: • An evaluation dataset contributed by The European Library • Enriched by 7 different tools or tool settings • 1757 enrichments were sampled and manually annotated • Goal was to determine the quality of the enrichments CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments
  • 11. Comparative evaluation CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Frequency of enriched records for each enrichment tool. Number of enrichments by enrichment tool.
  • 12. 10 recommendations for better semantic enrichment services Netherlands, Public Domain 1644, Rijksmuseum Adriaen van Utrecht Banquet Still Life
  • 13. Define your enrichment goals What problems are you trying to solve? What type of enrichment do you want to achieve? On which criteria will you evaluate the success of the enrichment ? What type of data do you want as input or output? CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 1
  • 14. Define your enrichment goals What problems are you trying to solve? What type of enrichment do you want to achieve? On which criteria will you evaluate the success of the enrichment ? What type of data do you want as input or output? CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Clear goals will help you to define your gold standard and guidelines for evaluating your enrichment 1
  • 15. Choose the right method… Choose enrichment methods and techniques that are adapted to the goals • For instance a service providing an interface might be better if the enrichment is manual CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 2 http:// cultuurlink.beeldengeluid.nl/ app/#/
  • 16. …and the right service Select a service that takes into account the semantics of your input data • enrichment service or tool recognizes a specific metadata schema (schema/semantics-aware). Select a service that provides output data in the desired representations • Enrichment services should provide appropriate representations of enrichments following similar patterns and standard. • They should also publish metadata for enrichment (e.g. provenance, confidence) The selection of the target dataset that the enrichment service will use is also an important criteria for obtaining good quality enrichment. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 2
  • 17. Define the enrichment workflow The workflow(s) supported by an enrichment service should be clearly identifiable so that an enricher: • Can act upon the different components of the enrichment (source, target, and application scenario) • Knows also the order in which to activate the steps. Rules for the enrichment need to be defined and documented. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 3
  • 18. Define the enrichment workflow The workflow(s) supported by an enrichment service should be clearly identifiable so that an enricher: • Can act upon the different components of the enrichment (source, target, and application scenario) • Knows also the order in which to activate the steps. Rules for the enrichment need to be defined and documented. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Don’ t forget user-initiated enrichment workflows ! 3
  • 19. Train the enricher The enricher should have sufficient knowledge of the enrichment tool and on the parts of the data affected by the enrichment service CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 4
  • 20. Train the enricher The enricher should have sufficient knowledge of the enrichment tool and on the parts of the data affected by the enrichment service CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments This knowledge is crucial for the assessments of the enrichments results and the potential enrichment errors which might occur ! 4
  • 21. Test your enrichment workflow Your enrichment service should be flexible enough to have adjustments made. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 5
  • 22. Assess the quality of your enrichment The quality needs to be assessed against the goals set for the enrichment. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 6
  • 23. Assess the quality of your enrichment The quality needs to be assessed against the goals set for the enrichment. CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Only if you know what the best possible outcome of the process is, you can evaluate how well the process worked ! 6
  • 24. Have an evaluation strategy and choose the right evaluation method Assessing the quality of your enrichment will require an evaluation strategy taking into account: • the correctness of the enrichments (intrinsic evaluation) • Factors such as the search performance, user-experience and relevance (extrinsic evaluation) Creation of Gold standards or annotated samples can be a good strategy but • Make sure the diversity of the dataset represent the diversity of the data • Make clear guidelines for raters and train them CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 7
  • 25. Have an evaluation strategy and choose the right evaluation method Assessing the quality of your enrichment will require an evaluation strategy taking into account: • the correctness of the enrichments (intrinsic evaluation) • Factors such as the search performance, user-experience and relevance (extrinsic evaluation) Creation of Gold standards or annotated samples can be a good strategy but • Make sure the diversity of the dataset represent the diversity of the data • Make clear guidelines for raters and train them CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Using a collaborative tool or computing inter-rater agreements can help the validation process. 7
  • 26. Apply user-initiated enrichment workflows Don’ t forget user-initiated enrichment workflows ! CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 8
  • 27. Apply user-initiated enrichment workflows Don’ t forget user-initiated enrichment workflows ! CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments 8
  • 28. Monitor your enrichment process and re- assess An enrichment process needs constant re-evaluation • Source data might have expanded with more information • The target dataset might be richer (new terms, new languages, more granular) • The workflow might have been improved CC BY-SA Evaluation of metadata enrichment practices in digital libraries: steps towards better data enrichments Document your enrichment process and learnings
 Documentation will allow the improvements of future enrichment strategies and the maintenance of an enrichment process 9 10
  • 29. All the details in the report at http://bit.ly/Evaluation-Enrichment Have a good read ! valentine.charles@europeana.eu Netherlands, CC BY-SA Anonymous Troupe Banola leur dernière création 1911, Circus Museum