PATHSenrich: A Web Service Prototype
for Automatic Cultural Heritage Item
Enrichment
Eneko Agirre, Ander Barrena, Kike Fernandez, Esther Miranda,
Arantxa Otegi, and Aitor Soroa
IXA NLP Group, University of the Basque Country UPV/EHU
arantza.otegi@ehu.es
Abstract. Large amounts of cultural heritage material are nowadays
available through online digital library portals. Most of these cultural
items have short descriptions and lack rich contextual information. The
PATHS project has developed experimental enrichment services. As a
proof of concept, this paper presents a web service prototype which allows
independent content providers to enrich cultural heritage items with a
subset of the full functionality: links to related items in the collection
and links to related Wikipedia articles. In the future we plan to provide
more advanced functionality, as available offline for PATHS.

1 Introduction

Large amounts of cultural heritage (CH) material are now available through
online digital library portals, such as Europeana¹. Europeana hosts millions of
books, paintings, films, museum objects and archival records that have been
digitised throughout Europe. Europeana collects contextual information or metadata
about different types of content, which users can use in their searches.
The main strength of Europeana lies in the vast number of items it contains.
Sometimes, though, this quantity comes at the cost of a restricted amount of
metadata, with many items having very short descriptions and a lack of rich
contextual information. One of the goals of the PATHS project² is precisely to
enrich CH items, using a selected subset of Europeana as a testbed [1].
Within the project, this enrichment will make it possible to create a system
that acts as an interactive personalised tour guide through Europeana collections,
offering suggestions about items to look at and assisting in their interpretation
by providing relevant contextual information from related items within
Europeana and items from external sources like Wikipedia. Users of such digital
libraries may require information for purposes such as learning and seeking
answers to questions. This additional information supports users in fulfilling their
information needs, as the evaluation of the first PATHS prototype shows [2].
In this paper we present a web service prototype which allows independent
content providers to enrich CH items. Specifically, the service enriches the items
with two types of information. On the one hand, the item will be linked to
similar items within the collection. On the other hand, the item will be linked
to Wikipedia articles which are related to it.

¹ http://www.europeana.eu/portal/
² http://www.paths-project.eu

T. Aalberg et al. (Eds.): TPDL 2013, LNCS 8092, pp. 462–465, 2013.
© Springer-Verlag Berlin Heidelberg 2013

There have been many attempts to automatically enrich cultural heritage
metadata. Some projects (for instance, MIMO-DB³ or MERLIN⁴) relate CH
objects to terms of an external authority or vocabulary. Some others (like
MACE⁵ or YUMA⁶) adopt a collaborative annotation paradigm for metadata
enrichment. To our knowledge, PATHS is the first project using semantic NLP
processing to link CH items to similar items or external Wikipedia articles.
The current service has limited bandwidth, and provides a selected subset
of the enrichment functionality available internally in the PATHS project. The
quality of the links produced is also slightly lower, although we plan to improve it
in the near future. However, we think that the prototype is useful to demonstrate
the potential to construct a web service for automatically enriching CH items
with high quality information.

2 Demo Description

The web service takes as input one CH item represented following the Europeana
Data Model (EDM) in JSON format, as exported by the Europeana API v2.0⁷ (a
sample record is provided in the interface). The web service returns the following:
– A list of 10 closely related items within the collection.
– A list of Wikipedia pages which are related to the target item.
Figure 1 shows a snapshot of the web service. The service is publicly accessible
at the URL http://ixa2.si.ehu.es/paths_wp2/paths_wp2.pl.
The enrichment is performed by analyzing the metadata associated with the
item, i.e., the title of the item, its description, etc. The next sections briefly
describe how this enrichment is performed.
2.1 Related Items within the Collection

The list of related items is obtained by first creating a query with the content
of the title, subject and description fields (stopwords are removed). The query
is then posted to a SOLR search engine⁸. The SOLR search engine accesses an
index created with the subset of Europeana items already enriched offline within
the PATHS project. In that way, the most related Europeana items in the subset
are obtained, and the identifiers of those related items are listed. Note that the
related items used internally in the PATHS project are produced using more
sophisticated methods. Please refer to [1] for further details.
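
The query construction described above can be sketched as follows. This is only an illustrative sketch, not the PATHS implementation: the flat field names in `item`, the stopword list, the Solr base URL and the core name `paths` are all assumptions made for the example, and real EDM JSON records are nested differently. The `q`, `rows`, `fl` and `wt` parameters are standard Solr query parameters.

```python
import re
from urllib.parse import urlencode

# Tiny illustrative stopword list (assumption; the paper does not give one).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "at", "to", "for", "with", "by", "is"}

# Hypothetical flat field names; real EDM JSON is structured differently.
QUERY_FIELDS = ("title", "subject", "description")


def build_query(item):
    """Concatenate the chosen fields and drop stopwords, keeping word order."""
    terms = []
    for field in QUERY_FIELDS:
        value = item.get(field, "")
        # A field may hold a single string or a list of strings.
        texts = value if isinstance(value, list) else [value]
        for text in texts:
            for token in re.findall(r"\w+", text.lower()):
                if token not in STOPWORDS:
                    terms.append(token)
    return " ".join(terms)


def build_solr_url(base_url, query, rows=10):
    """Build a request URL for Solr's /select handler, asking for item ids only."""
    params = {"q": query, "rows": rows, "fl": "id", "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


item = {
    "title": "Portrait of a Lady",
    "subject": ["painting", "portrait"],
    "description": "Oil painting in the collection of the museum.",
}
query = build_query(item)
# Hypothetical local Solr instance and core name.
url = build_solr_url("http://localhost:8983/solr/paths", query)
print(query)
print(url)
```

Setting `rows=10` mirrors the list of 10 related items that the service returns; restricting `fl` to `id` reflects that only the identifiers of the related items are listed.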

³ http://www.mimo-international.com
⁴ http://www.ucl.ac.uk/ls/merlin
⁵ http://www.mace-project.eu
⁶ http://dme.ait.ac.at/annotation
⁷ http://preview.europeana.eu/portal/api-introduction.html
⁸ http://lucene.apache.org/solr/

Fig. 1. Web service interface. It consists of a text area in which to enter the input item
in JSON format (top). The “Get EDM JSON example” button can be used to get an
input example. Once a JSON record has been entered, click the “Process” button to get
the output. The output (bottom) consists of a list of related items and background links.

2.2 Related Wikipedia Articles

For linking the items to Wikipedia articles we follow an implementation similar
to the method described in [3]. This method creates a dictionary, an association
between string mentions and all possible articles each mention can refer to. Our
dictionary is constructed using the titles of the Wikipedia articles, the redirect
pages, the disambiguation pages and the anchor texts from Wikipedia links.
Mentions are lower-cased and all text between parentheses is removed. If the
mention links to a disambiguation page, it is associated with all possible articles
the disambiguation page points to. In addition, each association between a mention
and an article is scored with the prior probability, estimated from the number of
times that the mention occurs in the anchor text of an article. Note that such
dictionaries can disambiguate any mention, just returning the highest-scoring
article for this particular mention.
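
A minimal sketch of such a dictionary is given below, under stated assumptions: the input is a hypothetical list of (mention, article) pairs standing in for titles, redirects and anchor texts, and the prior is taken to be the count for an article divided by the total count for the mention. The paper only states that the prior is estimated from occurrence counts, so the normalisation is an assumption of this sketch.

```python
import re
from collections import Counter, defaultdict


def normalize(mention):
    """Lower-case a mention and drop any text between parentheses."""
    without_parens = re.sub(r"\([^)]*\)", " ", mention)
    return " ".join(without_parens.lower().split())


def build_dictionary(pairs):
    """Map each normalized mention to {article: prior}."""
    counts = defaultdict(Counter)
    for mention, article in pairs:
        counts[normalize(mention)][article] += 1
    dictionary = {}
    for mention, articles in counts.items():
        total = sum(articles.values())
        # Assumed normalisation: relative frequency of each article for the mention.
        dictionary[mention] = {a: c / total for a, c in articles.items()}
    return dictionary


def best_article(dictionary, mention):
    """Return the highest-scoring article for a mention, or None if unknown."""
    entry = dictionary.get(normalize(mention))
    if not entry:
        return None
    return max(entry, key=entry.get)


# Toy data (invented for illustration only).
pairs = [
    ("Mercury (planet)", "Mercury (planet)"),   # article title
    ("Mercury", "Mercury (planet)"),            # anchor texts
    ("Mercury", "Mercury (planet)"),
    ("Mercury", "Mercury (planet)"),
    ("mercury", "Mercury (element)"),
]
dictionary = build_dictionary(pairs)
print(dictionary["mercury"])
print(best_article(dictionary, "Mercury"))
```

Because the lookup ignores context, every occurrence of a mention resolves to the same article; this is the limitation that context-aware linking is meant to address.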
Once the dictionary is built, the web service analyzes the title, subject and
description fields of the CH item and matches the longest substring within those
fields with entries in the dictionary. When a match is found, the Wikipedia article
with the highest score for this entry is returned. Note that the links to Wikipedia
in the PATHS project are produced using more sophisticated methods. Please
refer to [1] for further details.
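
The matching step can be illustrated with a greedy, left-to-right longest match over tokens. This is a sketch under assumptions rather than the service's actual code: the toy dictionary, the `max_len` window and the word-level tokenisation are all hypothetical choices made for the example.

```python
import re


def link_text(text, dictionary, max_len=5):
    """Greedy longest-match linking: at each position, try the longest
    token span first and fall back to shorter spans."""
    tokens = re.findall(r"\w+", text.lower())
    links = []
    i = 0
    while i < len(tokens):
        matched = 0
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            entry = dictionary.get(candidate)
            if entry:
                # Return the article with the highest score for this entry.
                links.append((candidate, max(entry, key=entry.get)))
                matched = n
                break
        # Skip past the matched span, or advance one token if nothing matched.
        i += matched if matched else 1
    return links


# Toy dictionary (invented for illustration): mention -> {article: score}.
toy_dictionary = {
    "york": {"York": 1.0},
    "new york": {"New York City": 0.7, "New York (state)": 0.3},
    "bridge": {"Bridge": 1.0},
}
links = link_text("View of a bridge in New York", toy_dictionary)
print(links)
```

Preferring the longest span means that "new york" is linked as a unit instead of the shorter entry "york" being matched on its own.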

3 Conclusions and Future Work

This paper presents a web service prototype which automatically enriches CH
items with metadata. The web service is inspired by the enrichment work carried
out in the PATHS project, but, contrary to the batch methodology used in the
project, this enrichment is performed online. The prototype has been designed
for demonstration purposes, to showcase the feasibility of providing full-fledged
automatic enrichment.
Our plans for the future include moving the offline enrichment services which
are currently being evaluated in the PATHS project to the web service. In the
case of related Wikipedia articles, we will take into account the context of the
matched entities, which improves the quality of the links [4], and we will include
a filtering algorithm to discard entities that are not relevant. Regarding related
items, we will classify them according to the type of relation [5]. In addition, we
plan to automatically organize the items hierarchically, according to a Wikipedia-based vocabulary [6].
Acknowledgements. The research leading to these results was carried out as
part of the PATHS project (http://www.paths-project.eu) funded by the European Community's Seventh Framework Programme (FP7/2007-2013) under
grant agreement no. 270082. The work has also been funded by the Basque
Government (project IBILBIDE, SAIOTEK S-PE12UN089).

References
1. Otegi, A., Agirre, E., Soroa, A., Aletras, N., Chandrinos, C., Fernando, S., Gonzalez-Agirre, A.: Report accompanying D2.2: Processing and Representation of Content
for Second Prototype. PATHS Project Deliverable (2012),
http://www.paths-project.eu/eng/content/download/2489/18113/version/2/file/D2.2.Content+Processing-2nd+Prototype-revised.v2.pdf
2. Griffiths, J., Goodale, P., Minelli, S., de Polo, A., Agerri, R., Soroa, A., Hall, M.,
Bergheim, S.R., Chandrinos, K., Chryssochoidis, G., Fernie, K., Usher, T.: D5.1:
Evaluation of the first PATHS prototype. PATHS Project Deliverable (2012),
http://www.paths-project.eu/eng/Resources/D5.1-Evaluation-of-the-1st-PATHS-Prototype
3. Chang, A.X., Spitkovsky, V.I., Yeh, E., Agirre, E., Manning, C.D.: Stanford-UBC
entity linking at TAC-KBP. In: Proceedings of TAC 2010, Gaithersburg, Maryland,
USA (2010)
4. Han, X., Sun, L.: A Generative Entity-Mention Model for Linking Entities with
Knowledge Base. In: Proceedings of the ACL, Portland, Oregon, USA (2011)
5. Agirre, E., Aletras, N., Gonzalez-Agirre, A., Rigau, G., Stevenson, M.: UBC_UOS-TYPED: Regression for typed-similarity. In: Second Joint Conference on Lexical
and Computational Semantics (*SEM), Atlanta, Georgia, USA (2013)
6. Fernando, S., Hall, M., Agirre, E., Soroa, A., Clough, P., Stevenson, M.: Comparing Taxonomies for Organising Collections of Documents. In: Proceedings of
COLING 2012, Mumbai, India (2012)

Weitere ähnliche Inhalte

Was ist angesagt?

Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCSCJournals
 
Object models and object representation
Object models and object representationObject models and object representation
Object models and object representationJulie Allinson
 
Annotations chicago
Annotations chicagoAnnotations chicago
Annotations chicagoTimothy Cole
 
Linked Data as a new environment for Learning Analytics and education
Linked Data as a new environment  for Learning Analytics and educationLinked Data as a new environment  for Learning Analytics and education
Linked Data as a new environment for Learning Analytics and educationMathieu d'Aquin
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
Current metadata landscape in the library world (Getaneh Alemu)
Current metadata landscape in the library world (Getaneh Alemu)Current metadata landscape in the library world (Getaneh Alemu)
Current metadata landscape in the library world (Getaneh Alemu)Getaneh Alemu
 
Linked Data in Libraries
Linked Data in LibrariesLinked Data in Libraries
Linked Data in LibrariesCarl Hess
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
 
Metadata enriching and filtering for enhanced collection discoverability
Metadata enriching and filtering for enhanced collection discoverability  Metadata enriching and filtering for enhanced collection discoverability
Metadata enriching and filtering for enhanced collection discoverability Getaneh Alemu
 
Big Linked Data - Creating Training Curricula
Big Linked Data - Creating Training CurriculaBig Linked Data - Creating Training Curricula
Big Linked Data - Creating Training CurriculaEUCLID project
 
CS6010 Social Network Analysis Unit II
CS6010 Social Network Analysis   Unit IICS6010 Social Network Analysis   Unit II
CS6010 Social Network Analysis Unit IIpkaviya
 
Metadata enriching and discovery at Solent University Library
Metadata enriching and discovery at Solent University Library Metadata enriching and discovery at Solent University Library
Metadata enriching and discovery at Solent University Library Getaneh Alemu
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked DataEUCLID project
 
Metadata for digital humanities
Metadata for digital humanities Metadata for digital humanities
Metadata for digital humanities Getaneh Alemu
 
Building Linked Data Applications
Building Linked Data ApplicationsBuilding Linked Data Applications
Building Linked Data ApplicationsEUCLID project
 
Linked Data for African Libraries
Linked Data for African LibrariesLinked Data for African Libraries
Linked Data for African LibrariesGetaneh Alemu
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...Armin Haller
 

Was ist angesagt? (20)

Cluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector MachineCluster Based Web Search Using Support Vector Machine
Cluster Based Web Search Using Support Vector Machine
 
Ji cv6n2
Ji cv6n2Ji cv6n2
Ji cv6n2
 
Object models and object representation
Object models and object representationObject models and object representation
Object models and object representation
 
Annotations chicago
Annotations chicagoAnnotations chicago
Annotations chicago
 
Linked Data as a new environment for Learning Analytics and education
Linked Data as a new environment  for Learning Analytics and educationLinked Data as a new environment  for Learning Analytics and education
Linked Data as a new environment for Learning Analytics and education
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & Museums
 
LIBRIS - Linked Library Data
LIBRIS - Linked Library DataLIBRIS - Linked Library Data
LIBRIS - Linked Library Data
 
Current metadata landscape in the library world (Getaneh Alemu)
Current metadata landscape in the library world (Getaneh Alemu)Current metadata landscape in the library world (Getaneh Alemu)
Current metadata landscape in the library world (Getaneh Alemu)
 
Linked Data in Libraries
Linked Data in LibrariesLinked Data in Libraries
Linked Data in Libraries
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
Metadata enriching and filtering for enhanced collection discoverability
Metadata enriching and filtering for enhanced collection discoverability  Metadata enriching and filtering for enhanced collection discoverability
Metadata enriching and filtering for enhanced collection discoverability
 
Big Linked Data - Creating Training Curricula
Big Linked Data - Creating Training CurriculaBig Linked Data - Creating Training Curricula
Big Linked Data - Creating Training Curricula
 
CS6010 Social Network Analysis Unit II
CS6010 Social Network Analysis   Unit IICS6010 Social Network Analysis   Unit II
CS6010 Social Network Analysis Unit II
 
Metadata enriching and discovery at Solent University Library
Metadata enriching and discovery at Solent University Library Metadata enriching and discovery at Solent University Library
Metadata enriching and discovery at Solent University Library
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked Data
 
Metadata for digital humanities
Metadata for digital humanities Metadata for digital humanities
Metadata for digital humanities
 
Building Linked Data Applications
Building Linked Data ApplicationsBuilding Linked Data Applications
Building Linked Data Applications
 
Linked library data
Linked library dataLinked library data
Linked library data
 
Linked Data for African Libraries
Linked Data for African LibrariesLinked Data for African Libraries
Linked Data for African Libraries
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
 

Andere mochten auch

Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и России
Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и РоссииВеличко М.В. (2014.02.26) — О Майдане и перспективах Украины и России
Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и Россииmediamera
 
презентация:)
презентация:)презентация:)
презентация:)ILgizmironov
 
Exchange In-Place eDiscovery & Hold | Introduction | 5#7
Exchange In-Place eDiscovery & Hold | Introduction  | 5#7Exchange In-Place eDiscovery & Hold | Introduction  | 5#7
Exchange In-Place eDiscovery & Hold | Introduction | 5#7Eyal Doron
 
The old exchange environment versus modern exchange environment part 02#36
The old exchange environment versus modern exchange environment  part 02#36The old exchange environment versus modern exchange environment  part 02#36
The old exchange environment versus modern exchange environment part 02#36Eyal Doron
 
My E-mail appears as spam - Troubleshooting path | Part 11#17
My E-mail appears as spam - Troubleshooting path | Part 11#17My E-mail appears as spam - Troubleshooting path | Part 11#17
My E-mail appears as spam - Troubleshooting path | Part 11#17Eyal Doron
 
Feg chapter 04 - present perfect azar
Feg chapter 04 - present perfect azarFeg chapter 04 - present perfect azar
Feg chapter 04 - present perfect azarmacbridesmith
 
How does sender verification work how we identify spoof mail) spf, dkim dmar...
How does sender verification work  how we identify spoof mail) spf, dkim dmar...How does sender verification work  how we identify spoof mail) spf, dkim dmar...
How does sender verification work how we identify spoof mail) spf, dkim dmar...Eyal Doron
 
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...designforchangechallenge
 
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rules
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic RulesIND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rules
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rulesdesignforchangechallenge
 

Andere mochten auch (11)

Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и России
Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и РоссииВеличко М.В. (2014.02.26) — О Майдане и перспективах Украины и России
Величко М.В. (2014.02.26) — О Майдане и перспективах Украины и России
 
TAM-2012-07 R C PS Malayakulam -
TAM-2012-07 R C PS Malayakulam -TAM-2012-07 R C PS Malayakulam -
TAM-2012-07 R C PS Malayakulam -
 
GUJ-2012-12 Fazalpur Prathmik Shala No 1
GUJ-2012-12 Fazalpur Prathmik Shala No 1 GUJ-2012-12 Fazalpur Prathmik Shala No 1
GUJ-2012-12 Fazalpur Prathmik Shala No 1
 
презентация:)
презентация:)презентация:)
презентация:)
 
Exchange In-Place eDiscovery & Hold | Introduction | 5#7
Exchange In-Place eDiscovery & Hold | Introduction  | 5#7Exchange In-Place eDiscovery & Hold | Introduction  | 5#7
Exchange In-Place eDiscovery & Hold | Introduction | 5#7
 
The old exchange environment versus modern exchange environment part 02#36
The old exchange environment versus modern exchange environment  part 02#36The old exchange environment versus modern exchange environment  part 02#36
The old exchange environment versus modern exchange environment part 02#36
 
My E-mail appears as spam - Troubleshooting path | Part 11#17
My E-mail appears as spam - Troubleshooting path | Part 11#17My E-mail appears as spam - Troubleshooting path | Part 11#17
My E-mail appears as spam - Troubleshooting path | Part 11#17
 
Feg chapter 04 - present perfect azar
Feg chapter 04 - present perfect azarFeg chapter 04 - present perfect azar
Feg chapter 04 - present perfect azar
 
How does sender verification work how we identify spoof mail) spf, dkim dmar...
How does sender verification work  how we identify spoof mail) spf, dkim dmar...How does sender verification work  how we identify spoof mail) spf, dkim dmar...
How does sender verification work how we identify spoof mail) spf, dkim dmar...
 
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...
IND-2012-255 PUPS Subramaniapuram, Tenkasi -Pioneer to save Earthworm and use...
 
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rules
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic RulesIND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rules
IND-2012-300 Mother's Pet Kindergarten Nagpur - A U trurn for traffic Rules
 

Ähnlich wie PATHSenrich: A Web Service Prototype for Automatic Cultural Heritage Item Enrichment, @TPDL 2013

Of Cataloging & Context
Of Cataloging & ContextOf Cataloging & Context
Of Cataloging & Contextcharper
 
Recommendations for the automatic enrichment of digital library content using...
Recommendations for the automatic enrichment of digital library content using...Recommendations for the automatic enrichment of digital library content using...
Recommendations for the automatic enrichment of digital library content using...pathsproject
 
EuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital HeritageEuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital HeritageMax Kaiser
 
Portrait Of Europeana As An Api
Portrait Of Europeana As An ApiPortrait Of Europeana As An Api
Portrait Of Europeana As An ApiEuropeana
 
EuropeanaLocal: what’s it all about?
EuropeanaLocal: what’s it all about?EuropeanaLocal: what’s it all about?
EuropeanaLocal: what’s it all about?EuropeanaLocal Project
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsVladimir Alexiev, PhD, PMP
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenVladimir Alexiev, PhD, PMP
 
77. newsletter d andrea2012
77. newsletter d andrea201277. newsletter d andrea2012
77. newsletter d andrea2012Andrea D'Andrea
 
Europeana Connect All-Staff Meeting
Europeana Connect All-Staff MeetingEuropeana Connect All-Staff Meeting
Europeana Connect All-Staff MeetingEuropeanaConnect
 
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...Nick Jankowski
 
LoCloud - D3.3: Metadata Enrichment services
LoCloud - D3.3: Metadata Enrichment servicesLoCloud - D3.3: Metadata Enrichment services
LoCloud - D3.3: Metadata Enrichment serviceslocloud
 
Europeana vision - Web as Literature 2013
Europeana vision - Web as Literature 2013Europeana vision - Web as Literature 2013
Europeana vision - Web as Literature 2013Antoine Isaac
 
Case Study: Europeana API Implementation in Polish Digital Libraries
Case Study: Europeana API Implementation in Polish Digital LibrariesCase Study: Europeana API Implementation in Polish Digital Libraries
Case Study: Europeana API Implementation in Polish Digital LibrariesNeil Bates
 
Lodlam presentation v1.0 final al20151104
Lodlam presentation v1.0 final al20151104Lodlam presentation v1.0 final al20151104
Lodlam presentation v1.0 final al20151104Asa Letourneau
 
Documents, services, and data on the web
Documents, services, and data on the webDocuments, services, and data on the web
Documents, services, and data on the webChiara Del Vescovo
 
Institutional Services and Tools for Content, Metadata and IPR Management
Institutional Services and Tools for Content, Metadata and IPR ManagementInstitutional Services and Tools for Content, Metadata and IPR Management
Institutional Services and Tools for Content, Metadata and IPR ManagementPaolo Nesi
 
AAC Education Session
AAC Education Session AAC Education Session
AAC Education Session Antoine Isaac
 

Ähnlich wie PATHSenrich: A Web Service Prototype for Automatic Cultural Heritage Item Enrichment, @TPDL 2013 (20)

Of Cataloging & Context
Of Cataloging & ContextOf Cataloging & Context
Of Cataloging & Context
 
Recommendations for the automatic enrichment of digital library content using...
Recommendations for the automatic enrichment of digital library content using...Recommendations for the automatic enrichment of digital library content using...
Recommendations for the automatic enrichment of digital library content using...
 
The European Portal for documents and Archives: the APEnet Project
The European Portal for documents and Archives: the APEnet ProjectThe European Portal for documents and Archives: the APEnet Project
The European Portal for documents and Archives: the APEnet Project
 
EuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital HeritageEuropeanaConnect - Enhancing User Access to European Digital Heritage
EuropeanaConnect - Enhancing User Access to European Digital Heritage
 
Portrait Of Europeana As An Api
Portrait Of Europeana As An ApiPortrait Of Europeana As An Api
Portrait Of Europeana As An Api
 
EuropeanaLocal: what’s it all about?
EuropeanaLocal: what’s it all about?EuropeanaLocal: what’s it all about?
EuropeanaLocal: what’s it all about?
 
Europeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom ViewsEuropeana Creative. EDM Endpoint. Custom Views
Europeana Creative. EDM Endpoint. Custom Views
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
 
77. newsletter d andrea2012
77. newsletter d andrea201277. newsletter d andrea2012
77. newsletter d andrea2012
 
Europeana Connect All-Staff Meeting
Europeana Connect All-Staff MeetingEuropeana Connect All-Staff Meeting
Europeana Connect All-Staff Meeting
 
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...
Enhancing scholarly publishing, jankowski, tatum, tatum, & scharnhorst, pkp c...
 
LoCloud - D3.3: Metadata Enrichment services
LoCloud - D3.3: Metadata Enrichment servicesLoCloud - D3.3: Metadata Enrichment services
LoCloud - D3.3: Metadata Enrichment services
 
Europeana vision - Web as Literature 2013
Europeana vision - Web as Literature 2013Europeana vision - Web as Literature 2013
Europeana vision - Web as Literature 2013
 
Europeana and Researchers
Europeana and ResearchersEuropeana and Researchers
Europeana and Researchers
 
Case Study: Europeana API Implementation in Polish Digital Libraries
Case Study: Europeana API Implementation in Polish Digital LibrariesCase Study: Europeana API Implementation in Polish Digital Libraries
Case Study: Europeana API Implementation in Polish Digital Libraries
 
Lodlam presentation v1.0 final al20151104
Lodlam presentation v1.0 final al20151104Lodlam presentation v1.0 final al20151104
Lodlam presentation v1.0 final al20151104
 
Documents, services, and data on the web
Documents, services, and data on the webDocuments, services, and data on the web
Documents, services, and data on the web
 
Citizen Science Open Data
Citizen Science Open DataCitizen Science Open Data
Citizen Science Open Data
 
Institutional Services and Tools for Content, Metadata and IPR Management
Institutional Services and Tools for Content, Metadata and IPR ManagementInstitutional Services and Tools for Content, Metadata and IPR Management
Institutional Services and Tools for Content, Metadata and IPR Management
 
AAC Education Session
AAC Education Session AAC Education Session
AAC Education Session
 

Mehr von pathsproject

Generating Paths through Cultural Heritage Collections Latech2013 paper




PATHSenrich: A Web Service Prototype for Automatic Cultural Heritage Item Enrichment, @TPDL 2013

  • 1. PATHSenrich: A Web Service Prototype for Automatic Cultural Heritage Item Enrichment

Eneko Agirre, Ander Barrena, Kike Fernandez, Esther Miranda, Arantxa Otegi, and Aitor Soroa
IXA NLP Group, University of the Basque Country UPV/EHU
arantza.otegi@ehu.es

Abstract. Large amounts of cultural heritage material are nowadays available through online digital library portals. Most of these cultural items have short descriptions and lack rich contextual information. The PATHS project has developed experimental enrichment services. As a proof of concept, this paper presents a web service prototype which allows independent content providers to enrich cultural heritage items with a subset of the full functionality: links to related items in the collection and links to related Wikipedia articles. In the future we plan to provide more advanced functionality, as available offline for PATHS.

1 Introduction

Large amounts of cultural heritage (CH) material are now available through online digital library portals, such as Europeana¹. Europeana hosts millions of books, paintings, films, museum objects and archival records that have been digitised throughout Europe. Europeana collects contextual information or metadata about different types of content, which users can use for their searches.

The main strength of Europeana lies in the vast number of items it contains. Sometimes, though, this quantity comes at the cost of a restricted amount of metadata, with many items having very short descriptions and a lack of rich contextual information. One of the goals of the PATHS project² is precisely to enrich CH items, using a selected subset of Europeana as a testbed [1].
Within the project, this enrichment will make it possible to create a system that acts as an interactive personalised tour guide through Europeana collections, offering suggestions about items to look at and assisting in their interpretation by providing relevant contextual information from related items within Europeana and items from external sources like Wikipedia. Users of such digital libraries may require information for purposes such as learning and seeking answers to questions. This additional information supports users in fulfilling their information need, as the evaluation of the first PATHS prototype shows [2].

In this paper we present a web service prototype which allows independent content providers to enrich CH items. Specifically, the service enriches the items with two types of information.

¹ http://www.europeana.eu/portal/
² http://www.paths-project.eu

T. Aalberg et al. (Eds.): TPDL 2013, LNCS 8092, pp. 462–465, 2013. © Springer-Verlag Berlin Heidelberg 2013

  • 2. On the one hand, the item will be linked to similar items within the collection. On the other hand, the item will be linked to Wikipedia articles which are related to it.

There have been many attempts to automatically enrich cultural heritage metadata. Some projects (for instance, MIMO-DB³ or MERLIN⁴) relate CH objects to terms of an external authority or vocabulary. Others (like MACE⁵ or YUMA⁶) adopt a collaborative annotation paradigm for metadata enrichment. To our knowledge, PATHS is the first project using semantic NLP processing to link CH items to similar items or external Wikipedia articles.

The current service has limited bandwidth, and provides a selected subset of the enrichment functionality available internally in the PATHS project. The quality of the links produced is also slightly lower, although we plan to improve it in the near future. However, we think that the prototype is useful to demonstrate the potential to construct a web service for automatically enriching CH items with high quality information.

2 Demo Description

The web service takes as input one CH item represented following the Europeana Data Model (EDM) in JSON format, as exported by the Europeana API v2.0⁷ (a sample record is provided in the interface). The web service returns the following:

– A list of 10 closely related items within the collection.
– A list of Wikipedia pages which are related to the target item.

Figure 1 shows a snapshot of the web service. The service is publicly accessible at the URL http://ixa2.si.ehu.es/paths_wp2/paths_wp2.pl. The enrichment is performed by analyzing the metadata associated with the item, i.e., the title of the item, its description, etc. The next sections briefly describe how this enrichment is performed.
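As a rough illustration of the first step, reading the title, subject and description out of an input record, the following Python sketch may help. It is not the service's code. The record layout (`object`, `proxies`, `dcTitle`, `dcSubject`, `dcDescription`, with each field mapping a language tag to a list of strings) is an assumption modelled loosely on Europeana JSON exports, and `SAMPLE_RECORD` and `collect_metadata` are hypothetical names introduced here.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical, heavily simplified record; a real export carries many more fields.
SAMPLE_RECORD: Dict[str, Any] = {
    "object": {
        "proxies": [
            {
                "dcTitle": {"def": ["Roman coin"]},
                "dcSubject": {"def": ["Numismatics", "Roman Empire"]},
                "dcDescription": {"en": ["Silver denarius found near York."]},
            }
        ]
    }
}

# Assumed field names for title, subject and description.
FIELDS = ("dcTitle", "dcSubject", "dcDescription")


def collect_metadata(
    record: Dict[str, Any], fields: Iterable[str] = FIELDS
) -> Dict[str, List[str]]:
    """Gather the text values of the selected fields from every proxy."""
    collected: Dict[str, List[str]] = {name: [] for name in fields}
    for proxy in record.get("object", {}).get("proxies", []):
        for name in collected:
            # Each field is assumed to map a language tag to a list of strings.
            for values in proxy.get(name, {}).values():
                collected[name].extend(values)
    return collected


metadata = collect_metadata(SAMPLE_RECORD)
```

Missing fields simply yield empty lists, so a sparse record (the common case the paper describes) does not need special handling.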
2.1 Related Items within the Collection

The list of related items is obtained by first creating a query with the content of the title, subject and description fields (stopwords are removed). The query is then posted to a SOLR search engine⁸. The SOLR search engine accesses an index created with the subset of Europeana items already enriched offline within the PATHS project. In that way, the most closely related Europeana items in the subset are obtained, and the identifiers of those related items are listed.

Note that the related items used internally in the PATHS project are produced using more sophisticated methods. Please refer to [1] for further details.

³ http://www.mimo-international.com
⁴ http://www.ucl.ac.uk/ls/merlin
⁵ http://www.mace-project.eu
⁶ http://dme.ait.ac.at/annotation
⁷ http://preview.europeana.eu/portal/api-introduction.html
⁸ http://lucene.apache.org/solr/
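The query construction described in Section 2.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the PATHS implementation: the stopword list, the base URL `http://localhost:8983/solr/paths` and the returned field `id` are placeholders, and the paper does not say which query parser or index fields the service uses. Only the standard Solr select parameters `q`, `rows`, `fl` and `wt` are relied upon, and the request is built but not sent.

```python
import re
from typing import Dict, Iterable, List
from urllib.parse import urlencode

# Tiny illustrative stopword list; the list used by the service is not given.
STOPWORDS = frozenset({"a", "an", "and", "at", "in", "of", "on", "the", "to", "with"})


def build_query(texts: Iterable[str], stopwords: frozenset = STOPWORDS) -> str:
    """Join the non-stopword tokens of all metadata fields into one query string."""
    terms: List[str] = []
    for text in texts:
        for token in re.findall(r"\w+", text.lower()):
            if token not in stopwords:
                terms.append(token)
    return " ".join(terms)


def solr_select_url(base_url: str, query: str, rows: int = 10) -> str:
    """Build a URL for Solr's select handler, asking for the top `rows` ids."""
    params: Dict[str, str] = {"q": query, "rows": str(rows), "fl": "id", "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Title, subject and description of a made-up item.
query = build_query(
    ["Roman coin", "Numismatics", "Silver denarius found in the north of York"]
)
# Placeholder core URL; the real index location is internal to the service.
url = solr_select_url("http://localhost:8983/solr/paths", query)
```

Setting `rows` to 10 mirrors the 10 related items the service returns; ranking is left entirely to the search engine's default scoring.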
  • 3. Fig. 1. Web service interface. It consists of a text area to introduce the input item in JSON format (top). The “Get EDM JSON example” button can be used to get an input example. Once a JSON record is typed, click the “Process” button to get the output. The output (bottom) consists of a list of related items and background links.

2.2 Related Wikipedia Articles

For linking the items to Wikipedia articles we follow an implementation similar to the method described in [3]. This method creates a dictionary that associates each string mention with all the articles the mention can refer to. Our dictionary is constructed using the titles of the Wikipedia articles, the redirect pages, the disambiguation pages and the anchor texts from Wikipedia links. Mentions are lower-cased and all text between parentheses is removed. If the mention links to a disambiguation page, it is associated with all the articles the disambiguation page points to. In addition, each association between a mention and an article is scored with the prior probability, estimated from the number of times that the mention occurs in the anchor text of links to that article. Note that such a dictionary can disambiguate any mention simply by returning the highest-scoring article for that mention.

Once the dictionary is built, the web service analyzes the title, subject and description fields of the CH item and matches the longest substrings within those fields against entries in the dictionary. When a match is found, the Wikipedia article with the highest score for this entry is returned.

Note that the links to Wikipedia in the PATHS project are produced using more sophisticated methods. Please refer to [1] for further details.
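The dictionary lookup described in Section 2.2 can be illustrated with a small Python sketch. It is a toy under stated assumptions rather than the method of [3] or the service's code: the (mention, article) pairs are invented, the prior is computed here as a relative frequency of the counts, matching is a greedy left-to-right longest match over word tokens, and the `max_len` cap on mention length is a hypothetical parameter.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize(mention: str) -> str:
    """Lower-case a mention, drop parenthesised text, keep word tokens only."""
    without_parens = re.sub(r"\([^)]*\)", " ", mention.lower())
    return " ".join(re.findall(r"\w+", without_parens))


def build_dictionary(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Map each normalized mention to its candidate articles with prior scores."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for mention, article in pairs:
        counts[normalize(mention)][article] += 1
    return {
        mention: {article: n / sum(articles.values()) for article, n in articles.items()}
        for mention, articles in counts.items()
    }


def link(text: str, dictionary: Dict[str, Dict[str, float]], max_len: int = 5) -> List[str]:
    """Greedy longest match; return the highest-scoring article per match."""
    tokens = re.findall(r"\w+", text.lower())
    linked: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest span starting at token i first, then shorter ones.
        for size in range(min(max_len, len(tokens) - i), 0, -1):
            candidates = dictionary.get(" ".join(tokens[i:i + size]))
            if candidates:
                linked.append(max(candidates, key=candidates.get))
                i += size
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return linked


# Invented (mention, article) pairs standing in for titles, redirects and anchors.
PAIRS = [
    ("York", "York"), ("York", "York"), ("York", "New York City"),
    ("York Minster", "York Minster"),
    ("Mercury (mythology)", "Mercury (mythology)"),
    ("Mercury", "Mercury (planet)"), ("Mercury", "Mercury (planet)"),
]
DICTIONARY = build_dictionary(PAIRS)
```

With this toy data, `link("Window of York Minster", DICTIONARY)` prefers the two-word entry over the shorter "york", and an ambiguous mention such as "Mercury" resolves to whichever article has the larger prior, with no use of context.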
  • 4. 3 Conclusions and Future Work

This paper presents a web service prototype which automatically enriches CH items with metadata. The web service is inspired by the enrichment work carried out in the PATHS project but, contrary to the batch methodology used in the project, this enrichment is performed online. The prototype has been designed for demonstration purposes, to showcase the feasibility of providing full-fledged automatic enrichment.

Our plans for the future include moving the offline enrichment services which are currently being evaluated in the PATHS project to the web service. In the case of related Wikipedia articles, we will take into account the context of the matched entities, which improves the quality of the links [4], and we will include a filtering algorithm to discard entities that are not relevant. Regarding related items, we will classify them according to the type of relation [5]. In addition, we plan to automatically organize the items hierarchically, according to a Wikipedia-based vocabulary [6].

Acknowledgements. The research leading to these results was carried out as part of the PATHS project (http://www.paths-project.eu) funded by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 270082. The work has also been funded by the Basque Government (project IBILBIDE, SAIOTEK S-PE12UN089).

References

1. Otegi, A., Agirre, E., Soroa, A., Aletras, N., Chandrinos, C., Fernando, S., Gonzalez-Agirre, A.: Report accompanying D2.2: Processing and Representation of Content for Second Prototype. PATHS Project Deliverable (2012), http://www.paths-project.eu/eng/content/download/2489/18113/version/2/file/D2.2.Content+Processing-2nd+Prototype-revised.v2.pdf
2. Griffiths, J., Goodale, P., Minelli, S., de Polo, A., Agerri, R., Soroa, A., Hall, M., Bergheim, S.R., Chandrinos, K., Chryssochoidis, G., Fernie, K., Usher, T.: D5.1: Evaluation of the first PATHS prototype. PATHS Project Deliverable (2012), http://www.paths-project.eu/eng/Resources/D5.1-Evaluation-of-the-1st-PATHS-Prototype
3. Chang, A.X., Spitkovsky, V.I., Yeh, E., Agirre, E., Manning, C.D.: Stanford-UBC entity linking at TAC-KBP. In: Proceedings of TAC 2010, Gaithersburg, Maryland, USA (2010)
4. Han, X., Sun, L.: A Generative Entity-Mention Model for Linking Entities with Knowledge Base. In: Proceedings of the ACL, Portland, Oregon, USA (2011)
5. Agirre, E., Aletras, N., Gonzalez-Agirre, A., Rigau, G., Stevenson, M.: UBC_UOS-TYPED: Regression for typed-similarity. In: Second Joint Conference on Lexical and Computational Semantics (*SEM), Atlanta, Georgia, USA (2013)
6. Fernando, S., Hall, M., Agirre, E., Soroa, A., Clough, P., Stevenson, M.: Comparing Taxonomies for Organising Collections of Documents. In: Proceedings of COLING 2012, Mumbai, India (2013)