Crowdsourcing and Semantic Enrichments for European Cultural Heritage, by Sergiu Gordea, Michela Vignoli and Roman Graf (Austrian Institute of Technology) - 27 September 2016
1. AIT Austrian Institute of Technology
Crowdsourcing and Semantic Enrichments for European Cultural Heritage
Sergiu Gordea, Michela Vignoli and Roman Graf
CAIRA@KI 2016, Klagenfurt am Wörthersee, 27 September 2016
2. Agenda
• Europeana Digital Service Infrastructure
• Search and browsing in a multilingual cultural heritage repository
• Semantic enrichments in Thematic Collections
• Crowdsourcing semantic enrichments in Europeana Sounds
• Vocabulary/Thesauri alignment approach
• Experimental Results
• Vocabulary alignment for music instruments
• Using categorization information
• Using text search
• Conclusions and future work
Europeanasounds.eu
3. Europeana Digital Service Infrastructure
The Platform for Europe’s Digital Cultural Heritage
Aggregates metadata:
• From all EU countries
• ~3,500 galleries, libraries, archives and museums
• More than 52M objects
• In about 50 languages
• Huge number of references to places, agents, concepts, time
Source: [Manguinhas et al. 1]
5. Europeana Digital Service Infrastructure
Free text search: Klavierkonzert [piano concerto] (by Austrian composers?)
6. Semantic search
Query: Piano concertos by Austrian composers
Music instrument:
Piano
Music genre:
Concerto
Agent:
*
Role:
Composer
Place:
Austria
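The facet breakdown above can be sketched as a small structured query. This is only an illustration of the idea, not the Europeana search API: the `SemanticQuery` class, its field names and the two sample records are hypothetical, and the `*` wildcard for the agent is modelled as `None`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SemanticQuery:
    """Facets taken from the slide; None stands for the '*' wildcard."""
    instrument: Optional[str] = None
    genre: Optional[str] = None
    agent: Optional[str] = None
    role: Optional[str] = None
    place: Optional[str] = None

    def matches(self, record: dict) -> bool:
        # A record matches when every non-wildcard facet equals the record's value.
        for facet, wanted in asdict(self).items():
            if wanted is not None and record.get(facet) != wanted:
                return False
        return True


# "Piano concertos by Austrian composers": the agent is left as a wildcard.
query = SemanticQuery(instrument="Piano", genre="Concerto",
                      role="Composer", place="Austria")

# Hypothetical, already-enriched records used only for illustration.
records = [
    {"title": "Klavierkonzert in C", "instrument": "Piano", "genre": "Concerto",
     "agent": "Example Composer A", "role": "Composer", "place": "Austria"},
    {"title": "Violin sonata", "instrument": "Violin", "genre": "Sonata",
     "agent": "Example Composer B", "role": "Composer", "place": "Austria"},
]

hits = [r["title"] for r in records if query.matches(r)]
```

The German-titled record is found although the query was phrased in English, because matching happens on the enriched facets rather than on the free text.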
7. Semantic search
User Expectations
• Complete search
• "All piano concertos by all/any Austrian composers"
• User input
• in the user's preferred (native) language
• Records in all/any languages
• Metadata language vs. content language
• Spoken language vs. "technical languages" (e.g. music notation)
• All content types
• Text
• Image
• Audio
• Video
8. Semantic enrichments
Huge effort
• Automatic processing
• Domain expert knowledge
• User Validation
Domain specific
• Thematic collections
• Multilingual vocabularies/thesauri
Europeana Sounds
• Music instruments
• Music genres
Reference: [Manguinhas et al. 2]
10. CultuurLink
Freely available
• as an open online service that any user can use
Users can design and experiment with different alignment strategies
• this helps with discovering new alignments between two vocabularies
Manual control
• users decide which alignments are correct and can assign each a specific meaning (e.g. skos:exactMatch, skos:related, skos:broadMatch)
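A label-based strategy is one simple way to propose alignments for manual review. The sketch below is not CultuurLink's implementation or API; the function names, identifiers and labels are hypothetical, and it only proposes `skos:exactMatch` candidates when normalised labels are identical, leaving acceptance to the reviewer.

```python
import unicodedata


def normalise(label: str) -> str:
    """Case-fold a label and strip diacritics so 'Flûte' and 'flute' compare equal."""
    decomposed = unicodedata.normalize("NFKD", label)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold().strip()


def propose_alignments(source: dict, target: dict) -> list:
    """Return (source_id, relation, target_id) candidates for a reviewer to accept or reject."""
    # Index every target label by its normalised form.
    index = {}
    for target_id, labels in target.items():
        for label in labels:
            index.setdefault(normalise(label), set()).add(target_id)

    candidates = []
    for source_id, labels in source.items():
        matched = set()
        for label in labels:
            matched |= index.get(normalise(label), set())
        for target_id in sorted(matched):
            candidates.append((source_id, "skos:exactMatch", target_id))
    return candidates


# Hypothetical identifiers and labels, for illustration only.
local_subjects = {
    "local:1": ["Klavier", "Piano"],
    "local:2": ["Flûte"],
    "local:3": ["Gesang"],
}
reference_vocab = {
    "ref:piano": ["piano"],
    "ref:flute": ["flute", "Flöte"],
}

candidates = propose_alignments(local_subjects, reference_vocab)
```

Concepts without any matching label (here `local:3`) produce no candidate and would need a different strategy or a manual mapping.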
11. Experimental Results
The British Library (BL) participated with 3 collections:
• a selection of Asian instruments (1,099 records) from the “Colin Huehns Asia Collection”
• a selection from the “Peter Cooke Uganda Collection” (1,312 records)
• the “Keith Summers English Folk Music Collection” (1,326 records)
The Centre de Recherche en Ethnomusicologie (CREM)
• participated with a test collection of 36 records published on the CD “Musical Instruments of the World”
The Maison Méditerranéenne des Sciences de l'Homme (MMSH)
• participated with a collection of 25 records about folk music
The Netherlands Institute of Sound and Vision (NISV)
Reference: [Manguinhas et al. 1]
13. Experimental Results
Austrian National Library Dataset
• 1,396 records of letters and music scores by classical music composers
• References to music instruments are available in the title and description only
• Music instrument names appear in different languages (German, Italian/Latin, French)
[Bar chart, y-axis 0 to 800. Category labels: Music instruments terms, Music instruments, Instrument Tags, Instrument Family Tags. Data labels: 141, 39, 668, 674 (in extraction order).]
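Because instrument references occur only in the title and description, tagging can be approached as text search over multilingual labels. The following is a minimal sketch under that assumption, not the pipeline used in the experiment; the label list, concept keys and sample record are hypothetical.

```python
import re

# Hypothetical multilingual label list; a real run would load labels from the vocabulary.
INSTRUMENT_LABELS = {
    "piano": ["Klavier", "pianoforte", "piano"],
    "violin": ["Violine", "violino", "violon"],
}


def build_pattern(labels):
    # Longest labels first so 'pianoforte' is tried before 'piano';
    # \b restricts matches to whole words.
    ordered = sorted(labels, key=len, reverse=True)
    alternatives = "|".join(re.escape(label) for label in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


PATTERNS = {concept: build_pattern(labels)
            for concept, labels in INSTRUMENT_LABELS.items()}


def tag_record(record: dict) -> list:
    """Return the sorted instrument concepts whose labels occur in title or description."""
    text = " ".join([record.get("title", ""), record.get("description", "")])
    return sorted(concept for concept, pattern in PATTERNS.items()
                  if pattern.search(text))


record = {"title": "Sonate für Violine und Klavier", "description": "Autograph score"}
tags = tag_record(record)
```

Whole-word matching keeps precision high but misses German compounds such as "Klavierkonzert"; handling those would need compound splitting or substring matching, at the cost of more false positives.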
14. Conclusions & Future Work
• Semantic enrichments for Europeana
• Targeting Thematic Collections
• Infrastructure to support generation and acquisition
• Europeana Entity Collection
• Preliminary experiments
• Small scale
• High precision enrichments
• Future work
• Validation through crowdsourcing
• Scaling to the full Europeana Sounds dataset (300,000+)
• Music genres tagging
15. AIT Austrian Institute of Technology
your ingenious partner
Thank you!
Sergiu Gordea
Sergiu.Gordea@ait.ac.at
16. References
[Manguinhas et al. 1] Hugo Manguinhas, Valentine Charles, Antoine Isaac, Tom Miles, Aude Lima, Ariane Néroulidis, Véronique Ginouvès, Dimitra Atsidis, Maarten Brinkerink, Michiel Hildebrand, Sergiu Gordea: Linking subject labels in Cultural Heritage Metadata to MIMO vocabulary using CultuurLink, NKOS 2016, Hannover
[Manguinhas et al. 2] Hugo Manguinhas, Sergiu Gordea, Antoine Isaac, Alessio Piccioli, Giulio Andreini, Francesca Di Donato, Remy Gardien, Maarten Brinkerink: Challenges on modeling annotations in the Europeana Sounds project, iAnnotate 2016, Berlin