SlideShare ist ein Scribd-Unternehmen logo
1 von 19
Creation of custom KOS-based
recommendation systems
Thomas Lüke, Wilko van Hoek,
Philipp Schaer, Philipp Mayr
NKOS-Workshop @ TPDL 2012
Paphos, Cyprus, 2012-09-27
Thomas.Lueke@gesis.org
1. Motivation: Finding the matching terms in IR
2. Use Cases for recommendation systems
3. Creating custom recommenders
• Workflow
• Interface
4. Demonstration
5. Conclusion
2
Overview
3
Motivation
• Databases are vastly growing
• empty result sets are rare
• too unspecific results are a problem
• Users need to refine their search
4
Motivation
5
see Hienert et al., 2011
Standard Search Term Recommender (STR)
• Maps any query term onto controlled Thesaurus-concepts
• Trained with many different databases and vocabularies
(SOLIS, CSA-SA, SPOLIT, FIS Bildung, …)
• Real Life usage: Portal Sowiport (cf. TPDL 2011: Hienert et al.)
Use-Case 1: Manual Query Expansion
6
Interactive Prototyp: http://www.gesis.org/beta/prototypen/irm
See NKOS presentation Mayr et al., 2010
7
Use-Case 2: Automatic Query Expansion
8
Use-Case Evaluation
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
83 84 88 93 96 105 110 153 166 173 average
SOLR
STR
See (Mutschke et. al: Science models as value-added services for scholarly information systems. Scientometrics. 89, 349–
364 (2011).
Result: On Average the usage of an STR can improve the search
process
9
• Recommender Service in IRM I was based on commercial software
• Goals in IRM II:
• Replacing old technology with new self-written version
• Making technology available to others by being open-source
• Provide Web-Interfaces to use recommenders services
• Allow the creation of custom recommenders on our servers
• Why Custom STRs?
• The more specific the dataset, the more specific the
recommendations
• Customized for your specific information need
 see our Poster/Paper
(Improving Retrieval Results with Discipline-specific Query Expansion, TPDL
2012, Lüke et. Al, http://arxiv.org/abs/1206.2126)
Creating custom recommenders
10
header:
identifier : oai:gesis.izsoz.de:19389
datestamp : 2011-01-10T13:46:00Z
setSpec : SSOAR
metadata:
dc:
identifier: http://nbn-resolving.de/urn:nbn:de:0168-ssoar-193894
title: How can international donors promote transboundary water management?
creator: Mostert, Erik
creator: Deutsches Institut für Entwicklungspolitik gGmbH
subject: Political science (320)
subject: Life sciences, biology (570)
subject: International Relations, International Politics, Development Policy (10505)
subject: Ecology, Environment (20900)
subject: Management; Afrika; Entwicklung; Entwicklungsland; Akteur; Wasser
source: Bonn
source: DIE Discussion Paper (1860-0441) 8/2005
description: "This paper discusses how international donors can promote the
development of transboundary water management. It assumes, first, that
cooperation will take place whenever the major stakeholders consider cooperation
to be a better option than non-cooperation. The perceptions and motivations of
the stakeholders are therefore crucial. Secondly, this paper assumes that the
major stakeholders are not 'states', but specific groups and individuals:
individual politicians, sectoral government bureaucracies, regional and local
governments, farmers, electricity companies, etc. Some of these may be involved
in the international negotiations themselves, others may be needed to get
international agreements ratified or implemented, and still others may be
affected by transboundary water management but lack the means to exert any
influence." (author's abstract)
language: English
rights: Deposit Licence - No Redistribution, No Modifications
contributor: SSOAR - Social Science Open Access Repository
date: 10.01.2011 13:46
OAI-PMH Dublin Core Data
Co-Occurence Analysis of
free and controlled
vocabulary (e.g. using
Jaccard, NWD, Dice etc.)
Free Terms in Description
Free Terms in Title
Controlled Vocabulary
11
Web Frontend
OAI
harvester
Pre-Processing
RESTfull API
Thesaurus
(optional)
Documents
Provider DL
Database
Repository
Workflow
12
13
RESTful API Webservice
14
Live Demo
15
Conclusion
As part of the IRM II project we have developed a system that
• is based on the free Apache 2.0 License
• may be used on our servers or can be set up on your own
system
• uses the widely accepted Dublin Core standard via a OAI-PMH
interface
• will now be beta-tested to estimate hardware requirements and
further evaluate performance of custom sets
Got your attention? 
Thomas.Lueke@gesis.org or Philipp.Schaer@gesis.org for beta-test
accounts
Further Information on our Project-Website:
http://www.gesis.org/en/research/external-funding-projects/projektuebersicht-drittmittel/irm2/
16
Thank you for
your attention!
Any Questions?
The projects IRM I and IRM II
• DFG (German Research Foundation) Funding (2009-2013)
• IRM = Information Retrieval Mehrwertdienste
• Implementation and Evaluation of value added services for the
retrieval in digital libraries
• Main idea: Usage of scientific models in IR
– Bibliometrical analysis of core journals
– Centrality scores in author networks
 Co-Occurence analysis of subjects
• Our goal is the creation of reusable services
http://www.gesis.org/en/research/external-funding-projects/projektuebersicht-
drittmittel/irm2/
17
18
Improvement in an individual query (GIRT 131). Original Query: bilingual
education.
Table 1: Top 4 Recommendations of the 3 STRs
Exp.
Type AP rPreciso
n p@5 p@10 p@20
No Exp. 0.039 0.127 0.4 0.3 0.2
gSTR 0.072 0.144 0.6 0.6 0.4
tSTR 0.076 0.161 0.8 0.6 0.45
bSTR 0.147 0.161 1 1 0.85
# General (gSTR) Topic-fitting (tSTR) Best-performing
(bSTR)
1 Multilingualism Child Multilingualism
2 Child School Speech
3 Speech Multilingualism Ethnic Group
4 Intercultural Education Germany Minority
Table 2: Statisics (bold font means further improvement)
19
Exp. Type MAP rPreciso
n p@5 p@10 p@20
gSTR 0.155 0.221 0.548 0.509 0.449
tSTR 0.159 0.224 0.578* 0.542** 0.460
bSTR 0.179** 0.233** 0.658** 0.601** 0.512**
• A simple heuristic is used to select the best fitting STR for
each topic (tSTR). We also list the general STR (gSTR)
as baseline and the best-performing STR as comparison.
• To measure retrieval performance we use 100 topics
from the GIRT corpus, measurements: MAP, rPrecision
and p@{5,10,20}, * α = .05, ** α = .01

Weitere ähnliche Inhalte

Was ist angesagt?

SDI – National to Global: perspectives from the UK academic sector
SDI – National to Global: perspectives from the UK academic sector SDI – National to Global: perspectives from the UK academic sector
SDI – National to Global: perspectives from the UK academic sector EDINA, University of Edinburgh
 
Actions to Ensure the Integrity and Continuity of the Scholarly Record
Actions to Ensure the Integrity and Continuity of the Scholarly Record Actions to Ensure the Integrity and Continuity of the Scholarly Record
Actions to Ensure the Integrity and Continuity of the Scholarly Record EDINA, University of Edinburgh
 
Research Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghResearch Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghEDINA, University of Edinburgh
 
Where data and journal content collide: what does it mean to ‘publish your da...
Where data and journal content collide: what does it mean to ‘publish your da...Where data and journal content collide: what does it mean to ‘publish your da...
Where data and journal content collide: what does it mean to ‘publish your da...EDINA, University of Edinburgh
 
Piloting an E-journals Preservation Registry Service: overview of PEPRS
Piloting an E-journals Preservation Registry Service: overview of PEPRSPiloting an E-journals Preservation Registry Service: overview of PEPRS
Piloting an E-journals Preservation Registry Service: overview of PEPRSSUNCAT
 
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience Shibboleth Access Management Federations and Secure SDI: ESDIN Experience
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience EDINA, University of Edinburgh
 
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018EOSC FAIR Data Session - EOSC Stakeholders Forum 2018
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018EOSCpilot .eu
 
EOSC-MAR-update.pptx
EOSC-MAR-update.pptxEOSC-MAR-update.pptx
EOSC-MAR-update.pptxSarah Jones
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsRobin Rice
 
Shibboleth Access Management Federations as an Organisational Model for SDI
Shibboleth Access Management Federations as an Organisational Model for SDIShibboleth Access Management Federations as an Organisational Model for SDI
Shibboleth Access Management Federations as an Organisational Model for SDIEDINA, University of Edinburgh
 
DIY RDM Training Kit for Librarians (PK)
DIY RDM Training Kit for Librarians (PK)DIY RDM Training Kit for Librarians (PK)
DIY RDM Training Kit for Librarians (PK)Robin Rice
 
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...LIBER Europe
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareRobin Rice
 
Tony Ross-Hellauer: eInfrastructure for Open Science
Tony Ross-Hellauer: eInfrastructure for Open ScienceTony Ross-Hellauer: eInfrastructure for Open Science
Tony Ross-Hellauer: eInfrastructure for Open ScienceOpenAIRE
 

Was ist angesagt? (20)

Research Data MANTRA Demo
Research Data MANTRA DemoResearch Data MANTRA Demo
Research Data MANTRA Demo
 
Research Data Management and Spatial Data
Research Data Management and Spatial DataResearch Data Management and Spatial Data
Research Data Management and Spatial Data
 
Research Data Management: Policy Development
Research Data Management: Policy DevelopmentResearch Data Management: Policy Development
Research Data Management: Policy Development
 
SDI – National to Global: perspectives from the UK academic sector
SDI – National to Global: perspectives from the UK academic sector SDI – National to Global: perspectives from the UK academic sector
SDI – National to Global: perspectives from the UK academic sector
 
Actions to Ensure the Integrity and Continuity of the Scholarly Record
Actions to Ensure the Integrity and Continuity of the Scholarly Record Actions to Ensure the Integrity and Continuity of the Scholarly Record
Actions to Ensure the Integrity and Continuity of the Scholarly Record
 
Research Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of EdinburghResearch Data Management (RDM) Initiatives at the University of Edinburgh
Research Data Management (RDM) Initiatives at the University of Edinburgh
 
Where data and journal content collide: what does it mean to ‘publish your da...
Where data and journal content collide: what does it mean to ‘publish your da...Where data and journal content collide: what does it mean to ‘publish your da...
Where data and journal content collide: what does it mean to ‘publish your da...
 
Piloting an E-journals Preservation Registry Service: overview of PEPRS
Piloting an E-journals Preservation Registry Service: overview of PEPRSPiloting an E-journals Preservation Registry Service: overview of PEPRS
Piloting an E-journals Preservation Registry Service: overview of PEPRS
 
National Activities and the UK LOCKSS Alliance
National Activities and the UK LOCKSS AllianceNational Activities and the UK LOCKSS Alliance
National Activities and the UK LOCKSS Alliance
 
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience Shibboleth Access Management Federations and Secure SDI: ESDIN Experience
Shibboleth Access Management Federations and Secure SDI: ESDIN Experience
 
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018EOSC FAIR Data Session - EOSC Stakeholders Forum 2018
EOSC FAIR Data Session - EOSC Stakeholders Forum 2018
 
EOSC-MAR-update.pptx
EOSC-MAR-update.pptxEOSC-MAR-update.pptx
EOSC-MAR-update.pptx
 
Overcoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjectsOvercoming obstacles to sharing data about human subjects
Overcoming obstacles to sharing data about human subjects
 
MANTRA for Change
MANTRA for ChangeMANTRA for Change
MANTRA for Change
 
Shibboleth Access Management Federations as an Organisational Model for SDI
Shibboleth Access Management Federations as an Organisational Model for SDIShibboleth Access Management Federations as an Organisational Model for SDI
Shibboleth Access Management Federations as an Organisational Model for SDI
 
DIY RDM Training Kit for Librarians (PK)
DIY RDM Training Kit for Librarians (PK)DIY RDM Training Kit for Librarians (PK)
DIY RDM Training Kit for Librarians (PK)
 
Ready, Set, GO FAIR
Ready, Set, GO FAIRReady, Set, GO FAIR
Ready, Set, GO FAIR
 
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
Knowledge Exchange Consensus: Monitoring of Open Access Publications and Cost...
 
Six Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShareSix Use Cases for Edinburgh DataShare
Six Use Cases for Edinburgh DataShare
 
Tony Ross-Hellauer: eInfrastructure for Open Science
Tony Ross-Hellauer: eInfrastructure for Open ScienceTony Ross-Hellauer: eInfrastructure for Open Science
Tony Ross-Hellauer: eInfrastructure for Open Science
 

Andere mochten auch

Are topic-specific search term, journal name and author name recommendations ...
Are topic-specific search term, journal name and author name recommendations ...Are topic-specific search term, journal name and author name recommendations ...
Are topic-specific search term, journal name and author name recommendations ...GESIS
 
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshop
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshopIntroduction of the Bibliometric-enhanced Information Retrieval (BIR) workshop
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshopGESIS
 
Non-textual ranking in Digital Libraries
Non-textual ranking in Digital LibrariesNon-textual ranking in Digital Libraries
Non-textual ranking in Digital LibrariesGESIS
 
Search term recommendation and non-textual ranking evaluated
 Search term recommendation and non-textual ranking evaluated Search term recommendation and non-textual ranking evaluated
Search term recommendation and non-textual ranking evaluatedGESIS
 
Combining Bibliometrics and Information Retrieval
Combining Bibliometrics and Information RetrievalCombining Bibliometrics and Information Retrieval
Combining Bibliometrics and Information RetrievalGESIS
 
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?Relevance distributions across Bradford Zones: Can Bradfordizing improve search?
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?GESIS
 
Past, present and future of scientific information
Past, present and future of scientific informationPast, present and future of scientific information
Past, present and future of scientific informationGESIS
 
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...GESIS
 
Pennants for Descriptors
Pennants for DescriptorsPennants for Descriptors
Pennants for DescriptorsGESIS
 
Understanding the Marketing Technology Landscape
Understanding the Marketing Technology LandscapeUnderstanding the Marketing Technology Landscape
Understanding the Marketing Technology LandscapeMichael Krigsman
 

Andere mochten auch (10)

Are topic-specific search term, journal name and author name recommendations ...
Are topic-specific search term, journal name and author name recommendations ...Are topic-specific search term, journal name and author name recommendations ...
Are topic-specific search term, journal name and author name recommendations ...
 
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshop
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshopIntroduction of the Bibliometric-enhanced Information Retrieval (BIR) workshop
Introduction of the Bibliometric-enhanced Information Retrieval (BIR) workshop
 
Non-textual ranking in Digital Libraries
Non-textual ranking in Digital LibrariesNon-textual ranking in Digital Libraries
Non-textual ranking in Digital Libraries
 
Search term recommendation and non-textual ranking evaluated
 Search term recommendation and non-textual ranking evaluated Search term recommendation and non-textual ranking evaluated
Search term recommendation and non-textual ranking evaluated
 
Combining Bibliometrics and Information Retrieval
Combining Bibliometrics and Information RetrievalCombining Bibliometrics and Information Retrieval
Combining Bibliometrics and Information Retrieval
 
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?Relevance distributions across Bradford Zones: Can Bradfordizing improve search?
Relevance distributions across Bradford Zones: Can Bradfordizing improve search?
 
Past, present and future of scientific information
Past, present and future of scientific informationPast, present and future of scientific information
Past, present and future of scientific information
 
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...
Introduction of the 3rd International Workshop on Bibliometric-enhanced Infor...
 
Pennants for Descriptors
Pennants for DescriptorsPennants for Descriptors
Pennants for Descriptors
 
Understanding the Marketing Technology Landscape
Understanding the Marketing Technology LandscapeUnderstanding the Marketing Technology Landscape
Understanding the Marketing Technology Landscape
 

Ähnlich wie Creation of custom KOS-based recommendation systems

Data management plans and planning - a gentle introduction
Data management plans and planning - a gentle introductionData management plans and planning - a gentle introduction
Data management plans and planning - a gentle introductionMartin Donnelly
 
Open Data: Strategies for Research Data Management (and Planning)
Open Data: Strategies for Research Data  Management (and Planning)Open Data: Strategies for Research Data  Management (and Planning)
Open Data: Strategies for Research Data Management (and Planning)Martin Donnelly
 
Report of the second FAIRDOM foundry
Report of the second FAIRDOM foundryReport of the second FAIRDOM foundry
Report of the second FAIRDOM foundryFAIRDOM
 
Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020OpenAIRE
 
General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...Nancy Pontika
 
Developing a Data Management Plan
Developing a Data Management PlanDeveloping a Data Management Plan
Developing a Data Management PlanMartin Donnelly
 
Using DBpedia for Thesaurus Management and Linked Open Data Integration
Using DBpedia for Thesaurus Management and Linked Open Data IntegrationUsing DBpedia for Thesaurus Management and Linked Open Data Integration
Using DBpedia for Thesaurus Management and Linked Open Data IntegrationMartin Kaltenböck
 
H2020 open-data-pilot
H2020 open-data-pilotH2020 open-data-pilot
H2020 open-data-pilotSarah Jones
 
Horizon 2020 open access and open data mandates
Horizon 2020 open access and open data mandatesHorizon 2020 open access and open data mandates
Horizon 2020 open access and open data mandatesMartin Donnelly
 
Open Source & Open Data Session report from imaGIne 2014 Conference
Open Source & Open Data Session report from imaGIne 2014 ConferenceOpen Source & Open Data Session report from imaGIne 2014 Conference
Open Source & Open Data Session report from imaGIne 2014 ConferenceGSDI Association
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
Opportunity and risk in social computing environments
Opportunity and risk in social computing environmentsOpportunity and risk in social computing environments
Opportunity and risk in social computing environmentsHazel Hall
 
CINECA webinar slides: FAIR software tools
CINECA webinar slides: FAIR software toolsCINECA webinar slides: FAIR software tools
CINECA webinar slides: FAIR software toolsCINECAProject
 
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...BigData_Europe
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonAfrican Open Science Platform
 
FAIR workshop Vienna
FAIR workshop ViennaFAIR workshop Vienna
FAIR workshop ViennaSarah Jones
 
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...OpenAIRE
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonAfrican Open Science Platform
 

Ähnlich wie Creation of custom KOS-based recommendation systems (20)

Data management plans and planning - a gentle introduction
Data management plans and planning - a gentle introductionData management plans and planning - a gentle introduction
Data management plans and planning - a gentle introduction
 
Open Data: Strategies for Research Data Management (and Planning)
Open Data: Strategies for Research Data  Management (and Planning)Open Data: Strategies for Research Data  Management (and Planning)
Open Data: Strategies for Research Data Management (and Planning)
 
Report of the second FAIRDOM foundry
Report of the second FAIRDOM foundryReport of the second FAIRDOM foundry
Report of the second FAIRDOM foundry
 
Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020
 
General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...
 
Developing a Data Management Plan
Developing a Data Management PlanDeveloping a Data Management Plan
Developing a Data Management Plan
 
Using DBpedia for Thesaurus Management and Linked Open Data Integration
Using DBpedia for Thesaurus Management and Linked Open Data IntegrationUsing DBpedia for Thesaurus Management and Linked Open Data Integration
Using DBpedia for Thesaurus Management and Linked Open Data Integration
 
Ima g ine2014_8c1report
Ima g ine2014_8c1reportIma g ine2014_8c1report
Ima g ine2014_8c1report
 
H2020 open-data-pilot
H2020 open-data-pilotH2020 open-data-pilot
H2020 open-data-pilot
 
Horizon 2020 open access and open data mandates
Horizon 2020 open access and open data mandatesHorizon 2020 open access and open data mandates
Horizon 2020 open access and open data mandates
 
Open Source & Open Data Session report from imaGIne 2014 Conference
Open Source & Open Data Session report from imaGIne 2014 ConferenceOpen Source & Open Data Session report from imaGIne 2014 Conference
Open Source & Open Data Session report from imaGIne 2014 Conference
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
Opportunity and risk in social computing environments
Opportunity and risk in social computing environmentsOpportunity and risk in social computing environments
Opportunity and risk in social computing environments
 
ELIXIR FAIR Activities - Examplars
ELIXIR FAIR Activities - ExamplarsELIXIR FAIR Activities - Examplars
ELIXIR FAIR Activities - Examplars
 
CINECA webinar slides: FAIR software tools
CINECA webinar slides: FAIR software toolsCINECA webinar slides: FAIR software tools
CINECA webinar slides: FAIR software tools
 
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
FAIR workshop Vienna
FAIR workshop ViennaFAIR workshop Vienna
FAIR workshop Vienna
 
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...
Overview of the data pilot and OpenAIRE tools, Elly Dijk and Marjan Grootveld...
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 

Mehr von GESIS

10th BIR Workshop @ECIR 2020: introduction
10th  BIR Workshop @ECIR 2020: introduction10th  BIR Workshop @ECIR 2020: introduction
10th BIR Workshop @ECIR 2020: introductionGESIS
 
From closed to open access: A case study of flipped journals
From closed to open access: A case study of flipped journalsFrom closed to open access: A case study of flipped journals
From closed to open access: A case study of flipped journalsGESIS
 
Highly cited references in PLOS ONE and their in-text usage over time
Highly cited references in PLOS ONE and their in-text usage over timeHighly cited references in PLOS ONE and their in-text usage over time
Highly cited references in PLOS ONE and their in-text usage over timeGESIS
 
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...GESIS
 
Bibliometric-enhanced Information Retrieval: Connecting IR with Bibliometrics
Bibliometric-enhanced Information Retrieval: Connecting IR with BibliometricsBibliometric-enhanced Information Retrieval: Connecting IR with Bibliometrics
Bibliometric-enhanced Information Retrieval: Connecting IR with BibliometricsGESIS
 
Analyzing the network structure and gender differences of the “NKOS community”
Analyzing the network structure and gender differences of the “NKOS community”Analyzing the network structure and gender differences of the “NKOS community”
Analyzing the network structure and gender differences of the “NKOS community”GESIS
 
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...GESIS
 
Searching beyond datasets in the Social Sciences
Searching beyond datasets in the Social SciencesSearching beyond datasets in the Social Sciences
Searching beyond datasets in the Social SciencesGESIS
 
Bedeutung von Text Mining am Beispiel der Sozialwissenschaften
Bedeutung von Text Mining am Beispiel der SozialwissenschaftenBedeutung von Text Mining am Beispiel der Sozialwissenschaften
Bedeutung von Text Mining am Beispiel der SozialwissenschaftenGESIS
 
Contextualised Browsing in a Digital Library’s Living Lab
Contextualised Browsing in a Digital Library’s Living LabContextualised Browsing in a Digital Library’s Living Lab
Contextualised Browsing in a Digital Library’s Living LabGESIS
 
41st European Conference on Information Retrieval (ECIR 2019)
41st European Conference on Information Retrieval (ECIR 2019)41st European Conference on Information Retrieval (ECIR 2019)
41st European Conference on Information Retrieval (ECIR 2019)GESIS
 
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...GESIS
 
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...GESIS
 
Challenges in Extracting and Managing References
Challenges in Extracting and Managing ReferencesChallenges in Extracting and Managing References
Challenges in Extracting and Managing ReferencesGESIS
 
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...GESIS
 
Measuring the usefulness of Knowledge Organization Systems in Information Ret...
Measuring the usefulness of Knowledge Organization Systems in Information Ret...Measuring the usefulness of Knowledge Organization Systems in Information Ret...
Measuring the usefulness of Knowledge Organization Systems in Information Ret...GESIS
 
Recent Advances in Bibliometric-Enhanced Information Retrieval
Recent Advances in Bibliometric-Enhanced Information RetrievalRecent Advances in Bibliometric-Enhanced Information Retrieval
Recent Advances in Bibliometric-Enhanced Information RetrievalGESIS
 
Analyzing the research output presented at European Networked Knowledge Organ...
Analyzing the research output presented at European Networked Knowledge Organ...Analyzing the research output presented at European Networked Knowledge Organ...
Analyzing the research output presented at European Networked Knowledge Organ...GESIS
 
Introduction to the 15th NKOS workshop @TPDL2016
Introduction to the 15th NKOS workshop @TPDL2016Introduction to the 15th NKOS workshop @TPDL2016
Introduction to the 15th NKOS workshop @TPDL2016GESIS
 
Recent applications of Knowledge Organization Systems
Recent applications of Knowledge Organization SystemsRecent applications of Knowledge Organization Systems
Recent applications of Knowledge Organization SystemsGESIS
 

Mehr von GESIS (20)

10th BIR Workshop @ECIR 2020: introduction
10th  BIR Workshop @ECIR 2020: introduction10th  BIR Workshop @ECIR 2020: introduction
10th BIR Workshop @ECIR 2020: introduction
 
From closed to open access: A case study of flipped journals
From closed to open access: A case study of flipped journalsFrom closed to open access: A case study of flipped journals
From closed to open access: A case study of flipped journals
 
Highly cited references in PLOS ONE and their in-text usage over time
Highly cited references in PLOS ONE and their in-text usage over timeHighly cited references in PLOS ONE and their in-text usage over time
Highly cited references in PLOS ONE and their in-text usage over time
 
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...
4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural...
 
Bibliometric-enhanced Information Retrieval: Connecting IR with Bibliometrics
Bibliometric-enhanced Information Retrieval: Connecting IR with BibliometricsBibliometric-enhanced Information Retrieval: Connecting IR with Bibliometrics
Bibliometric-enhanced Information Retrieval: Connecting IR with Bibliometrics
 
Analyzing the network structure and gender differences of the “NKOS community”
Analyzing the network structure and gender differences of the “NKOS community”Analyzing the network structure and gender differences of the “NKOS community”
Analyzing the network structure and gender differences of the “NKOS community”
 
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...
Recent advances in the project EXCITE – Extraction of Citations from PDF Docu...
 
Searching beyond datasets in the Social Sciences
Searching beyond datasets in the Social SciencesSearching beyond datasets in the Social Sciences
Searching beyond datasets in the Social Sciences
 
Bedeutung von Text Mining am Beispiel der Sozialwissenschaften
Bedeutung von Text Mining am Beispiel der SozialwissenschaftenBedeutung von Text Mining am Beispiel der Sozialwissenschaften
Bedeutung von Text Mining am Beispiel der Sozialwissenschaften
 
Contextualised Browsing in a Digital Library’s Living Lab
Contextualised Browsing in a Digital Library’s Living LabContextualised Browsing in a Digital Library’s Living Lab
Contextualised Browsing in a Digital Library’s Living Lab
 
41st European Conference on Information Retrieval (ECIR 2019)
41st European Conference on Information Retrieval (ECIR 2019)41st European Conference on Information Retrieval (ECIR 2019)
41st European Conference on Information Retrieval (ECIR 2019)
 
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...
Offenes kollaboratives Schreiben: Eine „Open Science“-Infrastruktur am Beispi...
 
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...
A Complete Year of User Retrieval Sessions in a Social Sciences Academic Sear...
 
Challenges in Extracting and Managing References
Challenges in Extracting and Managing ReferencesChallenges in Extracting and Managing References
Challenges in Extracting and Managing References
 
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...
Opening Scholarly Communication in Social Sciences by Connecting Collaborativ...
 
Measuring the usefulness of Knowledge Organization Systems in Information Ret...
Measuring the usefulness of Knowledge Organization Systems in Information Ret...Measuring the usefulness of Knowledge Organization Systems in Information Ret...
Measuring the usefulness of Knowledge Organization Systems in Information Ret...
 
Recent Advances in Bibliometric-Enhanced Information Retrieval
Recent Advances in Bibliometric-Enhanced Information RetrievalRecent Advances in Bibliometric-Enhanced Information Retrieval
Recent Advances in Bibliometric-Enhanced Information Retrieval
 
Analyzing the research output presented at European Networked Knowledge Organ...
Analyzing the research output presented at European Networked Knowledge Organ...Analyzing the research output presented at European Networked Knowledge Organ...
Analyzing the research output presented at European Networked Knowledge Organ...
 
Introduction to the 15th NKOS workshop @TPDL2016
Introduction to the 15th NKOS workshop @TPDL2016Introduction to the 15th NKOS workshop @TPDL2016
Introduction to the 15th NKOS workshop @TPDL2016
 
Recent applications of Knowledge Organization Systems
Recent applications of Knowledge Organization SystemsRecent applications of Knowledge Organization Systems
Recent applications of Knowledge Organization Systems
 

Kürzlich hochgeladen

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Kürzlich hochgeladen (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Creation of custom KOS-based recommendation systems

  • 1. Creation of custom KOS-based recommendation systems Thomas Lüke, Wilko van Hoek, Philipp Schaer, Philipp Mayr NKOS-Workshop @ TPDL 2012 Paphos, Cyprus, 2012-09-27 Thomas.Lueke@gesis.org
  • 2. 1. Motivation: Finding the matching terms in IR 2. Use Cases for recommendation systems 3. Creating custom recommenders • Workflow • Interface 4. Demonstration 5. Conclusion 2 Overview
  • 3. 3
  • 4. Motivation • Databases are vastly growing • empty result sets are rare • too unspecific results are a problem • Users need to refine their search 4
  • 6. Standard Search Term Recommender (STR) • Maps any query term onto controlled Thesaurus-concepts • Trained with many different databases and vocabularies (SOLIS, CSA-SA, SPOLIT, FIS Bildung, …) • Real Life usage: Portal Sowiport (cf. TPDL 2011: Hienert et al.) Use-Case 1: Manual Query Expansion 6
  • 7. Interactive Prototyp: http://www.gesis.org/beta/prototypen/irm See NKOS presentation Mayr et al., 2010 7 Use-Case 2: Automatic Query Expansion
  • 8. 8 Use-Case Evaluation 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 83 84 88 93 96 105 110 153 166 173 average SOLR STR See (Mutschke et. al: Science models as value-added services for scholarly information systems. Scientometrics. 89, 349– 364 (2011). Result: On Average the usage of an STR can improve the search process
  • 9. 9 • Recommender Service in IRM I was based on commercial software • Goals in IRM II: • Replacing old technology with new self-written version • Making technology available to others by being open-source • Provide Web-Interfaces to use recommenders services • Allow the creation of custom recommenders on our servers • Why Custom STRs? • The more specific the dataset, the more specific the recommendations • Customized for your specific information need  see our Poster/Paper (Improving Retrieval Results with Discipline-specific Query Expansion, TPDL 2012, Lüke et. Al, http://arxiv.org/abs/1206.2126) Creating custom recommenders
  • 10. 10 header: identifier : oai:gesis.izsoz.de:19389 datestamp : 2011-01-10T13:46:00Z setSpec : SSOAR metadata: dc: identifier: http://nbn-resolving.de/urn:nbn:de:0168-ssoar-193894 title: How can international donors promote transboundary water management? creator: Mostert, Erik creator: Deutsches Institut für Entwicklungspolitik gGmbH subject: Political science (320) subject: Life sciences, biology (570) subject: International Relations, International Politics, Development Policy (10505) subject: Ecology, Environment (20900) subject: Management; Afrika; Entwicklung; Entwicklungsland; Akteur; Wasser source: Bonn source: DIE Discussion Paper (1860-0441) 8/2005 description: "This paper discusses how international donors can promote the development of transboundary water management. It assumes, first, that cooperation will take place whenever the major stakeholders consider cooperation to be a better option than non-cooperation. The perceptions and motivations of the stakeholders are therefore crucial. Secondly, this paper assumes that the major stakeholders are not 'states', but specific groups and individuals: individual politicians, sectoral government bureaucracies, regional and local governments, farmers, electricity companies, etc. Some of these may be involved in the international negotiations themselves, others may be needed to get international agreements ratified or implemented, and still others may be affected by transboundary water management but lack the means to exert any influence." (author's abstract) language: English rights: Deposit Licence - No Redistribution, No Modifications contributor: SSOAR - Social Science Open Access Repository date: 10.01.2011 13:46 OAI-PMH Dublin Core Data Co-Occurence Analysis of free and controlled vocabulary (e.g. using Jaccard, NWD, Dice etc.) Free Terms in Description Free Terms in Title Controlled Vocabulary
  • 12. 12
  • 15. 15 Conclusion As part of the IRM II project we have developed a system that • is based on the free Apache 2.0 License • may be used on our servers or can be set up on your own system • uses the widely accepted Dublin Core standard via a OAI-PMH interface • will now be beta-tested to estimate hardware requirements and further evaluate performance of custom sets Got your attention?  Thomas.Lueke@gesis.org or Philipp.Schaer@gesis.org for beta-test accounts Further Information on our Project-Website: http://www.gesis.org/en/research/external-funding-projects/projektuebersicht-drittmittel/irm2/
  • 16. 16 Thank you for your attention! Any Questions?
  • 17. The projects IRM I and IRM II • DFG (German Research Foundation) Funding (2009-2013) • IRM = Information Retrieval Mehrwertdienste • Implementation and Evaluation of value added services for the retrieval in digital libraries • Main idea: Usage of scientific models in IR – Bibliometrical analysis of core journals – Centrality scores in author networks  Co-Occurence analysis of subjects • Our goal is the creation of reusable services http://www.gesis.org/en/research/external-funding-projects/projektuebersicht- drittmittel/irm2/ 17
  • 18. 18 Improvement in an individual query (GIRT 131). Original Query: bilingual education. Table 1: Top 4 Recommendations of the 3 STRs Exp. Type AP rPreciso n p@5 p@10 p@20 No Exp. 0.039 0.127 0.4 0.3 0.2 gSTR 0.072 0.144 0.6 0.6 0.4 tSTR 0.076 0.161 0.8 0.6 0.45 bSTR 0.147 0.161 1 1 0.85 # General (gSTR) Topic-fitting (tSTR) Best-performing (bSTR) 1 Multilingualism Child Multilingualism 2 Child School Speech 3 Speech Multilingualism Ethnic Group 4 Intercultural Education Germany Minority Table 2: Statisics (bold font means further improvement)
  • 19. 19 Exp. Type MAP rPreciso n p@5 p@10 p@20 gSTR 0.155 0.221 0.548 0.509 0.449 tSTR 0.159 0.224 0.578* 0.542** 0.460 bSTR 0.179** 0.233** 0.658** 0.601** 0.512** • A simple heuristic is used to select the best fitting STR for each topic (tSTR). We also list the general STR (gSTR) as baseline and the best-performing STR as comparison. • To measure retrieval performance we use 100 topics from the GIRT corpus, measurements: MAP, rPrecision and p@{5,10,20}, * α = .05, ** α = .01