1
Non-textual ranking in
digital libraries
Philipp Mayr
Hochschule Darmstadt
Jour fixe ISE, 18 November 2009
Hochschule Darmstadt
Fachbereich Media
Slides in cooperation with Peter Mutschke & Philipp Schaer
2
Agenda
• Introduction
• Ranking in DL
• IRM project
• Non-textual ranking in IRM
• Results
• Conclusion & Demo
3
Background I
Database perspective:
• Large and heterogeneous document sets for
subject specific questions
• Various relevant and accessible databases
for a topic
• Focus on bibliographic databases (metadata)
• journal articles
• monographs
4
Background II
User perspective:
• Relevant, high-quality documents (relevance ranking)
• Comprehensive search: documents from other fields
• Flexible search systems: alternative search strategies
and techniques (e.g. Berrypicking)
• Added value, e.g. direct access to full texts or metrics
like citation counts in Google Scholar
5
Digital Libraries
• Indexing & Abstracting databases
• Library catalogues
• Full texts
• Links to online resources
• Data
• Digital objects
• …
6
Ranking
Models:
• Exact match vs. best match (e.g. tf-idf)
• Sorting vs. ranking
Textual vs. non-textual ranking
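The difference between exact match and best match can be illustrated with a minimal Python sketch. The toy documents and the weighting (raw term frequency times log(N/df)) are assumptions for illustration; the slides do not specify a particular tf-idf variant.

```python
import math
from collections import Counter

# Toy collection (hypothetical titles, not taken from the slides).
DOCS = {
    "d1": "youth violence in schools",
    "d2": "media violence research",
    "d3": "youth unemployment in cities",
}

def exact_match(query, docs):
    """Boolean AND: ids of documents containing every query term, unranked."""
    terms = query.split()
    return sorted(d for d, text in docs.items()
                  if all(t in text.split() for t in terms))

def best_match(query, docs):
    """Rank documents by a simple tf-idf score: tf * log(N / df)."""
    n = len(docs)
    tokens = {d: Counter(text.split()) for d, text in docs.items()}
    scores = {}
    for d, tf in tokens.items():
        score = 0.0
        for t in query.split():
            df = sum(1 for c in tokens.values() if t in c)
            if df:
                score += tf[t] * math.log(n / df)
        if score > 0:
            scores[d] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))

print(exact_match("youth violence", DOCS))
print(best_match("youth violence", DOCS))
```

Exact match returns only the document that contains both terms, while best match also returns partially matching documents further down the list.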
7
Ranking: non-textual
Link analysis (PageRank, HITS)
Relevance feedback (user feedback)
Popularity (documents accessed)
8
Ranking in Digital Libraries
11
Non-textual ranking in DL
These approaches are problematic in DL:
• Link analysis (PageRank, HITS)
• Relevance feedback (user feedback)
• Popularity (documents accessed)
BUT we have:
• High quality metadata
• Controlled vocabularies
• Maintained (curated) collections
12
Project IRM
13
Value Added Services for IR Systems
Major problem areas of scholarly IR systems (Krause 2007):
1. search term vagueness
2. information overload by large result sets
IRM services → structural attributes of the science system:
• (1) Search Term Recommender: more appropriate terms from
controlled vocabulary (co-word analysis)
• (2a) Bradfordizing: re-ranking by core journals (bibliometrics)
• (2b) Author Centrality: re-ranking by centrality in co-authorship
networks (network analysis)
14
Search Term Recommender (Petras 2006)
Search Term Service: recommending strongly
associated terms from controlled vocabulary
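A co-word based recommender of this kind could be sketched as follows. The records, the descriptor names and the use of the Jaccard index as association measure are assumptions made for illustration; the slides do not state which measure the Search Term Recommender uses.

```python
from collections import defaultdict

# Hypothetical records: (free-text title terms, controlled-vocabulary descriptors).
RECORDS = [
    ({"youth", "violence"}, {"Adolescent", "Aggression"}),
    ({"youth", "unemployment"}, {"Adolescent", "Labour Market"}),
    ({"violence", "media"}, {"Aggression", "Mass Media"}),
    ({"youth", "crime"}, {"Adolescent", "Delinquency"}),
]

def recommend_terms(query_term, records, top_n=3):
    """Return the controlled terms most strongly associated with query_term.

    Association is the Jaccard index over the sets of records in which the
    free-text term and the controlled term occur (an assumed measure).
    """
    term_docs = set()
    descriptor_docs = defaultdict(set)
    for i, (title_terms, descriptors) in enumerate(records):
        if query_term in title_terms:
            term_docs.add(i)
        for d in descriptors:
            descriptor_docs[d].add(i)
    scored = []
    for d, docs in descriptor_docs.items():
        overlap = len(term_docs & docs)
        if overlap:
            scored.append((overlap / len(term_docs | docs), d))
    # Strongest association first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [d for _, d in scored[:top_n]]

print(recommend_terms("youth", RECORDS))
```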
15
Bradfordizing (White 1981, Mayr 2009)
Bradford Law of Scattering (Bradford 1948): idealized example for 450 articles
• Nucleus/Core: 150 papers in 3 journals
• Zone 2: 150 papers in 9 journals
• Zone 3: 150 papers in 27 journals
Ranking by Bradfordizing: sorting the core journal papers / core books on top
Bradfordized list of journals in informetrics
Applied to monographs: publisher as sorting criterion
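The re-ranking idea can be sketched in a few lines of Python. The function name and the sample identifiers are hypothetical, and counting journal frequency within the result set is one plausible reading of "sorting the core journal papers on top", not necessarily the project's implementation.

```python
from collections import Counter

def bradfordize(results):
    """Re-rank a result list so papers from the most productive journals come first.

    `results` is a list of (doc_id, journal_id) pairs in their original order.
    Journals are ordered by the number of hits they contribute to this result
    set; the sort is stable, so the original order is kept within a journal.
    """
    journal_counts = Counter(journal for _, journal in results)
    return sorted(results, key=lambda hit: (-journal_counts[hit[1]], hit[1]))

# Made-up result set: J1 contributes three hits, J2 two, J3 one.
hits = [("a", "J3"), ("b", "J1"), ("c", "J2"), ("d", "J1"), ("e", "J1"), ("f", "J2")]
print(bradfordize(hits))
```

For monographs, the same function would apply with a publisher identifier in place of the journal identifier.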
16
Author Centrality (Mutschke 2001, 2004)
Ranking by Author Centrality: sorting central author papers on top
17
Scenarios for combined ranking services
[Figure: two diagrams. Iterative use: Result Set, Core Journal Papers, Central Author Papers, Relevant Papers. Simultaneous use: Result Set, Central Author Papers, Core Journal Papers.]
18
Combination Matrix
Combination   Author Centrality   Bradfordizing   STR
0             -                   -               -
1             1                   -               -
2             -                   1               -
3             -                   -               1
4             1                   2               -
5             1                   1               -
6             2                   1               -
7             2                   -               1
8             -                   2               1
9             2                   3               1
10            3                   2               1
11            2                   2               1
(number = order; same number in a row = simultaneous use)
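One way to read the matrix is sketched below in Python. All names and scores are hypothetical. Treating iterative use as a chain of filter and re-rank steps, and simultaneous use as a sum of rank positions, are assumptions; the slides do not define how the services are fused.

```python
# Hypothetical per-paper scores; real values would come from the two services.
CENTRALITY = {"a": 0.9, "b": 0.2, "c": 0.6, "d": 0.4}
JOURNAL_HITS = {"a": 1, "b": 5, "c": 4, "d": 2}

def core_journal_papers(docs):
    """Bradfordizing step: rank by journal productivity and keep the top half."""
    ranked = sorted(docs, key=lambda d: -JOURNAL_HITS[d])
    return ranked[: max(1, len(ranked) // 2)]

def central_author_papers(docs):
    """Author centrality step: rank by the centrality score of each paper."""
    return sorted(docs, key=lambda d: -CENTRALITY[d])

def iterative(docs, steps):
    """Apply the services one after another; each step feeds the next."""
    for step in steps:
        docs = step(docs)
    return docs

def simultaneous(docs, scores):
    """Apply the services at once by summing each paper's rank positions."""
    total = {d: 0 for d in docs}
    for score in scores:
        for position, d in enumerate(sorted(docs, key=lambda d: -score[d])):
            total[d] += position
    return sorted(docs, key=lambda d: total[d])

papers = ["a", "b", "c", "d"]
# Combination 6: Bradfordizing first, then author centrality.
print(iterative(papers, [core_journal_papers, central_author_papers]))
# Combination 5: both services with the same order number.
print(simultaneous(papers, [CENTRALITY, JOURNAL_HITS]))
```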
19
Main Research Issue:
Contribution to retrieval quality and usability
• Precision:
– Do central authors (core journals) provide more relevant hits?
– Do highly associated cowords have any positive effects?
• Value-adding effects:
– Do central authors (core journals) provide OTHER relevant hits?
– Do coword-relationships provide OTHER relevant search terms?
• Mashup effects:
– Do combinations of the services enhance the effects?
20
Evaluation Design
• precision in existing evaluation data:
– CLEF 2003-2007: 125 topics; 65,297 SOLIS documents
– KoMoHe 2007: 39 topics; 31,155 SOLIS documents
• plausibility tests:
– author centrality / journal coreness ↔ precision
– Bradfordizing ↔ author centrality
• precision tests with users (Online-Assessment-Tool)
• usability tests with users (acceptance)
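Precision on existing relevance judgements could be computed along these lines. This is a minimal sketch; the cutoff k, the function name and the sample data are assumptions and are not taken from the CLEF or KoMoHe setup.

```python
def precision_at_k(ranking, relevant, k=10):
    """Share of the top-k ranked documents that are judged relevant.

    If the ranking holds fewer than k documents, the share is computed over
    the documents that are actually there.
    """
    top = ranking[:k]
    if not top:
        return 0.0
    return sum(1 for doc in top if doc in relevant) / len(top)

# Made-up judgements and two rankings of the same result set.
relevant_docs = {"d1", "d4", "d9"}
baseline = ["d2", "d1", "d3", "d4"]
reranked = ["d1", "d4", "d2", "d3"]
print(precision_at_k(baseline, relevant_docs, k=2))
print(precision_at_k(reranked, relevant_docs, k=2))
```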
21
Prototype Architecture
2,235,769 documents from
22
Motivation: non-textual approaches in DL
• Larger document sets for subject-specific
searches need to be concentrated again
(compensation, structuring)
• Exploring alternative ranking approaches
that can provide insights into document spaces
and enhance retrieval
• It is plausible that the nucleus of a literature or
central authors are useful to users
searching large document spaces
23
Bradfordizing
Basis: Bradford Law of Scattering
Approach: Usage of the document distributions
(scattering) in scientific journal and monograph
publications.
Core journals on research topics -> bibliometric
approach
• Identification of "core journals" and core
publishers
• ISSN and ISBN as identifiers
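Identifying the zones can be sketched as follows, using the journal identifier of each article. Assigning a journal to a zone by the cumulative number of articles that precede it is an assumed rule, and the journal ids are made up; the sample data is a scaled-down version of the idealized 3 : 9 : 27 distribution.

```python
from collections import Counter

def bradford_zones(journal_ids, zones=3):
    """Split journals into zones that each hold roughly the same number of articles.

    `journal_ids` has one entry per article (e.g. the ISSN of its journal).
    Journals are ranked by article count and assigned to a zone according to
    the share of articles already covered by more productive journals.
    """
    counts = Counter(journal_ids).most_common()
    total = sum(n for _, n in counts)
    result = [[] for _ in range(zones)]
    seen = 0
    for journal, n in counts:
        zone = min(seen * zones // total, zones - 1)
        result[zone].append(journal)
        seen += n
    return result

# 3 journals with 18 articles, 9 with 6 articles, 27 with 2 articles each.
articles = (
    [f"core{i}" for i in range(3) for _ in range(18)]
    + [f"mid{i}" for i in range(9) for _ in range(6)]
    + [f"tail{i}" for i in range(27) for _ in range(2)]
)
zone_list = bradford_zones(articles)
print([len(z) for z in zone_list])
```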
24
Author Centrality
Basis: Graph theory, network analysis
Approach: Usage of the interaction
(communication) pattern -> coauthorship relations
in a research community
• Identification of "experts"
• Identification of networked, "central" persons
• Different centrality measures
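As an illustration of one such measure, the sketch below computes closeness in a co-authorship graph and ranks papers by their most central author. The paper data is invented, and both the normalisation (reachable authors divided by the sum of their distances) and the use of the maximum over a paper's authors are assumptions.

```python
from collections import defaultdict, deque
from itertools import combinations

def coauthor_graph(papers):
    """Build an undirected co-authorship graph from {paper_id: [authors]}."""
    graph = defaultdict(set)
    for authors in papers.values():
        for a in authors:
            graph[a]  # make sure solo authors appear as isolated nodes
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def closeness(graph, author):
    """Reachable authors divided by the sum of their shortest-path distances."""
    dist = {author: 0}
    queue = deque([author])
    while queue:  # breadth-first search over co-authorship links
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def rank_by_author_centrality(papers):
    """Sort papers so that those with the most central author come first."""
    graph = coauthor_graph(papers)
    score = {p: max((closeness(graph, a) for a in authors), default=0.0)
             for p, authors in papers.items()}
    return sorted(papers, key=lambda p: (-score[p], p))

# Made-up papers: author B links A, C and D; F publishes alone.
sample = {"p1": ["F"], "p2": ["E", "D"], "p3": ["A", "B"],
          "p4": ["B", "C"], "p5": ["B", "D"]}
print(rank_by_author_centrality(sample))
```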
25
Results
26
Results: Bradfordizing
Bradford distributions appear in all subject domains and also
for queries in databases. It follows that Bradfordizing can be
used for re-sorting results, generally for topic specific queries
in bibliographic databases.
Example: articles and journals in the three zones (core, z2, z3) for 5 topics
topic   core   core-j   z2     z2-j   z3     z3-j
1       45     3        46     11     52     42
2       85     6        88     20     87     63
3       72     9        66     19     72     55
4       99     6        97     15     92     61
5       66     2        71     14     65     50
mean    73.4   5.2      73.6   15.8   74     54.2
27
CLEF topics 2006, SOLIS database (German literature)
[Figure: journals and articles in the zones core, z2 and z3 for CLEF topics 151 to 175, plotted on logarithmic axes from 1 to 1000.]
28
Results: Bradfordizing
→ Results from qualitative interviews with information
professionals
1. Spontaneous naming of core journals can be difficult
- no naming of core journals for 50% of the topics
- high plausibility of the bradfordized journals
2. The majority attest a positive relevance effect for core journals
→ The highest added value can be expected for novice researchers and
students in a scientific field
→ Zone 3 (periphery, long tail) is perhaps most valuable for
seniors in a field
29
Results: Bradfordizing
CLEF, articles: improvement in %
Year    core vs. Z3   core vs. Z2   Z2 vs. Z3   core vs. baseline
2003    86.56 (*)     34.57 (*)     38.63 (*)   32.65 (*)
2004    69.23 (*)     22.45         38.20       26.25 (*)
2005    78.03 (*)     29.05 (*)     37.95 (*)   29.52 (*)
2006    17.63         7.66          9.27        8.46
2007    28.18 (*)     8.31          18.35       11.77
Mean    55.93 (*)     20.41 (*)     28.48 (*)   21.73 (*)
KoMoHe, articles: improvement in %
Test    core vs. Z3   core vs. Z2   Z2 vs. Z3   core vs. baseline
Test1   18.82         11.75         6.32        9.84
Test2   11.58         6.16          5.11        6.12
Test3   19.32 (*)     8.67 (*)      9.80 (*)    9.00 (*)
Mean    16.57 (*)     8.86          7.08 (*)    8.32 (*)
(*) Significant based on the Wilcoxon signed-rank test and the paired t-test
Comparing the average precision between the zones shows:
• Core is more relevant than Zone 2 and
Zone 3
• Zone 2 is more relevant than Zone 3
• Mostly significant
improvements
(t-test, Wilcoxon)
• Smaller improvements
for KoMoHe
• Continuous shift
in relevance
→ relevance-related
distributions
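A comparison of this kind could be computed as in the sketch below. The per-topic precision values are invented, and the improvement formula (difference relative to the lower zone's precision) is an assumption, since the slides do not define it. Only the paired t statistic is shown; a Wilcoxon signed-rank test would need a statistics library.

```python
import math
from statistics import mean, stdev

def relative_improvement(p_new, p_old):
    """Improvement of p_new over p_old in percent (assumed definition)."""
    return (p_new - p_old) / p_old * 100

def paired_t(xs, ys):
    """Paired t statistic for two precision lists over the same topics."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical precision per topic in the core and in zone 3.
core = [0.50, 0.60, 0.70]
zone3 = [0.40, 0.40, 0.40]
print(relative_improvement(mean(core), mean(zone3)))
print(paired_t(core, zone3))
```

The resulting t value would then be compared with the critical value for n - 1 degrees of freedom, where n is the number of topics.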
30
Results: Author centrality
31
Heuristic evaluation of the ex-post ranking model
• User-evaluated applications (Jugendinstitut 1997, ASI-Tagung 2003)
Retrieval test: result set sorted by PY, IDF and ACL, and informational added value of ACL
Query                                      PY     IDF    ACL    Added value (ACL)
Youth – Violence                           0.25   0.60   0.55   92
Right-wing extremism – East Germany        0.35   0.45   0.60   122
Women – Personnel policy                   0.35   0.60   0.65   100
Resistance – Third Reich                   0.40   0.65   0.95   138
Forced labour – World War II               0.55   0.65   0.70   92
Elites – FRG                               0.40   0.70   0.85   107
Poverty – City                             0.30   0.35   0.55   157
Labour movement – 19th/20th century        0.55   0.55   0.90   164
Value change – Youth                       0.40   0.50   0.30   50
Terrorism – Democracy                      0.20   0.35   0.60   129
Mean                                       0.38   0.54   0.67   115
PY = publication year, IDF = inverse document frequency, ACL = author closeness
Results (hypotheses) from the retrieval test and the qualitative evaluations:
Rankings by author centrality:
• higher precision than traditional rankings [?]
• high informational added value (other documents) [?]
32
Evaluation of Author Centrality on CLEF Data
• moderate positive relationship between rate of networking and
precision
• precision of TF-IDF rankings (0.60) significantly higher than author
centrality based rankings (0.31) – BUT:
• very little overlap of documents on top of the ranking lists: 90% of
relevant hits provided by author centrality did not appear on top of
TF-IDF rankings
→ added precision of 28%
• author centrality seems to favor OTHER
relevant documents than traditional rankings do
• value-adding effect:
a different view of the information space
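The overlap between two ranking lists can be measured as in this sketch. The function name, the cutoff k and the sample rankings are hypothetical and only illustrate the kind of comparison reported above.

```python
def share_of_new_relevant_hits(ranking_a, ranking_b, relevant, k=10):
    """Share of relevant top-k hits in ranking_a that are absent from the top k of ranking_b."""
    hits_a = [d for d in ranking_a[:k] if d in relevant]
    if not hits_a:
        return 0.0
    top_b = set(ranking_b[:k])
    return sum(1 for d in hits_a if d not in top_b) / len(hits_a)

# Made-up rankings of the same result set and made-up relevance judgements.
by_centrality = ["d7", "d2", "d9", "d4"]
by_tfidf = ["d1", "d2", "d3", "d5"]
judged_relevant = {"d2", "d7", "d9"}
print(share_of_new_relevant_hits(by_centrality, by_tfidf, judged_relevant, k=4))
```

A high share means that the centrality-based ranking surfaces relevant documents that the text-based ranking does not show at the top.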
33
Central actors in an author's environment (expert search)
Ego = Ulrich Teichler
Keyword = Hochschule or Studium
Ranking of co-actors by
centrality
34
Central authors in a document collection (example: right-wing extremism)
Keyword = Rechtsextremismus...
or Rechtsextremismus or Antisemitismus or
Rassismus or Ausländerfeindlichkeit or
Ethnozentrismus or Faschismus or Neofaschismus
from 1996 onwards
2833
SOLIS/FORIS
records
1851 networked
actors
(65 Giant)
35
Conclusion
• Methods are non-textual
• Scientometric/bibliometric approach
• Network analysis
• The methods can successfully be applied
(holds true in different domains, databases and
document types)
• Added value can be demonstrated
(significant precision improvement)
• Users are intuitively and empirically satisfied
• High plausibility of the methods
36
Demo
37
Demo
38
Demo
Link to the prototype
http://multiweb.gesis.org/GrailsSTR/testSTR/index
39
Dr. Philipp Mayr
F14, Room 39b
(06151) 16-9394
mailto:philipp.mayr@h-da.de
or
mailto:philipp.mayr@gesis.org
http://www.gesis.org/index.php?id=2479

4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataBabyAnnMotar
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 

Kürzlich hochgeladen (20)

Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 

Non-textual ranking in Digital Libraries

  • 1. Non-textual ranking in digital libraries • Philipp Mayr, Hochschule Darmstadt, Fachbereich Media • Jour fixe ISE, 18.11.2009 • Slides in cooperation with Peter Mutschke & Philipp Schaer
  • 2. Agenda • Introduction • Ranking in DL • IRM project • Non-textual ranking in IRM • Results • Conclusion & Demo
  • 3. Background I: Database perspective • Large and heterogeneous document sets for subject-specific questions • Various relevant and accessible databases for a topic • Focus on bibliographic databases (metadata): journal articles, monographs
  • 4. Background II: User perspective • Relevant, high-quality documents (relevance ranking) • Comprehensive search: documents from other fields • Flexible search systems: alternative search strategies and techniques (e.g. Berrypicking) • Added value, e.g. direct access to full texts or metrics like citation counts in Google Scholar
  • 5. Digital Libraries • Indexing & Abstracting databases • Library catalogues • Full texts • Links to online resources • Data • Digital objects • …
  • 6. Ranking • Models: exact match vs. best match (e.g. tf-idf) • Sorting vs. ranking • Textual vs. non-textual ranking
  • 7. Ranking: non-textual • Link analysis (PageRank, HITS) • Relevance feedback (user feedback) • Popularity (documents accessed)
  • 11. Non-textual ranking in DL • Link analysis (PageRank, HITS), relevance feedback (user feedback) and popularity (documents accessed) are problematic in DL • BUT we have: high quality metadata, controlled vocabularies, maintained (curated) collections
  • 13. Value Added Services for IR Systems • Major problem areas of scholarly IR systems (Krause 2007): 1. search term vagueness 2. information overload by large result sets • IRM services → structural attributes of the science system: (1) Search Term Recommender: more appropriate terms from controlled vocabulary (co-word analysis) • (2a) Bradfordizing: re-ranking by core journals (bibliometrics) • (2b) Author Centrality: re-ranking by centrality in co-authorship networks (network analysis)
  • 14. Search Term Recommender (Petras 2006) • Search Term Service: recommending strongly associated terms from controlled vocabulary
  • 15. Bradfordizing (White 1981, Mayr 2009) • Bradford Law of Scattering (Bradford 1948), idealized example for 450 articles: Nucleus/Core: 150 papers in 3 journals; Zone 2: 150 papers in 9 journals; Zone 3: 150 papers in 27 journals • Ranking by Bradfordizing: sorting the core journal papers / core books on top • [Figure: bradfordized list of journals in informetrics] • Applied to monographs: publisher as sorting criterion
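As a rough worked check of the idealized example on this slide: Bradford's law states that zones holding the same number of articles contain journals in the ratio 1 : n : n². The symbols J_1, J_2, J_3 (journals per zone) and n (the Bradford multiplier) are notation introduced here, not taken from the slides.

```latex
\[
J_1 : J_2 : J_3 = 1 : n : n^2, \qquad
3 : 9 : 27 = 1 : 3 : 9 \;\Rightarrow\; n = 3
\]
\[
\text{articles per journal: }\quad
\frac{150}{3} = 50, \qquad
\frac{150}{9} \approx 16.7, \qquad
\frac{150}{27} \approx 5.6
\]
```

So a core journal in this idealized case contributes about nine times as many articles on the topic as a zone 3 journal.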
  • 16. Author Centrality (Mutschke 2001, 2004) • Ranking by Author Centrality: sorting papers by central authors to the top
  • 17. Scenarios for combined ranking services • iterative use • simultaneous use • [Diagram labels: Result Set, Core Journal Papers, Central Author Papers, Relevant Papers]
  • 18. Combination Matrix (number = order; same number in a row = simultaneous use)
      Combination   Author Centrality   Bradfordizing   STR
      0             -                   -               -
      1             1                   -               -
      2             -                   1               -
      3             -                   -               1
      4             1                   2               -
      5             1                   1               -
      6             2                   1               -
      7             2                   -               1
      8             -                   2               1
      9             2                   3               1
      10            3                   2               1
      11            2                   2               1
  • 19. Main Research Issue: Contribution to retrieval quality and usability • Precision: – Do central authors (core journals) provide more relevant hits? – Do highly associated co-words have any positive effects? • Value-adding effects: – Do central authors (core journals) provide OTHER relevant hits? – Do co-word relationships provide OTHER relevant search terms? • Mashup effects: – Do combinations of the services enhance the effects?
  • 20. Evaluation Design • precision in existing evaluation data: – CLEF 2003-2007: 125 topics; 65,297 SOLIS documents – KoMoHe 2007: 39 topics; 31,155 SOLIS documents • plausibility tests: – author centrality / journal coreness ↔ precision – Bradfordizing ↔ author centrality • precision tests with users (Online-Assessment-Tool) • usability tests with users (acceptance)
  • 22. Motivation: non-textual approaches in DL • Larger document sets for subject-specific searches need to be concentrated again (compensation, structuring) • Exploring alternative ranking approaches which can provide insights into document spaces and enhance retrieval • It is plausible that the nucleus of a literature or its central authors are useful to users searching large document spaces
  • 23. Bradfordizing • Basis: Bradford Law of Scattering • Approach: use of the document distributions (scattering) in scientific journal and monograph publications. Core journals on research topics -> bibliometric approach • Identification of "core journals" and core publishers • ISSN and ISBN as identifiers
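The re-ranking idea on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the IRM implementation: each record is a hypothetical (doc_id, issn) pair, the function name `bradfordize` is made up for this example, and zones are cut at thirds of the cumulative article count.

```python
from collections import Counter


def bradfordize(records):
    """Re-rank records so papers from the most productive journals come first.

    records: list of (doc_id, issn) tuples in the original result order.
    Returns (ranked_records, zone_by_issn), where zone 1 is the core.
    """
    # Number of result-set articles per journal (ISSN as identifier).
    counts = Counter(issn for _, issn in records)

    # Journals by descending article count; ties broken by ISSN so the
    # output is deterministic.
    journals = sorted(counts, key=lambda issn: (-counts[issn], issn))

    # Assumption: a journal's zone depends on the share of articles already
    # covered by more productive journals (thirds of the result set).
    total = len(records)
    zone_by_issn = {}
    covered = 0
    for issn in journals:
        zone_by_issn[issn] = 1 + (3 * covered) // total
        covered += counts[issn]

    # sorted() is stable, so papers of one journal keep their original order.
    ranked = sorted(records, key=lambda rec: (-counts[rec[1]], rec[1]))
    return ranked, zone_by_issn
```

For monographs the same sketch would apply with a publisher key derived from the ISBN in place of the ISSN, which is again an assumption about how the sorting criterion is obtained.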
  • 24. Author Centrality • Basis: graph theory, network analysis • Approach: use of the interaction (communication) pattern -> co-authorship relations in a research community • Identification of "experts" • Identification of networked, "central" persons • Different centrality measures
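A minimal sketch of ranking by author centrality, using closeness as one possible measure (a later slide names author closeness, ACL). Everything concrete here is an assumption for illustration: the input is a hypothetical mapping from document id to author list, closeness is scaled by the share of reachable authors so that disconnected networks are handled, and a document is scored by its most central author.

```python
from collections import deque


def coauthor_graph(docs):
    """Build an undirected co-authorship graph from {doc_id: [authors]}."""
    graph = {}
    for authors in docs.values():
        unique = set(authors)
        for author in unique:
            # Every author becomes a node, even on single-author papers.
            graph.setdefault(author, set()).update(unique - {author})
    return graph


def closeness(graph, source):
    """Closeness of one author, scaled by the share of reachable authors."""
    # Breadth-first search gives shortest path lengths in an unweighted graph.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)

    reachable = len(dist) - 1
    if reachable == 0:
        return 0.0  # isolated author
    total_distance = sum(dist.values())
    return (reachable / (len(graph) - 1)) * (reachable / total_distance)


def rank_by_author_centrality(docs):
    """Return doc ids sorted so papers by central authors come first."""
    graph = coauthor_graph(docs)
    centrality = {author: closeness(graph, author) for author in graph}
    # Assumption: a paper inherits the score of its most central author.
    score = {doc: max(centrality[a] for a in authors)
             for doc, authors in docs.items()}
    # Stable sort: equally scored papers keep their original order.
    return sorted(docs, key=lambda doc: -score[doc])
```

Other centrality measures (degree, betweenness) could be swapped in for `closeness` without changing the ranking step.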
  • 26. Results: Bradfordizing • Bradford distributions appear in all subject domains and also for queries in databases. It follows that Bradfordizing can be used for re-sorting results, generally for topic-specific queries in bibliographic databases. • Example: articles and journals (-j) in the three zones (core, z2, z3) for 5 topics:
      topic   core   core-j   z2     z2-j   z3   z3-j
      1       45     3        46     11     52   42
      2       85     6        88     20     87   63
      3       72     9        66     19     72   55
      4       99     6        97     15     92   61
      5       66     2        71     14     65   50
      mean    73.4   5.2      73.6   15.8   74   54.2
  • 27. CLEF topics 2006 • [Figure: journals and articles per topic for CLEF topics 151-175 in the SOLIS database (German literature), both axes logarithmic from 1 to 1000, with zones core, z2 and z3 marked]
  • 28. Results: Bradfordizing • Results from qualitative interviews with information professionals: 1. Spontaneous naming of core journals can be difficult - no naming of core journals for 50% of the topics - high plausibility of the bradfordized journals 2. A majority attests a positive relevance effect for core journals • The highest added value can be expected for novice researchers and students in a scientific field • Perhaps zone 3 (periphery, long tail) is most valuable for seniors in a field
  • 29. Results: Bradfordizing • Improvement in precision between zones, in % ((*) = significant based on the Wilcoxon signed-rank test and the paired t-test):
      CLEF (articles)     core vs. Z3   core vs. Z2   Z2 vs. Z3    core vs. baseline
      2003                86.56 (*)     34.57 (*)     38.63 (*)    32.65 (*)
      2004                69.23 (*)     22.45         38.20        26.25 (*)
      2005                78.03 (*)     29.05 (*)     37.95 (*)    29.52 (*)
      2006                17.63         7.66          9.27         8.46
      2007                28.18 (*)     8.31          18.35        11.77
      mean                55.93 (*)     20.41 (*)     28.48 (*)    21.73 (*)

      KoMoHe (articles)   core vs. Z3   core vs. Z2   Z2 vs. Z3    core vs. baseline
      Test1               18.82         11.75         6.32         9.84
      Test2               11.58         6.16          5.11         6.12
      Test3               19.32 (*)     8.67 (*)      9.80 (*)     9.00 (*)
      mean                16.57 (*)     8.86          7.08 (*)     8.32 (*)
    Comparing average precision across the zones shows: • Core is more relevant than Zone 2 and Zone 3 • Zone 2 is more relevant than Zone 3 • Mostly significant improvements (t-test, Wilcoxon) • Smaller improvements for KoMoHe • Continuous shift of relevance → relevance-related distributions
  • 31. Heuristic evaluation of the ex-post ranking model • User-evaluated applications (Jugendinstitut 1997, ASI-Tagung 2003) • Retrieval test: result set sorted by PY, IDF and ACL, plus informational added value of ACL (PY = publication year, IDF = inverse document frequency, ACL = author closeness):
      Query                                                                         PY     IDF    ACL    Added value (ACL)
      Jugend – Gewalt (youth – violence)                                            0.25   0.60   0.55   92
      Rechtsextremismus – Ostdeutschland (right-wing extremism – East Germany)      0.35   0.45   0.60   122
      Frau – Personalpolitik (women – personnel policy)                             0.35   0.60   0.65   100
      Widerstand – Drittes Reich (resistance – Third Reich)                         0.40   0.65   0.95   138
      Zwangsarbeit – II. Weltkrieg (forced labour – World War II)                   0.55   0.65   0.70   92
      Eliten – BRD (elites – FRG)                                                   0.40   0.70   0.85   107
      Armut – Stadt (poverty – city)                                                0.30   0.35   0.55   157
      Arbeiterbewegung – 19./20. Jahrh. (labour movement – 19th/20th century)       0.55   0.55   0.90   164
      Wertewandel – Jugend (value change – youth)                                   0.40   0.50   0.30   50
      Terrorismus - Demokratie (terrorism - democracy)                              0.20   0.35   0.60   129
      Average                                                                       0.38   0.54   0.67   115
    Qualitative evaluations • Results (hypotheses): rankings by author centrality show • higher precision than traditional rankings [?] • high informational added value (other documents) [?]
  • 32. Evaluation of Author Centrality on CLEF Data • moderate positive relationship between rate of networking and precision • precision of TF-IDF rankings (0.60) significantly higher than that of author centrality based rankings (0.31) – BUT: • very little overlap of documents on top of the ranking lists: 90% of relevant hits provided by author centrality did not appear on top of TF-IDF rankings → added precision of 28% • author centrality seems to favor OTHER relevant documents than traditional rankings • value-adding effect: another view of the information space
  • 33. Central actors in an author's environment (expert search) • Ego = Ulrich Teichler • Keyword = Hochschule OR Studium • Ranking of co-actors by centrality
  • 34. Central authors in a document collection (example: right-wing extremism) • Keyword = Rechtsextremismus... OR Rechtsextremismus OR Antisemitismus OR Rassismus OR Ausländerfeindlichkeit OR Ethnozentrismus OR Faschismus OR Neofaschismus, from 1996 onward • 2833 SOLIS/FORIS records • 1851 networked actors (65 Giant)
  • 35. Conclusion • Methods are non-textual: scientometric/bibliometric approach, network analysis • The methods can successfully be applied (holds true in different domains, databases and document types) • Added value can be demonstrated (significant precision improvement) • Users are intuitively and empirically satisfied • High plausibility of the methods
  • 39. Dr. Philipp Mayr • F14, Room 39b • (06151) 16-9394 • mailto:philipp.mayr@h-da.de or mailto:philipp.mayr@gesis.org • http://www.gesis.org/index.php?id=2479