Benchmarking Domain-Specific Expert Search
Using Workshop Program Committees
Georgeta Bordea1, Toine Bogers2 & Paul Buitelaar1
1 Digital Enterprise Research Institute, National University of Ireland
2 Royal School of Library & Information Science, University of Copenhagen

CSTA workshop @ CIKM 2013
October 28, 2013
Outline
• Introduction
• Domain-specific test collections for expert search
- Information retrieval
- Semantic web
- Computational linguistics

• Benchmarking our new collections
- Expert finding
- Expert profiling

• Discussion & conclusions
2
Introduction
• Knowledge workers spend around 25% of their time searching for
information

- 99% report using other people as information sources
- 14.4% of their time is spent on this (56% depending on your definition)
- Why do people search for other people? (Hertzum & Pejtersen, 2005)
‣ Search documents to find relevant people
‣ Search people to find relevant documents

• Expert search engines support this need for people search
- Searching for people instead of documents
3
Introduction

[Figure: example expertise topics “machine learning” and “speech recognition”]

4
Related work
• Historical solution (80s and 90s)
- Manually constructing a database of people’s expertise

• Automatic approaches to expert search since 2000s
- Automatically retrieve expertise evidence and associate this with experts
- Expert finding (“Who is the expert on topic X?”)
‣ Find the experts on a specific topic

- Expert profiling (“What is the expertise of person Y?”)
‣ Find out what one expert knows about different topics
5
Related work
• TREC Enterprise track (2005-2008)
- Focused on enterprise search → searching the data of an organization
- W3C collection (2005-2006)
- CSIRO collection (2007-2008)

• UvT Expert Collection (2007, updated in 2012)
- University-wide crawl of expertise evidence
‣ Publications, course descriptions, research descriptions, personal home pages

- Topics & relevance (self-)assessments from manual expertise database
6
Related work
                W3C        CSIRO      UvT
# people        1,092      3,490      496
# documents     331,037    370,715    36,699
# topics        99         50         981

• Problems with these data sets
- Relevance assessments
‣ W3C → Assessment by people outside organization inaccurate and incomplete
‣ CSIRO → Assessment by co-workers biased towards social network
‣ UvT → Self-assessment by experts is subjective and incomplete

- Focus on a single organization → relatively few experts per expertise area

7
Solution: Domain-specific test collections
• Documents
- Where? Collect publications from relevant journals and conferences in a
specific domain

- Why? More challenging because of lower level of granularity

• Topics
- Where? Collect topic descriptions from conference workshop websites
- Why? Rich descriptions with explicitly identified sub-topics (“areas of interest”)

• Relevance assessments
- Where? Program committees listed on workshop websites
- Why? Combines peer judgments with self-assessment
8
Collection 1: Information retrieval (IR)
• Research domain(s)
- Information retrieval, digital libraries, and recommender systems

• Topics
- Workshops held at conferences with substantial portion dedicated to these
domains between 2001 and 2012
‣ CIKM

‣ IIiX

‣ SIGIR

‣ RecSys

‣ ECIR

‣ ECDL

‣ WWW

‣ JCDL

‣ WSDM

‣ TPDL
9
Collection 1: Information retrieval (IR)
• Documents
- Based on DBLP Computer Science Bibliography
‣ Good coverage of research domains
‣ ArnetMiner version available with (automatically extracted) citation information

- Selected publications from all relevant IR venues
‣ Core venues → Hosting conferences for selected IR workshops (~9,000 docs)
‣ Curated venues → Additional venues with substantial IR coverage (~16,000 docs)
‣ Venue has to have at least 5 publications in ArnetMiner DBLP data set
‣ Resulted in ~25,000 publications

- Collected full-text versions using Google Scholar for 54.1% of publications
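
The venue selection rule above (core and curated venues, each with at least 5 publications in the data set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record layout (`id`, `venue`), the function names, and the toy venue "TinyWorkshop" are all hypothetical.

```python
from collections import Counter


def select_venues(publications, candidate_venues, min_pubs=5):
    """Keep candidate venues that have at least `min_pubs` publications."""
    counts = Counter(pub["venue"] for pub in publications)
    return {venue for venue in candidate_venues if counts[venue] >= min_pubs}


def select_publications(publications, venues):
    """Keep only publications that appeared in one of the selected venues."""
    return [pub for pub in publications if pub["venue"] in venues]


# Toy data (hypothetical): 6 SIGIR papers, 2 papers from a tiny venue,
# and 1 paper from a venue that was never a candidate.
pubs = (
    [{"id": f"s{i}", "venue": "SIGIR"} for i in range(6)]
    + [{"id": f"x{i}", "venue": "TinyWorkshop"} for i in range(2)]
    + [{"id": "d0", "venue": "VLDB"}]
)

venues = select_venues(pubs, {"SIGIR", "TinyWorkshop"})
selected = select_publications(pubs, venues)
print(sorted(venues), len(selected))
```

Counting per venue before filtering keeps the threshold independent of how the candidate list was assembled.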
10
Collection 2: Semantic Web (SW)
• Research domain(s)
- Semantic Web

• Topics
- Workshops held at conferences in the Semantic Web Dog Food data set
‣ ISWC

‣ WWW

‣ EKAW

‣ ASWC

‣ ESWC

‣ I-Semantics

• Documents
- Based on Semantic Web Dog Food corpus (SPARQL public endpoint)
- Full-text PDF versions available for all publications
11
Collection 3: Computational linguistics (CL)
• Research domain(s)
- Computational linguistics, natural language processing

• Topics
- Workshops held at conferences in the ACL Anthology Reference Corpus
‣ ACL

‣ SemEval

‣ CoLing

‣ NAACL

‣ ANLP

‣ HLT

‣ EACL

‣ EMNLP

‣ LREC

• Documents
- Based on ACL Anthology Reference Corpus
- Full-text PDF versions available for all publications
12
Topics & relevance assessments
• Topic representations
- Title
- Long description (complete workshop description)
- Short description (teaser description, typically first paragraph)
- Areas of interest

13
<topic id="014">
<title>Workshop on Information Retrieval in Context (IRiX)</title>
<year>2004</year>
<website>http://ir.dcs.gla.ac.uk/context/</website>
<short_description>This workshop will explore a variety of theoretical frameworks, characteristics and approaches to future interactive IR research.</short_description>
<long_description>There is a growing realisation that relevant information [...] for future interactive IR (IIR) research.</long_description>
<areas_of_interest>
<area>Contextual IR theory - modeling context</area>
[...]
</areas_of_interest>
<organizers>
<name>Peter Ingwersen</name>
[...]
</organizers>
<program_committee>
<name>Pia Borlund</name>
[...]
</program_committee>
</topic>
14
Topics & relevance assessments
• Topic representations
- Title
- Long description (complete workshop description)
- Short description (teaser description, typically first paragraph)
- Areas of interest
- Manually annotated topics with fine-grained expertise topics

• Relevance assessments
- PC members and organizers typically have expertise in one or more areas of
interest → combination of peer judgments and self-assessment

- Relevance value of ‘2’ for organizers and ‘1’ for PC members
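
The grading rule above can be sketched as a small script that turns one topic file into graded relevance judgments. The element names follow the topic example from the deck; the function name, the trimmed sample XML, and the choice to let the organizer grade win when a person appears in both lists are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the topic example; only the fields used here are kept.
TOPIC_XML = """<topic id="014">
  <title>Workshop on Information Retrieval in Context (IRiX)</title>
  <organizers><name>Peter Ingwersen</name></organizers>
  <program_committee><name>Pia Borlund</name></program_committee>
</topic>"""


def topic_to_qrels(xml_text):
    """Return (topic_id, person, grade) tuples: 2 = organizer, 1 = PC member."""
    topic = ET.fromstring(xml_text)
    topic_id = topic.get("id")
    grades = {}
    for name in topic.findall("./program_committee/name"):
        grades[name.text.strip()] = 1
    # Assumption: a person listed in both roles keeps the higher grade.
    for name in topic.findall("./organizers/name"):
        grades[name.text.strip()] = 2
    return [(topic_id, person, grade) for person, grade in sorted(grades.items())]


qrels = topic_to_qrels(TOPIC_XML)
for topic_id, person, grade in qrels:
    print(topic_id, person, grade)
```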
15
Collections by numbers
                          Information retrieval   Semantic Web   Computational linguistics
# (unique) authors        26,098                  9,983          4,480
# documents               24,690                  10,921         2,311
% full-text documents     54.1%                   100%           100%
# workshops (= topics)    60                      340            190
# expertise topics        488                     4,660          6,751
avg. # authors/document   2.7                     2.2            3.3
avg. # experts/topic      14.9                    25.8           24.9
16
Benchmarking the collections
• Benchmark results on our collections using state-of-the-art
approaches on two tasks

- Profile-centric model (M1, “Model 1”) — expert finding, expert profiling
‣ Aggregate all content for an expert into a document representation and produce ranking

- Document-centric model (M2, “Model 2”) — expert finding, expert profiling
‣ Retrieve relevant documents, then associate with experts and produce ranking

- Saffron (Bordea et al., 2012)
‣ Automatically extracts expertise terms from text, ranks them by term frequency, length,
and ‘embeddedness’, associates documents and experts with these terms
‣ Topic-centric extraction (TC) — expert finding, expert profiling
‣ Document-count ranking (DC) — expert finding
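
To make the difference between the profile-centric and document-centric models concrete, here is a minimal sketch. It is not the benchmarked implementation: it assumes a unigram query-likelihood score with Jelinek-Mercer smoothing, a uniform document-expert association (every author counts fully), and a hypothetical toy corpus with invented author names.

```python
import math
from collections import Counter, defaultdict


def lm_score(query, text_counts, coll_counts, lam=0.5):
    """Log query likelihood of a text, smoothed with collection statistics."""
    text_len = sum(text_counts.values())
    coll_len = sum(coll_counts.values())
    score = 0.0
    for term in query:
        p_text = text_counts[term] / text_len if text_len else 0.0
        p_coll = coll_counts[term] / coll_len if coll_len else 0.0
        # Small constant avoids log(0) for terms unseen in the collection.
        score += math.log(lam * p_text + (1 - lam) * p_coll + 1e-12)
    return score


def profile_centric(query, docs):
    """Merge each expert's documents into one profile, then rank profiles."""
    coll = Counter(t for d in docs for t in d["tokens"])
    profiles = defaultdict(Counter)
    for d in docs:
        for author in d["authors"]:
            profiles[author].update(d["tokens"])
    scores = {a: lm_score(query, p, coll) for a, p in profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)


def document_centric(query, docs):
    """Score documents first, then sum document scores per associated expert."""
    coll = Counter(t for d in docs for t in d["tokens"])
    scores = defaultdict(float)
    for d in docs:
        p_query_given_doc = math.exp(lm_score(query, Counter(d["tokens"]), coll))
        for author in d["authors"]:
            scores[author] += p_query_given_doc
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy corpus.
docs = [
    {"authors": ["alice"], "tokens": "expert search retrieval evaluation".split()},
    {"authors": ["alice", "bob"], "tokens": "expert finding language model".split()},
    {"authors": ["bob"], "tokens": "speech recognition acoustic model".split()},
]
query = ["expert", "search"]

profile_ranking = profile_centric(query, docs)
document_ranking = document_centric(query, docs)
print(profile_ranking, document_ranking)
```

The two functions share the same scoring function on purpose, so the only difference is where aggregation happens: before retrieval (profiles) or after it (documents).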
17
Expert finding
[Figure: bar chart of P@5 per collection (Information retrieval, Semantic Web, Computational linguistics) for the Profile-centric, Document-centric, Saffron - TC, and Saffron - DC approaches; y-axis from 0.00 to 0.18]

18
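
For readers unfamiliar with the expert finding metric, P@5 is the fraction of the top five ranked people who are relevant for the topic. The sketch below assumes binary relevance (any organizer or PC member counts as relevant); the function names and the toy ranking are hypothetical.

```python
def precision_at_k(ranking, relevant, k=5):
    """Fraction of the top-k ranked items that are in the relevant set."""
    top_k = ranking[:k]
    return sum(1 for item in top_k if item in relevant) / k


def mean_precision_at_k(runs, qrels, k=5):
    """Average P@k over all topics that have relevance judgments."""
    values = [precision_at_k(runs.get(topic, []), rel, k) for topic, rel in qrels.items()]
    return sum(values) / len(values) if values else 0.0


# Hypothetical ranking for one topic: two of the top five are relevant.
ranking = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "c", "f"}
p5 = precision_at_k(ranking, relevant)
print(p5)
```

Dividing by k rather than by the number of returned results penalizes rankings shorter than five, which is the usual convention.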
Expert profiling
[Figure: bar chart of MAP per collection (Information retrieval, Semantic Web, Computational linguistics) for the Profile-centric, Document-centric, and Saffron - TC approaches; y-axis from 0.00 to 0.18]

19
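
Expert profiling is scored with mean average precision (MAP): for each person, average the precision at every rank where a correct expertise topic appears, then average over people. A minimal sketch, assuming binary relevance and hypothetical person and topic identifiers:

```python
def average_precision(ranking, relevant):
    """Average of precision values at the ranks where relevant items occur."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs, qrels):
    """Mean of average precision over all keys that have judgments."""
    values = [average_precision(runs.get(key, []), rel) for key, rel in qrels.items()]
    return sum(values) / len(values) if values else 0.0


# Hypothetical profiling run: ranked expertise topics per person.
runs = {"p1": ["ir", "nlp", "db"], "p2": ["sw", "ir"]}
qrels = {"p1": {"ir", "db"}, "p2": {"ir"}}
map_score = mean_average_precision(runs, qrels)
print(round(map_score, 4))
```

Relevant items that never appear in a ranking still count in the denominator, so missing expertise topics lower the score.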
Discussion & conclusions
• Contributions
- Three new domain-specific test collections for expert search
‣ Available at http://itlab.dbit.dk/~toine/?page_id=631

- Workshop websites for topic creation & relevance assessment
- Benchmarked performance for expert finding and expert profiling

• Findings
- Term extraction approaches outperform language modeling on domain-centered collections (as opposed to organization-centric collections)

• Caveats
- Incomplete assessments & social selection bias for PC members?
20
Future work
• Expansion
- Add additional domains
‣ Need an active workshop scene & access to documents

- Add additional topics to existing collections
‣ IR collection has 100+ workshops that need manual cleaning
‣ Conference tutorials could also be added (but very incomplete relevance assessments!)

• Drilling down
- Incorporate social evidence in the form of citation networks
- Investigate the temporal aspect (topic drift?)
21
Questions? Comments? Suggestions?

22
