SlideShare ist ein Scribd-Unternehmen logo
1 von 20
Downloaden Sie, um offline zu lesen
On the Use of
     Relevance Feedback in
   IR-Based Concept Location


Gregory Gay*, Sonia Haiduc**, Andrian Marcus**, Tim
                     Menzies*

   * West Virginia University, Morgantown, WV, USA
      ** Wayne State University, Detroit, MI, USA
Software change
IR-based concept location
      Query




Ranked list of results   Source code
Challenge: the query




• Text in the query needs to match the text in the
  source code
• Difficult to formulate good queries
  - unfamiliar source code
  - unknown target
  -> hard to describe something that you do not know
Eclipse bug #13926

Bug description:
 JFace Text Editor Leaves a Black
 Rectangle on Content Assist text insertion.
 Inserting a selected completion proposal
 from the context information popup causes
 a black rectangle to appear on top of the
 display.
Queries
• Q1: jface text editor black rectangle insert text

• Q2: jface text editor rectangle insert context
  information

• Q3: jface text editor content assist

• Q4: jface insert selected completion proposal
  context information
Queries and results
• Q1: jface text editor black rectangle insert text
  – position of modified method: 7496
• Q2: jface text editor rectangle insert context
  information
  – position of modified method: 258
• Q3: jface text editor content assist
  – position of modified method: 119
• Q4: jface insert selected completion proposal
  context information
  – position of modified method: 723

  Whole change request: 54
IR CL in unfamiliar software

Developers:
• Rarely begin with a good query: hard to choose
  the right words
• Analyze very briefly list of results before
  reformulating query
• Even after reformulation, vague idea of what to
  look for -> queries not always better
• Can recognize whether the results retrieved are
  relevant or not to the problem
Questions

• Is there a way to make the query formulation
  easier on the developers?

• Is there a way to ensure that the subsequent
  queries lead us in the right direction?

• Can we do this by following the common
  practices of the developers?

• Can we improve IR-based CL using this
  approach?
Relevance feedback

•   Uses developer feedback about relevancy of
    returned results to automatically reformulate
    queries
•   Queries are reformulated by:
    –   Adding terms from relevant documents
    –   Removing terms from irrelevant documents
•   Iterative process
•   Common technique in text retrieval
•   Used also in SE
JFace Text Editor Leaves a Black Rectangle on Content Assist text
   insertion. Inserting a selected completion proposal from the
   context information popup causes a black rectangle to appear on
   top of the display.

1. createContextInfoPopup() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup
2. configure() in
   org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference
3. showContextProposals() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup


     + words in      documents          - words in      documents


                               New Query
IRRF tool
• IR Engine: Lucene
  – based on the Vector Space Model (VSM)
  – input: methods, query
  – output: a ranked list of methods ordered by their
    textual similarity to the query

• Relevance feedback: Rocchio algorithm
  – the classic algorithm for RF; used also in SE
  – models a way of incorporating relevance
    feedback information into the VSM
Evaluation

• Extracted bug descriptions and set of methods
  modified in the bug fixes from bug tracking
  systems
• Consider bug descriptions as initial queries for IR
• Measure #methods investigated until reaching a
  modified method before and after using RF
• Relevance feedback:
  – one developer provides feedback
  – feedback round ends after marking N methods as
    relevant or irrelevant (N = 1, 3 ,5)
Stop criteria

• Target method in top N results

• More than 50 methods analyzed

• Position of target methods in the ranked list
  of results increases for 2 consecutive rounds
  -> query moving away from wanted methods
Systems


 System     Vers.   LOC    Methods      Classes
 Eclipse     2.0 2,500,000 74,996        7,500
  jEdit     4.2      300,000   5,366     750
Adempiere   3.1.0    330,000   28,622   1,900
Results

 System     RF improves   RF does not
                 IR       improve IR
 Eclipse          6            1

  jEdit         3             3

Adempiere       4             1

   All          13            5
Results
• Eclipse:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
  19686        428        453 (5r)   48 (16r)    46 m(9r)


• jEdit:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1607211       354        103(5r)     36 (12r)    28 (6r)


• Adempiere:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1628050        52         3 (3r)      5 (2r)      7 (2r)
Questions – revisited (1)
• Is there a way to make the query formulation
  easier on the developers?
  – automatic query formulation


• Is there a way to ensure that the subsequent
  queries lead us in the right direction?
  – add terms from relevant documents, remove terms
    from irrelevant documents
  – stop when we move away from the target (results
    worsen for 2 consecutive rounds)
Questions – revisited (2)
• Can we do this by following the common
  practices of the developers?
  – developers still analyze only a few results in
    the result list before reformulation


• Can we improve IR-based CL using
  relevance feedback?
  – in some cases yes
Current and future work
• Studies involving more systems and more
  developers

• Automatically calibrating the parameters
  for a specific system and a specific set of
  change requests

• Study the circumstances when RF does
  not improve IR

Weitere ähnliche Inhalte

Was ist angesagt?

Softwaretestingstrategies
SoftwaretestingstrategiesSoftwaretestingstrategies
Softwaretestingstrategiessaieswar19
 
SYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionSYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionZain Abid
 
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...ijaia
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Praveen Penumathsa
 
Fundamentals of Software Engineering
Fundamentals of Software Engineering Fundamentals of Software Engineering
Fundamentals of Software Engineering Madhar Khan Pathan
 
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Twitter Inc.
 
Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)SungdoGu
 
Dynamic analysis in Software Testing
Dynamic analysis in Software TestingDynamic analysis in Software Testing
Dynamic analysis in Software TestingSagar Pednekar
 
Mca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsMca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsAman Adhikari
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...CS, NcState
 
Introduction To Algorithms
Introduction To AlgorithmsIntroduction To Algorithms
Introduction To AlgorithmsKM Bappi
 
Software testing- an introduction
Software testing- an introductionSoftware testing- an introduction
Software testing- an introductionSanthi Priyan
 
Unit 5 testing -software quality assurance
Unit 5  testing -software quality assuranceUnit 5  testing -software quality assurance
Unit 5 testing -software quality assurancegopal10scs185
 

Was ist angesagt? (19)

Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Softwaretestingstrategies
SoftwaretestingstrategiesSoftwaretestingstrategies
Softwaretestingstrategies
 
SYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler ConstructionSYNTAX Directed Translation Report || Compiler Construction
SYNTAX Directed Translation Report || Compiler Construction
 
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...
 
Cyclomatic complexity
Cyclomatic complexityCyclomatic complexity
Cyclomatic complexity
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
 
Formal Methods
Formal MethodsFormal Methods
Formal Methods
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
DISE - Programming Concepts
DISE - Programming ConceptsDISE - Programming Concepts
DISE - Programming Concepts
 
Fundamentals of Software Engineering
Fundamentals of Software Engineering Fundamentals of Software Engineering
Fundamentals of Software Engineering
 
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
Towards Privacy-Preserving Evaluation for Information Retrieval Models over I...
 
Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)Feature Selection Techniques for Software Fault Prediction (Summary)
Feature Selection Techniques for Software Fault Prediction (Summary)
 
Dynamic analysis in Software Testing
Dynamic analysis in Software TestingDynamic analysis in Software Testing
Dynamic analysis in Software Testing
 
Mca se chapter_9_formal_methods
Mca se chapter_9_formal_methodsMca se chapter_9_formal_methods
Mca se chapter_9_formal_methods
 
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
Promise 2011: "An Iterative Semi-supervised Approach to Software Fault Predic...
 
Introduction To Algorithms
Introduction To AlgorithmsIntroduction To Algorithms
Introduction To Algorithms
 
Software testing- an introduction
Software testing- an introductionSoftware testing- an introduction
Software testing- an introduction
 
Testing ppt
Testing pptTesting ppt
Testing ppt
 
Unit 5 testing -software quality assurance
Unit 5  testing -software quality assuranceUnit 5  testing -software quality assurance
Unit 5 testing -software quality assurance
 

Andere mochten auch

Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)silambu111
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challengeGan Keng Hoon
 
Information Retrieval Challenges
Information Retrieval ChallengesInformation Retrieval Challenges
Information Retrieval ChallengesBruno Pedro
 
Data Mining and Information Retrival
Data Mining and Information RetrivalData Mining and Information Retrival
Data Mining and Information Retrivalketan shete
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithmRupali Bhatnagar
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!Jane Garay
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval ssilambu111
 

Andere mochten auch (7)

Functions of information retrival system(1)
Functions of information retrival system(1)Functions of information retrival system(1)
Functions of information retrival system(1)
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challenge
 
Information Retrieval Challenges
Information Retrieval ChallengesInformation Retrieval Challenges
Information Retrieval Challenges
 
Data Mining and Information Retrival
Data Mining and Information RetrivalData Mining and Information Retrival
Data Mining and Information Retrival
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithm
 
Information retrieval system!
Information retrieval system!Information retrieval system!
Information retrieval system!
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval s
 

Ähnlich wie Concept Location using Information Retrieval and Relevance Feedback

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationSonia Haiduc
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsAravind Sesagiri Raamkumar
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Lucidworks
 
Building a Meta-search Engine
Building a Meta-search EngineBuilding a Meta-search Engine
Building a Meta-search EngineAyan Chandra
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1arthi v
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...David Zibriczky
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenPoo Kuan Hoong
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Harald Steck
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET-  	  Question-Answer Text Mining using Machine LearningIRJET-  	  Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET Journal
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET Journal
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Developmentnikhil sreeni
 
IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12BakerTilly US
 
Managing software project, software engineering
Managing software project, software engineeringManaging software project, software engineering
Managing software project, software engineeringRupesh Vaishnav
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia VoulibasiISSEL
 
Software Measurement and Metrics.pptx
Software Measurement and Metrics.pptxSoftware Measurement and Metrics.pptx
Software Measurement and Metrics.pptxubaidullah75790
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionseXascale Infolab
 

Ähnlich wie Concept Location using Information Retrieval and Relevance Feedback (20)

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept Location
 
Apsec 2014 Presentation
Apsec 2014 PresentationApsec 2014 Presentation
Apsec 2014 Presentation
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
 
Query processing
Query processingQuery processing
Query processing
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 
Building a Meta-search Engine
Building a Meta-search EngineBuilding a Meta-search Engine
Building a Meta-search Engine
 
Building largescalepredictionsystemv1
Building largescalepredictionsystemv1Building largescalepredictionsystemv1
Building largescalepredictionsystemv1
 
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
A Combination of Simple Models by Forward Predictor Selection for Job Recomme...
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
Gaussian Ranking by Matrix Factorization, ACM RecSys Conference 2015
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET-  	  Question-Answer Text Mining using Machine LearningIRJET-  	  Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine Learning
 
IRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine LearningIRJET- Question-Answer Text Mining using Machine Learning
IRJET- Question-Answer Text Mining using Machine Learning
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Development
 
IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12IT1204 - Software Engineering - L12
IT1204 - Software Engineering - L12
 
Managing software project, software engineering
Managing software project, software engineeringManaging software project, software engineering
Managing software project, software engineering
 
Development Guideline
Development GuidelineDevelopment Guideline
Development Guideline
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Software Measurement and Metrics.pptx
Software Measurement and Metrics.pptxSoftware Measurement and Metrics.pptx
Software Measurement and Metrics.pptx
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web Collections
 

Kürzlich hochgeladen

ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsManeerUddin
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationRosabel UA
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...JojoEDelaCruz
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptshraddhaparab530
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 

Kürzlich hochgeladen (20)

ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 

Concept Location using Information Retrieval and Relevance Feedback

  • 1. On the Use of Relevance Feedback in IR-Based Concept Location Gregory Gay*, Sonia Haiduc**, Andrian Marcus**, Tim Menzies* * West Virginia University, Morgantown, WV, USA ** Wayne State University, Detroit, MI, USA
  • 3. IR-based concept location Query Ranked list of results Source code
  • 4. Challenge: the query • Text in the query needs to match the text in the source code • Difficult to formulate good queries - unfamiliar source code - unknown target -> hard to describe something that you do not know
  • 5. Eclipse bug #13926 Bug description: JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display.
  • 6. Queries • Q1: jface text editor black rectangle insert text • Q2: jface text editor rectangle insert context information • Q3: jface text editor content assist • Q4: jface insert selected completion proposal context information
  • 7. Queries and results • Q1: jface text editor black rectangle insert text – position of modified method: 7496 • Q2: jface text editor rectangle insert context information – position of modified method: 258 • Q3: jface text editor content assist – position of modified method: 119 • Q4: jface insert selected completion proposal context information – position of modified method: 723 Whole change request: 54
  • 8. IR CL in unfamiliar software Developers: • Rarely begin with a good query: hard to choose the right words • Analyze very briefly list of results before reformulating query • Even after reformulation, vague idea of what to look for -> queries not always better • Can recognize whether the results retrieved are relevant or not to the problem
  • 9. Questions • Is there a way to make the query formulation easier on the developers? • Is there a way to ensure that the subsequent queries lead us in the right direction? • Can we do this by following the common practices of the developers? • Can we improve IR-based CL using this approach?
  • 10. Relevance feedback • Uses developer feedback about relevancy of returned results to automatically reformulate queries • Queries are reformulated by: – Adding terms from relevant documents – Removing terms from irrelevant documents • Iterative process • Common technique in text retrieval • Used also in SE
  • 11. JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display. 1. createContextInfoPopup() in org.eclipse.jface.text.contentassist.ContextInformationPopup 2. configure() in org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference 3. showContextProposals() in org.eclipse.jface.text.contentassist.ContextInformationPopup + words in documents - words in documents New Query
  • 12. IRRF tool • IR Engine: Lucene – based on the Vector Space Model (VSM) – input: methods, query – output: a ranked list of methods ordered by their textual similarity to the query • Relevance feedback: Rocchio algorithm – the classic algorithm for RF; used also in SE – models a way of incorporating relevance feedback information into the VSM
  • 13. Evaluation • Extracted bug descriptions and set of methods modified in the bug fixes from bug tracking systems • Consider bug descriptions as initial queries for IR • Measure #methods investigated until reaching a modified method before and after using RF • Relevance feedback: – one developer provides feedback – feedback round ends after marking N methods as relevant or irrelevant (N = 1, 3 ,5)
  • 14. Stop criteria • Target method in top N results • More than 50 methods analyzed • Position of target methods in the ranked list of results increases for 2 consecutive rounds -> query moving away from wanted methods
  • 15. Systems System Vers. LOC Methods Classes Eclipse 2.0 2,500,000 74,996 7,500 jEdit 4.2 300,000 5,366 750 Adempiere 3.1.0 330,000 28,622 1,900
  • 16. Results System RF improves RF does not IR improve IR Eclipse 6 1 jEdit 3 3 Adempiere 4 1 All 13 5
  • 17. Results • Eclipse: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 19686 428 453 (5r) 48 (16r) 46 m(9r) • jEdit: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 1607211 354 103(5r) 36 (12r) 28 (6r) • Adempiere: Report # Baseline IRRF N=1 IRRF N=3 IRRF N=5 1628050 52 3 (3r) 5 (2r) 7 (2r)
  • 18. Questions – revisited (1) • Is there a way to make the query formulation easier on the developers? – automatic query formulation • Is there a way to ensure that the subsequent queries lead us in the right direction? – add terms from relevant documents, remove terms from irrelevant documents – stop when we move away from the target (results worsen for 2 consecutive rounds)
  • 19. Questions – revisited (2) • Can we do this by following the common practices of the developers? – developers still analyze only a few results in the result list before reformulation • Can we improve IR-based CL using relevance feedback? – in some cases yes
  • 20. Current and future work • Studies involving more systems and more developers • Automatically calibrating the parameters for a specific system and a specific set of change requests • Study the circumstances when RF does not improve IR