On the Use of Relevance Feedback in IR-Based Concept Location

Gregory Gay*, Sonia Haiduc**, Andrian Marcus**, Tim Menzies*

* West Virginia University, Morgantown, WV, USA
** Wayne State University, Detroit, MI, USA
Software change
IR-based concept location
[Diagram: a query is matched against the source code, producing a ranked list of results]
Challenge: the query




• Text in the query needs to match the text in the
  source code
• Difficult to formulate good queries
  - unfamiliar source code
  - unknown target
  -> hard to describe something that you do not know
Eclipse bug #13926

Bug description:
 JFace Text Editor Leaves a Black
 Rectangle on Content Assist text insertion.
 Inserting a selected completion proposal
 from the context information popup causes
 a black rectangle to appear on top of the
 display.
Queries
• Q1: jface text editor black rectangle insert text

• Q2: jface text editor rectangle insert context
  information

• Q3: jface text editor content assist

• Q4: jface insert selected completion proposal
  context information
Queries and results
• Q1: jface text editor black rectangle insert text
  – position of modified method: 7496
• Q2: jface text editor rectangle insert context
  information
  – position of modified method: 258
• Q3: jface text editor content assist
  – position of modified method: 119
• Q4: jface insert selected completion proposal
  context information
  – position of modified method: 723

  Whole change request: 54
IR-based concept location (CL) in unfamiliar software

Developers:
• Rarely begin with a good query: hard to choose
  the right words
• Analyze the list of results only briefly before
  reformulating the query
• Even after reformulation, have only a vague idea of
  what to look for -> new queries are not always better
• Can recognize whether the results retrieved are
  relevant or not to the problem
Questions

• Is there a way to make the query formulation
  easier on the developers?

• Is there a way to ensure that the subsequent
  queries lead us in the right direction?

• Can we do this by following the common
  practices of the developers?

• Can we improve IR-based CL using this
  approach?
Relevance feedback

•   Uses developer feedback about the relevance of
    returned results to automatically reformulate
    queries
•   Queries are reformulated by:
    –   Adding terms from relevant documents
    –   Removing terms from irrelevant documents
•   Iterative process
•   Common technique in text retrieval
•   Also used in software engineering (SE)
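
The reformulation step above can be sketched with a Rocchio-style update, the classic relevance feedback formula for vector space models. This is a minimal illustration, not the tool's implementation: the weights `alpha`, `beta`, and `gamma`, the dictionary representation, and the example terms are all assumptions made for the sketch.

```python
from collections import Counter


def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return the reweighted query after one round of feedback.

    query and every document are dicts mapping term -> weight.
    alpha, beta, gamma are illustrative defaults, not values from the study.
    """
    new_query = Counter()
    for term, weight in query.items():
        new_query[term] += alpha * weight
    # Pull the query toward the average relevant document.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += beta * weight / len(relevant)
    # Push the query away from the average irrelevant document.
    for doc in irrelevant:
        for term, weight in doc.items():
            new_query[term] -= gamma * weight / len(irrelevant)
    # Terms whose weight is not positive are dropped from the query.
    return {term: w for term, w in new_query.items() if w > 0}


# Hypothetical term weights, loosely based on the bug description.
query = {"jface": 1.0, "rectangle": 1.0}
relevant = [{"jface": 1.0, "context": 1.0, "popup": 1.0}]
irrelevant = [{"debug": 1.0, "preference": 1.0}]
new_query = rocchio(query, relevant, irrelevant)
```

In this example the terms "context" and "popup" enter the query from the relevant document, while "debug" and "preference" never make it in because their weights end up negative.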
JFace Text Editor Leaves a Black Rectangle on Content Assist text
   insertion. Inserting a selected completion proposal from the
   context information popup causes a black rectangle to appear on
   top of the display.

1. createContextInfoPopup() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup
2. configure() in
   org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference
3. showContextProposals() in
   org.eclipse.jface.text.contentassist.ContextInformationPopup


   + words in relevant documents, - words in irrelevant documents -> New Query
IRRF tool
• IR Engine: Lucene
  – based on the Vector Space Model (VSM)
  – input: methods, query
  – output: a ranked list of methods ordered by their
    textual similarity to the query

• Relevance feedback (RF): Rocchio algorithm
  – the classic algorithm for RF; also used in SE
  – incorporates relevance feedback information
    into the VSM
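
To make the "ranked list of methods ordered by textual similarity" concrete, here is a small vector space model sketch using TF-IDF weights and cosine similarity. It is not Lucene's scoring formula; the weighting scheme, the function names, and the per-method term lists are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs maps method name -> list of terms; returns (vectors, idf)."""
    n = len(docs)
    df = Counter(term for terms in docs.values() for term in set(terms))
    idf = {term: math.log(1 + n / count) for term, count in df.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in Counter(terms).items()}
        for name, terms in docs.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse term -> weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_terms, docs):
    """Return method names ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_terms).items()}
    return sorted(docs, key=lambda name: cosine(q, vectors[name]), reverse=True)


# Hypothetical term lists; real input would be terms extracted from each method.
methods = {
    "showContextProposals": ["show", "context", "proposals", "popup"],
    "configure": ["configure", "content", "assist", "preference"],
    "paint": ["paint", "rectangle", "display"],
}
ranking = rank(["context", "popup"], methods)
```

Only the first method shares terms with the query here, so it is ranked first; the other two have zero similarity.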
Evaluation

• Extracted bug descriptions and set of methods
  modified in the bug fixes from bug tracking
  systems
• Consider bug descriptions as initial queries for IR
• Measure the number of methods investigated until
  reaching a modified method, before and after using RF
• Relevance feedback:
  – one developer provides feedback
  – feedback round ends after marking N methods as
    relevant or irrelevant (N = 1, 3, 5)
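
For the baseline (no feedback) case, the effort measure above amounts to the position of the first modified method in the ranked list. A minimal sketch, with a hypothetical function name and toy data; how effort is accumulated across feedback rounds is not specified here and is left out.

```python
def methods_investigated(ranked_methods, modified_methods):
    """Return the 1-based position of the first modified method, or None."""
    targets = set(modified_methods)
    for position, method in enumerate(ranked_methods, start=1):
        if method in targets:
            return position
    return None


# Toy ranked list: the developer inspects a, b, then reaches the target c.
effort = methods_investigated(["a", "b", "c", "d"], {"c", "d"})
```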
Stop criteria

• Target method in top N results

• More than 50 methods analyzed

• Position of target methods in the ranked list
  of results increases (i.e., worsens) for 2 consecutive rounds
  -> query moving away from wanted methods
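
The three stop criteria can be read as a single check run after every feedback round. The sketch below is one possible encoding; the function name, the argument layout, and the returned strings are assumptions, and "increases for 2 consecutive rounds" is interpreted as two successive increases in the target's position.

```python
def should_stop(target_positions, methods_analyzed, n, max_methods=50):
    """Return the reason to stop, or None to continue.

    target_positions: best position of a target method after each round,
    oldest first (position 1 is the top of the ranked list).
    """
    if target_positions and target_positions[-1] <= n:
        return "target in top N"
    if methods_analyzed > max_methods:
        return "more than 50 methods analyzed"
    if len(target_positions) >= 3:
        a, b, c = target_positions[-3:]
        if a < b < c:  # position worsened in two consecutive rounds
            return "query moving away from target"
    return None
```

For example, `should_stop([100, 150, 200], 9, n=3)` reports that the query is moving away from the target, while `should_stop([100, 80], 6, n=3)` returns `None` and the process continues.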
Systems


 System      Vers.   LOC         Methods   Classes
 Eclipse     2.0     2,500,000   74,996    7,500
 jEdit       4.2       300,000    5,366      750
 Adempiere   3.1.0     330,000   28,622    1,900
Results

 System      RF improves IR   RF does not improve IR
 Eclipse            6                    1
 jEdit              3                    3
 Adempiere          4                    1
 All               13                    5
Results
• Eclipse:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 19686       428         453 (5r)    48 (16r)    46 (9r)

• jEdit:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1607211     354         103 (5r)    36 (12r)    28 (6r)

• Adempiere:
 Report #    Baseline    IRRF N=1    IRRF N=3    IRRF N=5
 1628050     52          3 (3r)      5 (2r)      7 (2r)
Questions – revisited (1)
• Is there a way to make the query formulation
  easier on the developers?
  – automatic query formulation


• Is there a way to ensure that the subsequent
  queries lead us in the right direction?
  – add terms from relevant documents, remove terms
    from irrelevant documents
  – stop when we move away from the target (results
    worsen for 2 consecutive rounds)
Questions – revisited (2)
• Can we do this by following the common
  practices of the developers?
  – developers still analyze only a few results in
    the result list before reformulation


• Can we improve IR-based CL using
  relevance feedback?
  – in some cases, yes (13 of the 18 cases studied)
Current and future work
• Studies involving more systems and more
  developers

• Automatically calibrating the parameters
  for a specific system and a specific set of
  change requests

• Study the circumstances in which RF does
  not improve IR
