A Bug Report Analysis and Search Tool
M.Sc. Presentation
Yguaratã Cerqueira Cavalcanti
yguarata@gmail.com
Advisor: Silvio Romero de Lemos Meira
Co-Advisor: Eduardo Santana de Almeida
Center for Informatics – Federal University of Pernambuco (UFPE)
http://www.cin.ufpe.br
Reuse in Software Engineering (RiSE)
http://www.rise.com.br
07/03/2009, Recife – Brazil
Yguaratã Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife – Brazil
Summary
1 Introduction
M.Sc. Context, Motivation, Proposed solution
2 The Bug Report Duplication Problem: A Characterization Study
Definition, Planning and Operation, Results
3 BAST
Requirements, Architecture, Overview
4 Case Study
Definition, Planning, Analysis and interpretation
5 Experiment
Definition, Planning, Analysis and interpretation
6 Related Work
7 Conclusion
8 References
1 Introduction
M.Sc. Context
Change management handles requests for:
new features
correction of errors
improvements
It drives software maintenance and evolution
Motivation
Software maintenance and evolution are characterised by their huge
cost and slow speed of implementation
According to Sommerville, they account for almost 90% of software costs
Year    Share of total costs    Reference
2000    >90%                    Erlikh (2000)
1993    75%                     Eastwood (1993)
1990    >90%                    Moad (1990)
1990    60–70%                  Huff (1990)
1988    60–70%                  Port (1988)
1984    65–75%                  McKee (1984)
1981    >50%                    Lientz and Swanson (1981)
1979    67%                     Zelkowitz et al. (1979)
Table: Studies of software maintenance costs (Koskinen, 2004).
Bug tracking activity
Bug reports management
Verify bug report validity
Analyze the impact of a bug report
Assign a developer
Help with development process in general
Bug reports: software artifacts that describe a defect or enhancement.
Bug report submitters are generally developers, users, or testers.
Bug trackers: tools used to manage, store, and handle change
requests (also known as bug reports).
Advantages of bug trackers
Traceability (developers, releases)
Fast identification of problems
Metrics (errors per developer, identification of critical components, etc.)
Comments
Project history
Examples: Mantis, Bugzilla, Trac, Jira
A bug report example (figure, four slides)
Issues coming from bug trackers
Dynamic assignment of bug reports (Anvik et al., 2006);
Change impact analysis and effort estimation of new bug reports
(Song et al., 2006);
Quality of bug report descriptions (Ko et al., 2006);
Software evolution traceability (Sandusky et al., 2004); and
Duplicate bug report detection, which consists in avoiding the submission of
bug reports that describe an already submitted issue (Hiew, 2006).
The bug report duplication problem
Characterized by the submission of two or more bug reports that describe
the same software issue
Overhead of rework to search and analyze bug reports
People take about 5 to 15 minutes to perform search and analysis (Anvik
et al., 2005; Cavalcanti et al., 2008)
Duplicates make up 10% to 30% of a bug report repository (Anvik et al.,
2005; Runeson et al., 2007; Cavalcanti et al., 2008)
So each duplicate adds costs for:
opening the bug report (5-15 minutes)
CCB (Change Control Board) analysis (5-15 minutes)
developer analysis (5-15 minutes)
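The three cost items above can be added up to get a rough feel for the overhead. The sketch below is only a back-of-the-envelope calculation: it assumes that every duplicate goes through all three activities once, and the repository size of 1,000 reports is a hypothetical figure, not data from the study.

```python
# Per-activity time range in minutes, taken from the slide above.
MINUTES_PER_ACTIVITY = (5, 15)
ACTIVITIES = ("opening the bug report", "CCB analysis", "developer analysis")


def duplicate_overhead_minutes(n_duplicates):
    """Return (low, high) total minutes spent on n_duplicates duplicate reports.

    Assumption: each duplicate incurs every activity exactly once.
    """
    low, high = MINUTES_PER_ACTIVITY
    per_duplicate_low = low * len(ACTIVITIES)
    per_duplicate_high = high * len(ACTIVITIES)
    return n_duplicates * per_duplicate_low, n_duplicates * per_duplicate_high


# Hypothetical repository of 1,000 reports with 10% and 30% duplicates.
total_reports = 1000
for share in (0.10, 0.30):
    n = round(total_reports * share)
    low, high = duplicate_overhead_minutes(n)
    print(f"{share:.0%} duplicates: {low / 60:.0f} to {high / 60:.0f} hours")
```

Under these assumptions a single duplicate costs between 15 and 45 minutes of combined effort.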
Proposed solution
The proposed solution consists of a web-based application that enables
people involved in bug report search and analysis to perform these
tasks more effectively.
2 The Bug Report Duplication Problem: A Characterization Study
Definition
The goal of this study was to analyze bug repositories and the activities of
searching and analyzing bug reports
with the purpose of understanding them with respect to the possible factors
that could affect the duplication problem and their
consequences for software development
from the point of view of the researchers
in the context of software development projects
Questions
Q1: Do the projects have a considerable number of duplicate bug reports?
Q2: Is productivity being affected by the bug report duplication problem?
Q3: Is there a common vocabulary for bug report descriptions?
Q4: How are the relationships between master bug reports and duplicate bug
reports characterized?
Q5: Does the type of bug report influence the number of duplicates?
Planning and operation
Projects and data selection
All bug reports until June 2008
Project          LOC    Staff size  Bugs    Life-time
Bugzilla         55K    340         12829   14
Eclipse          6.5M   352         130095  7
Epiphany         100K   19          10683   6
Evolution        1M     156         72646   11
Firefox          80K    514         60233   9
GCC              4.2M   285         35797   9
Thunderbird      310K   192         19204   8
Tomcat           200K   57          8293    8
Private Project  2M     21          7955    2
Performed at C.E.S.A.R. from June 2008 to August 2008
Results
Question 1: Do the analyzed projects have a considerable number of
duplicate bug reports?
Metric         Bugz.  Eclip.  Epiph.  Evol.  Firef.  GCC    Thund.  Tomc.  Private Proj.  Mean   SD
M1 (%)         23.32  19.44   31.52   43.24  38.39   17.68  49.10   8.24   21.59          28.1   13.4
Question 2: Is the submitters' productivity being affected by the bug report
duplication problem?
Metric         Bugz.  Eclip.  Epiph.  Evol.  Firef.  GCC    Thund.  Tomc.  Private Proj.  Mean   SD
M2 (min)       5-15   –       5-15    5-15   5-10    5-15   5-15    –      20-30          12.5   1.88
M4 (bugs/day)  71     722     59      403    334     198    106     46     145            231.5  222.1
Question 3: Is there a common vocabulary for bug report descriptions?
Metric         Bugz.  Eclip.  Epiph.  Evol.  Firef.  GCC    Thund.  Tomc.  Private Proj.  Mean   SD
M5 (%)         –      25      –       –      22      –      –       –      35             31.2   9.5
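As a quick check on how the Mean and SD columns are obtained, the sketch below recomputes them for metric M1 from the per-project percentages in the table. It assumes the SD is the sample standard deviation (n - 1 denominator), which is the variant that matches the reported 13.4.

```python
from statistics import mean, stdev

# M1: percentage of duplicate bug reports per project (values from the table above).
m1_percent = {
    "Bugzilla": 23.32,
    "Eclipse": 19.44,
    "Epiphany": 31.52,
    "Evolution": 43.24,
    "Firefox": 38.39,
    "GCC": 17.68,
    "Thunderbird": 49.10,
    "Tomcat": 8.24,
    "Private Project": 21.59,
}

values = list(m1_percent.values())
m1_mean = mean(values)
m1_sd = stdev(values)  # sample standard deviation (n - 1 denominator)

print(f"M1 mean = {m1_mean:.1f}%, SD = {m1_sd:.1f}")
```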
Results [2]
Question 4: How are the relationships between master bug reports and
duplicate bug reports characterized?
One-to-one relation
bug123: bug3453
One-to-many relation
bug345: bug45345, bug465, bug654
Figure: Bug reports grouping.
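The grouping above maps naturally onto a dictionary from each master bug report to the reports marked as its duplicates. The following is a minimal sketch using the illustrative IDs from the slide; the function names are hypothetical and not part of the study's tooling.

```python
# Master bug report -> reports marked as its duplicates.
# The IDs are the illustrative ones from the slide above.
groups = {
    "bug123": ["bug3453"],
    "bug345": ["bug45345", "bug465", "bug654"],
}


def grouping_type(duplicates):
    """Classify a group by the number of duplicates attached to its master."""
    return "one-to-one" if len(duplicates) == 1 else "one-to-many"


def one_to_one_share(groups):
    """Fraction of groups whose master has exactly one duplicate."""
    if not groups:
        return 0.0
    one_to_one = sum(1 for dups in groups.values() if len(dups) == 1)
    return one_to_one / len(groups)


for master, dups in groups.items():
    print(master, grouping_type(dups), dups)
print(f"one-to-one share: {one_to_one_share(groups):.0%}")
```

Running `one_to_one_share` over a whole repository would give the kind of proportion that the study reports for grouping types.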
Results [3]
Question 5: Does the type of bug report influence the amount of duplicates?
Study summary
All the projects are affected by the bug report duplication problem;
Productivity is affected by the bug report duplication problem;
No common vocabulary is used to describe bug reports;
More than 80% of the groups are of the one-to-one grouping type;
Bug report duplication occurs independently of the type of bug report;
The number of LOC is not a factor for the duplication problem;
The size of the repository is not a factor for duplication;
Project life-time is not a factor for duplication;
The staff size (developers) is not a factor for the duplication problem;
and
The profile of the submitter is a determining factor for the submission of
duplicates: sporadic ≥ average ≥ frequent
3 BAST
Requirements
Functional requirements
FR1 - Keyword-based search
FR2 - Rank search results based on bug report similarity rate
FR3 - Index bug reports from XML files
FR4 - Index bug reports from original database
FR5 - Extract useful information from bug reports
Non-Functional requirements
NFR1 - Simple and intuitive filters interface
NFR2 - Reports about bug repository status
NFR3 - Integration with most popular bug report tracking systems
NFR4 - Log search queries and user actions
NFR5 - Reasonable similarity rate
NFR6 - Web-based interface with AJAX
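To make FR1 and FR2 concrete, here is a minimal sketch of keyword search with similarity-based ranking. The slides do not say how BAST computes its similarity rate, so TF-IDF weighting with cosine similarity is used here purely as an assumed stand-in; the function names and the three bug reports are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weigh(tokens, idf):
    """TF-IDF vector for a token list; terms unknown to the index are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: tf * idf[term] for term, tf in counts.items()}


def build_index(reports):
    """Return (tfidf_vectors, idf) for a dict of report id -> text."""
    tokens = {rid: tokenize(text) for rid, text in reports.items()}
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(reports)
    # Smoothed IDF so that terms present in every report keep a small weight.
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    vectors = {rid: weigh(toks, idf) for rid, toks in tokens.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, vectors, idf, top_k=5):
    """Rank indexed reports by cosine similarity to the query, highest first."""
    q = weigh(tokenize(query), idf)
    scored = [(cosine(q, vec), rid) for rid, vec in vectors.items()]
    scored = [(score, rid) for score, rid in scored if score > 0]
    return sorted(scored, reverse=True)[:top_k]


# Hypothetical bug reports, invented for this example.
reports = {
    "bug1": "Browser crashes when opening a PDF attachment",
    "bug2": "Crash on opening PDF file from e-mail",
    "bug3": "Bookmarks toolbar does not show icons",
}
vectors, idf = build_index(reports)
for score, rid in search("crash when opening pdf", vectors, idf):
    print(f"{rid}: {score:.2f}")
```

A submitter would run such a query before opening a new report and inspect the top-ranked results for an existing description of the same issue.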
Architecture
Overview
4 Case Study
Definition
Context. Performed in a real test cycle at a C.E.S.A.R. partner
between July and August 2008
Systematic process to test and open bug reports
Objectives. 1. To determine which tool can prevent more duplicate bug reports
2. To assess whether our tool decreases the time spent on
analysis of bug reports
Baseline tool. Internal tool where testers can search for bug reports using
SQL filters.
Null hypotheses
H0: $\mu_{\text{time, BAST}} > \mu_{\text{time, baseline}}$
$\mu_{\text{duplicates avoided, BAST}} < \mu_{\text{duplicates avoided, baseline}}$
Alternative hypotheses
H1: $\mu_{\text{time, BAST}} < \mu_{\text{time, baseline}}$
$\mu_{\text{duplicates avoided, BAST}} > \mu_{\text{duplicates avoided, baseline}}$
Planning
The tool was tested by the Bug Report Master
Responsible for the test cycle
Most experienced tester
Doubts should be cleared up with him
Case study design: Search and analysis performed in two steps:
Step 1. Internal tool ⇒ BAST
Step 2. BAST ⇒ Internal tool
Metrics (manual annotations):
Type of bug reports analyzed
Number of duplicate bug reports avoided
Time spent analyzing similar bug reports
Quantitative analysis: Descriptive statistics
144 bug reports were analyzed
Analysis and interpretation
Figure: Repository status.
Analysis and interpretation [2]
Figure: Duplicates found.
Analysis and interpretation [3]
Figure: Time spent.
Case study summary
Bug tracker status. More than 50% of the bug reports are duplicates
Duplicates found. Our tool can prevent more duplicates than the
baseline tool
Time spent. The bug report master saved time using our tool
Drawbacks
Case study design. The subject may become accustomed to one tool and
prefer it over the other.
Amount of bug reports in treatments. The numbers of bug reports
analyzed in each treatment were very different.
Lack of subjects. The number of subjects was not sufficient to
generalize the case study results.
5 Experiment
Definition
The goal of this experiment was to analyze a tool to improve search and
analysis of bug reports
with the purpose of evaluating it with respect to its effectiveness and efficiency
in detecting duplicate bug reports and saving time
from the point of view of the researchers
in the context of software development projects
Questions
Q1 Is there a reduction in the number of duplicate bug reports
with the adoption of the new tool?
Q2 Is there a reduction in the time that submitters spend performing
the search and analysis of bug reports with the adoption of the tool?
Q3 Did the submitters have difficulty using the tool?
Definition [2]
Objects of study: BAST and Bugzilla.
Quality focus: Effectiveness and efficiency of the tool developed.
Context: The adoption of a tool developed to aid the bug report tracking
process, focusing on search and analysis of bug reports to avoid
duplicates.
Experiment type: Off-line experiment (Wohlin et al., 2000)
Subjects: 18 Ph.D. and M.Sc. students from the Computer Science
department at the Federal University of Pernambuco, Brazil
Performed in a distributed fashion (no location restrictions)
Bug reports from the Firefox open-source project
Planning
Subjects selection. Selected by convenience sampling (Wohlin et al.,
2000; Kitchenham and Pfleeger, 2002)
Instrumentation: 32 error descriptions concerning the Firefox project
50% with defects that already have bug reports describing them in the
repository
50% with unique/not-reported defects
Guidelines for the experiment execution (FAQ)
Time sheets to record the time spent on search and analysis
Quantitative analysis: Descriptive statistics and hypothesis testing
[t-test (Wohlin et al., 2000)]
Qualitative analysis: Questionnaire
Planning [2]
Null hypothesis
H0: $\mu_{\text{time, BAST}} > \mu_{\text{time, baseline}}$
$\mu_{\text{duplicates avoided, BAST}} < \mu_{\text{duplicates avoided, baseline}}$
Alternative hypothesis
H1: $\mu_{\text{time, BAST}} < \mu_{\text{time, baseline}}$
$\mu_{\text{duplicates avoided, BAST}} > \mu_{\text{duplicates avoided, baseline}}$
Independent variables. The tool used (BAST or Bugzilla)
Dependent variables. (a) number of duplicate bug reports and (b) the
time spent on search and analysis
Planning [3]
Figure: Experiment design.
Analysis and interpretation
Descriptive statistics
          Time spent on analysis    Bug reports avoided
          BAST      Bugzilla        BAST      Bugzilla
Mean      4.54      4.32            7.56      8.33
Maximum   6.84      9.56            13        12
Minimum   1.78      2.47            0         0
SD        1.49      1.91            3.5       3.2
Table: Descriptive statistics.
Analysis and interpretation [2]
Descriptive statistics [2]
Figure: Box plot for time spent.
Analysis and interpretation [3]
Descriptive statistics [3]
Figure: Box plot for duplicates avoided.
Analysis and interpretation [4]
Hypothesis test
                                 Time spent on analysis   Duplicates avoided
t0                               0.6292                   -1.2466
Degrees of freedom               17                       17
p-value                          0.5376                   0.2294
T distribution (critical value)  2.11                     2.11
Result (t0 > T)                  H0: not rejected         H0: not rejected
Analysis of dependency
                     BAST time  Bugzilla time  BAST duplicates  Bugzilla duplicates
Years of experience  -0.13      -0.02          -0.19            0.18
Number of projects   -0.11      0.37           -0.28            -0.025
Bug trackers used    -0.16      0.35           -0.26            0.05
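The decision in the hypothesis test table can be retraced with a few lines of code. The sketch below rests on two assumptions that the slides do not state: that the test was a paired t-test (17 degrees of freedom with 18 subjects is consistent with that), and that the statistic is compared with the critical value in absolute terms. The per-subject times at the end are invented only to exercise the helper function.

```python
import math
from statistics import mean, stdev


def paired_t(sample_a, sample_b):
    """Return (t0, degrees_of_freedom) for two paired samples of equal length."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    t0 = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t0, n - 1


def reject_h0(t0, critical_value):
    """Reject H0 only when |t0| exceeds the critical value (assumed rule)."""
    return abs(t0) > critical_value


# Values reported in the table: t0 per metric and the critical value for 17 degrees of freedom.
T_CRITICAL = 2.11
for metric, t0 in (("time spent on analysis", 0.6292), ("duplicates avoided", -1.2466)):
    verdict = "rejected" if reject_h0(t0, T_CRITICAL) else "not rejected"
    print(f"{metric}: t0 = {t0}, H0 {verdict}")

# Hypothetical per-subject times in minutes, not data from the experiment.
bast_times = [4.0, 5.5, 3.0, 6.0]
bugzilla_times = [4.5, 5.0, 3.5, 6.5]
print(paired_t(bast_times, bugzilla_times))
```

Both reported statistics stay well inside the critical value, which is why neither null hypothesis is rejected.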
Qualitative analysis
BAST features. Seven (7) subjects used the filter features provided by the tool.
BAST usability. Only one subject mentioned some difficulty using the filters, and only
one subject had a problem with the ordering features.
BAST usefulness. Fifteen (15) subjects believe that the way bug report
details are presented in BAST is more useful for the analysis than in Bugzilla.
Testimonials
One subject wrote “in fact, the way details are presented saves time to check them, since it is not
necessary to open extra tabs or windows to see the details”, and another wrote “it
became easier to identify the duplicate bug reports and navigate among the
details of them”.
Validity Threats
Boredom
Lack of Historical Data
Environment
Subjects' knowledge of bug reports
Error re-descriptions and fictitious errors
Halo Effect
Internet Connection Constraints
6 Related Work
Automated Support for Classifying Software Failure Reports
(Podgurski et al., 2003)
Bug reports: Software failures automatically submitted
Technique: Supervised and unsupervised pattern classification and
multivariate visualization
Testing: Batch runs
Dataset: GCC, Jikes, and JavaC
Assisted Detection of Duplicate Bug Reports (Hiew, 2006)
Bug reports: Natural language bug reports
Technique: Organize similar bug reports into centroids using TF-IDF
Testing: Batch runs
Dataset: Firefox, Eclipse, Apache, and Fedora Core
Results: Precision of 29% and recall of 50%
Related work [2]
Detection of Duplicate Defect Reports Using Natural Language
Processing (Runeson et al., 2007)
Bug reports: Natural language bug reports
Technique: Natural Language Processing (NLP)
Testing: Batch runs and a tool
Dataset: Sony Ericsson Mobile Communications
Results: Recall of 40%
An Approach to Detecting Duplicate Bug Reports Using Natural
Language and Execution Information (Wang et al., 2008)
Bug reports: Natural language bug reports
Technique: NLP and execution information
Testing: Batch runs
Dataset: Firefox and Eclipse
Results: Recall of 67%-93% at its best
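The precision and recall figures quoted for the related work follow the usual information-retrieval definitions, sketched below. The cited studies may apply their own evaluation protocols (for example, recall measured over a top-N suggestion list), so this is a generic illustration with hypothetical bug IDs rather than a reproduction of any of their setups.

```python
def precision_recall(suggested, actual_duplicates):
    """Return (precision, recall) of a set of suggested duplicates.

    precision = correct suggestions / all suggestions
    recall    = correct suggestions / all true duplicates
    """
    suggested, actual = set(suggested), set(actual_duplicates)
    hits = len(suggested & actual)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: a detector suggests four reports, two of which are true duplicates.
suggested = ["bug10", "bug11", "bug12", "bug13"]
actual = ["bug11", "bug13", "bug20"]
precision, recall = precision_recall(suggested, actual)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```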
7 Conclusion
Research contribution
A taxonomy for the bug repositories mining area
The state-of-the-art on mining bug repositories
A characterization of the bug report duplication problem
A tool to reduce the time spent on search and analysis of bug
reports
A case study to evaluate the proposed tool
An experiment with 18 subjects to evaluate the tool
Papers
Cavalcanti, Y. C., Martins, A. C., de Almeida, E. S., and de Lemos Meira,
S. R. (2008a). Avoiding Duplicate CR reports in Open Source Software
Projects. In The 9th International Free Software Forum (IFSF’08), Porto
Alegre, Brazil.
Cavalcanti, Y. C., de Almeida, E. S., da Cunha, C. E. A., Pinto, E. R., and
Meira, S. R. L. (2008b). The Bug Report Duplication Problem: A
Characterization Study. Technical report, C.E.S.A.R and Federal
University of Pernambuco.
Papers for the case study and for the experiment
Two more journal papers are being written (characterization and thesis)
Future Work
Evolve from prototype
Information visualization
Alternative integration methods
Provide integration with other tools
Search and ranking techniques
Comments of a bug report
Number of informal references
Experiment replications
8 References
Anvik, J., Hiew, L., and Murphy, G. C. (2005). Coping with an open bug
repository. In Proceedings of the 2005 OOPSLA workshop on Eclipse
technology eXchange, pages 35–39, New York, NY, USA. ACM Press.
Anvik, J., Hiew, L., and Murphy, G. C. (2006). Who should fix this bug? In
Proceeding of the 28th International Conference on Software Engineering
(ICSE’06), pages 361–370, New York, NY, USA. ACM Press.
Cavalcanti, Y. C., Almeida, E. S., da Cunha, C. E. A., Pinto, E. R., and Meira,
S. R. L. (2008). The bug-report duplication problem: a characterization
study. Technical report, C.E.S.A.R and Federal University of Pernambuco.
Eastwood, A. (1993). Firm fires shots at legacy systems. Computing Canada,
19(2), 17.
Erlikh, L. (2000). Leveraging legacy system dollars for e-business. IT
Professional, 2(3), 17–23.
Hiew, L. (2006). Assisted Detection of Duplicate Bug Reports. Master’s thesis,
The University of British Columbia.
Yguaraṭ Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife РBrazil 53 / 57
References II
Huff, F. (1990). Information systems maintenance. The Business Quarterly,
(55), 30–32.
Kitchenham, B. and Pfleeger, S. L. (2002). Principles of survey research: part
5: populations and samples. SIGSOFT Software Engineering Notes, 27(5),
17–20.
Ko, A. J., Myers, B. A., and Chau, D. H. (2006). A linguistic analysis of how
people describe software problems. In Proceedings of the Visual
Languages and Human-Centric Computing (VLHCC’06), pages 127–134,
Washington, DC, USA. IEEE Computer Science.
Koskinen, J. (2004). Software maintenance costs.
http://www.cs.jyu.fi/~koskinen/smcosts.htm.
Lientz, B. P. and Swanson, E. B. (1981). Problems in application software
maintenance. Communications of the ACM, 24(11), 763–769.
McKee, J. R. (1984). Maintenance as a function of design. In AFIPS National
Conference Proceeding, volume 53, pages 187–1983.
Yguaraṭ Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife РBrazil 54 / 57
References III
Moad, J. (1990). Maintaining the competitive edge. Datamation, 4(36), 61–62.
Podgurski, A., Leon, D., Francis, P., Masri, W., Minch, M., Sun, J., and Wang,
B. (2003). Automated support for classifying software failure reports. In
Proceedings of the 25th International Conference on Software Engineering
(ICSE’03), pages 465–475, Washington, DC, USA. IEEE Computer Society.
Port, O. (1988). The software trap – automate or else. Business Week,
9(3051), 142–154.
Runeson, P., Alexandersson, M., and Nyholm, O. (2007). Detection of
duplicate defect reports using natural language processing. In Proceedings
of the 29th International Conference on Software Engineering (ICSE’07),
pages 499–510. IEEE Computer Science Press.
Sandusky, R. J., Gasser, L., and Ripoche, G. (2004). Bug report networks:
Varieties, strategies, and impacts in a f/oss development community. In
Proceedings of the 1st International Workshop on Mining Software
Repositories (MSR’04), pages 80–84, University of Waterloo, Waterloo.
Yguaraṭ Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife РBrazil 55 / 57
References IV
Sommerville, I. (2007). Software Engineering. Addison Wesley, 8 edition.
Song, Q., Shepperd, M. J., Cartwright, M., and Mair, C. (2006). Software
defect association mining and defect correction effort prediction. IEEE
Transactions on Software Engineering, 32(2), 69–82.
Wang, X., Zhang, L., Xie, T., Anvik, J., and Sun, J. (2008). An approach to
detecting duplicate bug reports using natural language and execution
information. In Proceedings of the 13th International Conference on
Software Engineering (ICSE’08), pages 461–470. ACM Press.
Wohlin, C., Runeson, P., Martin Höst, M. C. O., Regnell, B., and Wesslén, A.
(2000). Experimentation in Software Engineering: An Introduction. The
Kluwer Internation Series in Software Engineering. Kluwer Academic
Publishers, Norwell, Massachusets, USA.
Zelkowitz, M. V., Shaw, A. C., and Gannon, J. D. (1979). Principles of Software
Engineering and Design. Prentice Hall Professional Technical Reference.
Yguaraṭ Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife РBrazil 56 / 57
A Bug Report Analysis and Search Tool (presentation for M.Sc. degree)

  • 1. A Bug Report Analysis and Search Tool M.Sc. Presentation Yguaratã Cerqueira Cavalcanti yguarata@gmail.com Advisor: Silvio Romero de Lemos Meira Co-Advisor: Eduardo Santana de Almeida Center for Informatics – Federal University of Pernambuco (UFPE) http://www.cin.ufpe.br Reuse in Software Engineering (RiSE) http://www.rise.com.br 07/03/2009, Recife – Brazil Yguaratã Cavalcanti (UFPE/CIn, RiSE) A Bug Report Analysis and Search Tool 07/03/2009, Recife – Brazil 1 / 57
  • 2. Summary
      1 Introduction: M.Sc. Context, Motivation, Proposed solution
      2 The Bug Report Duplication Problem: A Characterization Study: Definition, Planning and Operation, Results
      3 BAST: Requirements, Architecture, Overview
      4 Case Study: Definition, Planning, Analysis and interpretation
      5 Experiment: Definition, Planning, Analysis and interpretation
      6 Related Work
      7 Conclusion
      8 References
  • 3. Outline: section 1, Introduction (M.Sc. Context, Motivation, Proposed solution)
  • 4. M.Sc. Context
      Change management handles requests for:
        • new features
        • correction of errors
        • improvements
      It drives software maintenance and evolution.
  • 6. Motivation
      • Software maintenance and evolution are characterised by their huge cost and slow speed of implementation.
      • According to Sommerville, they account for almost 90% of total costs.
      Year | Share of total costs | Reference
      2000 | >90% | Erlikh (2000)
      1993 | 75% | Eastwood (1993)
      1990 | >90% | Moad (1990)
      1990 | 60–70% | Huff (1990)
      1988 | 60–70% | Port (1988)
      1984 | 65–75% | McKee (1984)
      1981 | >50% | Lientz and Swanson (1981)
      1979 | 67% | Zelkowitz et al. (1979)
      Table: Studies of software maintenance costs (Koskinen, 2004).
  • 7. Bug tracking activity
      Bug reports management
        • Verify bug report validity
        • Analyze the impact of a bug report
        • Assign a developer
        • Help with the development process in general
      Bug reports
        • Software artifacts that describe a defect or enhancement
        • Generally, bug report submitters are developers, users, or testers
      Bug trackers
        • Used to manage, store, and handle change requests (also known as bug reports)
  • 9. Bug trackers advantages
      • Traceability (developers, releases)
      • Fast identification of problems
      • Metrics (errors per developer, identification of critical components, etc.)
      • Comments
      • Project history
      Examples: Mantis, Bugzilla, Trac, Jira
  • 10. A bug report example (figure)
  • 11. A bug report example [2] (figure)
  • 12. A bug report example [3] (figure)
  • 13. A bug report example [4] (figure)
  • 14. Issues coming from bug trackers
      • Dynamic assignment of bug reports (Anvik et al., 2006);
      • Change impact analysis and effort estimation of new bug reports (Song et al., 2006);
      • Quality of bug report descriptions (Ko et al., 2006);
      • Software evolution traceability (Sandusky et al., 2004); and
      • Duplicate bug report detection, which consists of avoiding the submission of bug reports that describe an already submitted issue (Hiew, 2006).
  • 15. The bug report duplication problem
      • Characterized by the submission of two or more bug reports that describe the same software issue
      • Overhead of rework to search and analyze bug reports
      • People take roughly 5-15 minutes to perform search and analysis (Anvik et al., 2005; Cavalcanti et al., 2008)
      • 10% to 30% of the reports in a bug repository are duplicates (Anvik et al., 2005; Runeson et al., 2007; Cavalcanti et al., 2008)
      • So a duplicate incurs costs at each stage:
          • opening the bug report (5-15 minutes)
          • CCB analysis (5-15 minutes)
          • developer analysis (5-15 minutes)
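The per-stage figures on this slide can be combined into a rough per-duplicate overhead. The following is a minimal sketch, assuming the three stages simply add up for every duplicate (the slides do not state this explicitly); the names `STAGES` and `duplicate_overhead` are illustrative.

```python
# Per-stage effort ranges in minutes, copied from the slide.
STAGES = {
    "opening the bug report": (5, 15),
    "CCB analysis": (5, 15),
    "developer analysis": (5, 15),
}


def duplicate_overhead(stages):
    """Return (low, high) minutes spent on one duplicate.

    Assumption: every stage is performed once per duplicate and the
    efforts are additive.
    """
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


low, high = duplicate_overhead(STAGES)
print(f"One duplicate costs roughly {low}-{high} minutes across all stages")
```

Under that additive assumption, each duplicate that reaches a developer wastes between a quarter and three quarters of an hour in total.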
  • 16. Proposed solution
      The proposed solution is a web-based application that enables people involved in bug report search and analysis to perform these tasks more effectively.
  • 17. Outline: section 2, The Bug Report Duplication Problem: A Characterization Study (Definition, Planning and Operation, Results)
  • 18. Definition
      The goal of this study was to analyze bug repositories and the activities for searching and analyzing bug reports, with the purpose of understanding them with respect to the factors that could affect the duplication problem and their consequences for software development, from the point of view of the researchers, in the context of software development projects.
      Questions
        • Q1: Do the projects have a considerable amount of duplicate bug reports?
        • Q2: Is productivity being affected by the bug report duplication problem?
        • Q3: Is there a common vocabulary for bug report descriptions?
        • Q4: How are the relationships between master bug reports and duplicate bug reports characterized?
        • Q5: Does the type of bug report influence the amount of duplicates?
  • 20. Planning and operation
      Projects and data selection
        • All bug reports up to June 2008
      Project | LOC | Staff size | Bugs | Life-time
      Bugzilla | 55K | 340 | 12829 | 14
      Eclipse | 6.5M | 352 | 130095 | 7
      Epiphany | 100K | 19 | 10683 | 6
      Evolution | 1M | 156 | 72646 | 11
      Firefox | 80K | 514 | 60233 | 9
      GCC | 4.2M | 285 | 35797 | 9
      Thunderbird | 310K | 192 | 19204 | 8
      Tomcat | 200K | 57 | 8293 | 8
      Private Project | 2M | 21 | 7955 | 2
      Performed at C.E.S.A.R. between June and August 2008
  • 21. Results
      Question 1: Do the analyzed projects have a considerable amount of duplicate bug reports?
      Metric | Bugz. | Eclip. | Epiph. | Evol. | Firef. | GCC | Thund. | Tomc. | Private Proj. | Mean | SD
      M1 (%) | 23.32 | 19.44 | 31.52 | 43.24 | 38.39 | 17.68 | 49.10 | 8.24 | 21.59 | 28.1 | 13.4
      Question 2: Is the submitters' productivity being affected by the bug report duplication problem?
      Metric | Bugz. | Eclip. | Epiph. | Evol. | Firef. | GCC | Thund. | Tomc. | Private Proj. | Mean | SD
      M2 (min) | 05-15 | – | 05-15 | 05-15 | 05-10 | 05-15 | 05-15 | – | 20-30 | 12.5 | 1.88
      M4 (bugs per day) | 71 | 722 | 59 | 403 | 334 | 198 | 106 | 46 | 145 | 231.5 | 222.1
      Question 3: Is there a common vocabulary for bug report descriptions?
      Metric | Bugz. | Eclip. | Epiph. | Evol. | Firef. | GCC | Thund. | Tomc. | Private Proj. | Mean | SD
      M5 (%) | – | 25 | – | – | 22 | – | – | – | 35 | 31.2 | 9.5
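The Mean and SD columns for M1 can be re-derived from the per-project percentages. This is a minimal sketch, assuming SD denotes the sample standard deviation (the slides do not say which estimator was used); the variable names are illustrative.

```python
import statistics

# M1: percentage of duplicate bug reports per project, copied from the slide.
m1_percent = {
    "Bugzilla": 23.32,
    "Eclipse": 19.44,
    "Epiphany": 31.52,
    "Evolution": 43.24,
    "Firefox": 38.39,
    "GCC": 17.68,
    "Thunderbird": 49.10,
    "Tomcat": 8.24,
    "Private Project": 21.59,
}

values = list(m1_percent.values())
mean = statistics.mean(values)
# Assumption: sample standard deviation (n - 1 in the denominator).
sd = statistics.stdev(values)

print(f"M1 mean = {mean:.1f}%, SD = {sd:.1f}")
```

With the sample estimator, the rounded results agree with the 28.1 and 13.4 reported for M1.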
  • 24. Results [2]
      Question 4: How are the relationships between master bug reports and duplicate bug reports characterized?
        • One-to-one relation: bug123: bug3453
        • One-to-many relation: bug345: bug45345, bug465, bug654
      Figure: Bug reports grouping.
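The grouping of duplicates under a master report can be sketched as follows. This is an illustration only: the input format (a list of duplicate/master pairs) and the function names are assumptions, and the IDs are the ones shown on the slide.

```python
from collections import defaultdict


def group_by_master(duplicate_links):
    """Group duplicate IDs under the master report they point to."""
    groups = defaultdict(list)
    for duplicate, master in duplicate_links:
        groups[master].append(duplicate)
    return dict(groups)


def relation_type(duplicates):
    """Classify a group by how many duplicates its master has."""
    return "one-to-one" if len(duplicates) == 1 else "one-to-many"


# (duplicate, master) pairs using the example IDs from the slide.
links = [
    ("bug3453", "bug123"),
    ("bug45345", "bug345"),
    ("bug465", "bug345"),
    ("bug654", "bug345"),
]

groups = group_by_master(links)
for master, duplicates in groups.items():
    print(master, relation_type(duplicates), duplicates)
```

Counting the groups of each type over a whole repository would give the share of one-to-one groups that the study summary refers to.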
  • 25. Results [3]
      Question 5: Does the type of bug report influence the amount of duplicates? (figure)
  • 26. Study summary
      • All the projects are affected by the bug report duplication problem;
      • Productivity is affected by the bug report duplication problem;
      • No common vocabulary is used to describe bug reports;
      • More than 80% of the groups are of the one-to-one type;
      • Bug report duplication occurs independently of the type of bug report;
      • The number of LOC is not a factor for the duplication problem;
      • The size of the repository is not a factor for duplication;
      • The project's life-time is not a factor for duplication;
      • The staff size (developers) is not a factor for the duplication problem; and
      • The profile of the submitter is a determining factor for the submission of duplicates: sporadic ≥ average ≥ frequent
  • 28. Outline: section 3, BAST (Requirements, Architecture, Overview)
  • 29. Requirements
      Functional requirements
        • FR1 - Keyword-based search
        • FR2 - Rank search results based on bug report similarity rate
        • FR3 - Index bug reports from XML files
        • FR4 - Index bug reports from original database
        • FR5 - Extract useful information from bug reports
      Non-functional requirements
        • NFR1 - Simple and intuitive filters interface
        • NFR2 - Reports about bug repository status
        • NFR3 - Integration with most popular bug report tracking systems
        • NFR4 - Log search queries and user actions
        • NFR5 - Reasonable similarity rate
        • NFR6 - Web-based interface with AJAX
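FR1 and FR2 together describe a keyword search whose results are ranked by similarity. The slides do not say how BAST computes its similarity rate, so the sketch below swaps in a common stand-in, TF-IDF weighting with cosine similarity, purely for illustration. All function names and the sample reports are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(reports):
    """Build TF-IDF vectors for a {report_id: text} mapping."""
    term_counts = {rid: Counter(tokenize(text)) for rid, text in reports.items()}
    # Document frequency: in how many reports each term appears.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    n = len(reports)
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    vectors = {
        rid: {term: tf * idf[term] for term, tf in counts.items()}
        for rid, counts in term_counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, vectors, idf):
    """Return (report_id, score) pairs, most similar first, dropping zero scores."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(cosine(q, vec), rid) for rid, vec in vectors.items()]
    return [(rid, score) for score, rid in sorted(scored, reverse=True) if score > 0]


# Hypothetical bug report summaries, not taken from any real repository.
reports = {
    "bug1": "Browser crashes when opening a PDF attachment",
    "bug2": "Crash on opening PDF files from email",
    "bug3": "Bookmarks toolbar disappears after restart",
}
vectors, idf = build_index(reports)
results = search("crash opening pdf", vectors, idf)
for rid, score in results:
    print(rid, round(score, 3))
```

Note that "crashes" and "crash" are treated as different terms here; a real tool would likely add stemming and stop-word removal before indexing.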
  • 31. Architecture (figure)
  • 32. Overview (figure)
  • 33. Outline: section 4, Case Study (Definition, Planning, Analysis and interpretation)
  • 34. Definition
      Context.
        • Performed in a real test cycle at a C.E.S.A.R. partner between July and August 2008
        • Systematic process to test and open bug reports
      Objectives.
        1 Determine which tool can prevent more duplicate bug reports
        2 Assess whether our tool decreases the time spent on the analysis of bug reports
      Baseline tool. Internal tool where testers can search for bug reports using SQL filters.
      Null hypotheses H0:
        • µ(time with BAST) > µ(time with baseline)
        • µ(duplicates avoided with BAST) < µ(duplicates avoided with baseline)
      Alternative hypotheses H1:
        • µ(time with BAST) < µ(time with baseline)
        • µ(duplicates avoided with BAST) > µ(duplicates avoided with baseline)
  • 36. Planning
      The tool was tested by the Bug Report Master
        • Responsible for the test cycle
        • Most experienced tester
        • Doubts should be resolved with him
      Case study design: search and analysis performed in two steps:
        • Step 1. Internal tool ⇒ BAST
        • Step 2. BAST ⇒ Internal tool
      Metrics (manual annotations):
        • Type of bug reports analyzed
        • Number of duplicate bug reports avoided
        • Time spent to analyze similar bug reports
      Quantitative analysis: descriptive statistics
      144 bug reports were analyzed
  • 37. Analysis and interpretation: Repository status (figure)
  • 38. Analysis and interpretation [2]: Duplicates found (figure)
  • 39. Analysis and interpretation [3]: Time spent (figure)
  • 40. Case study summary
      • Bug tracker status. More than 50% of duplicates
      • Duplicates found. Our tool can prevent more duplicates than the baseline tool
      • Time spent. The bug report master saved time using our tool
      Drawbacks
        • Case study design. Accommodation of the subject, who may come to prefer one tool over the other.
        • Amount of bug reports in treatments. The numbers of bug reports analyzed in each treatment were very different.
        • Lack of subjects. The number of subjects was not sufficient to generalize the case study results.
  • 42. Outline: section 5, Experiment (Definition, Planning, Analysis and interpretation)
  • 43. Definition
      The goal of this experiment was to analyze a tool to improve search and analysis of bug reports, with the purpose of evaluating it with respect to its effectiveness and efficiency in detecting duplicate bug reports and saving time, from the point of view of the researchers, in the context of software development projects.
      Questions
        • Q1: Is there a reduction in the number of duplicate bug reports with the adoption of the new tool?
        • Q2: Is there a reduction in the time that submitters spend on the search and analysis of bug reports with the adoption of the tool?
        • Q3: Did the submitters have difficulty using the tool?
  • 45. Definition [2]
      Objects of study: BAST and Bugzilla.
      Quality focus: Effectiveness and efficiency of the tool developed.
      Context: The adoption of a tool developed to aid the bug report tracking process, focusing on search and analysis of bug reports to avoid duplicates.
      Experiment type: Off-line experiment (Wohlin et al., 2000)
        • Subjects: 18 Ph.D. and M.Sc. students from the Computer Science department at the Federal University of Pernambuco, Brazil
        • Performed in a distributed fashion (no place restrictions)
        • Bug reports from the Firefox open-source project
  • 46. Planning
      Subjects selection. Selected by convenience sampling (Wohlin et al., 2000; Kitchenham and Pfleeger, 2002)
      Instrumentation:
        • 32 error descriptions concerning the Firefox project
            • 50% with defects that already have bug reports describing them in the repository
            • 50% with unique/not-reported defects
        • Guidelines for the experiment execution (FAQ)
        • Time-sheets to collect the time spent on search and analysis
      Quantitative analysis: Descriptive statistics and hypothesis testing [t-test (Wohlin et al., 2000)]
      Qualitative analysis: Questionnaire
  • 47. Planning [2]
      Null hypothesis H0:
        • µ(time with BAST) > µ(time with baseline)
        • µ(duplicates avoided with BAST) < µ(duplicates avoided with baseline)
      Alternative hypothesis H1:
        • µ(time with BAST) < µ(time with baseline)
        • µ(duplicates avoided with BAST) > µ(duplicates avoided with baseline)
      Independent variables. The tool used (BAST or Bugzilla)
      Dependent variables. (a) the amount of duplicate bug reports and (b) the time spent on search and analysis
  • 48. Planning [3]: Experiment design (figure)
  • 49. Analysis and interpretation
      Descriptive statistics
      Statistic | Time spent on analysis (BAST) | Time spent on analysis (Bugzilla) | Bug reports avoided (BAST) | Bug reports avoided (Bugzilla)
      Mean | 4.54 | 4.32 | 7.56 | 8.33
      Maximum | 6.84 | 9.56 | 13 | 12
      Minimum | 1.78 | 2.47 | 0 | 0
      SD | 1.49 | 1.91 | 3.5 | 3.2
      Table: Descriptive statistics.
  • 50. Analysis and interpretation [2]: Descriptive statistics [2]. Figure: Box plot for time spent.
  • 51. Analysis and interpretation [3]: Descriptive statistics [3]. Figure: Box plot for duplicates avoided.
  • 52. Analysis and interpretation [4]
      Hypothesis test
      Statistic | Time spent on analysis | Duplicates avoided
      t0 | 0.6292 | -1.2466
      Degrees of freedom | 17 | 17
      p-value | 0.5376 | 0.2294
      T distribution | 2.11 | 2.11
      Result (t0 > T) | H0: not rejected | H0: not rejected
      Analysis of dependency
      Factor | BAST time | Bugzilla time | BAST duplicates | Bugzilla duplicates
      Years of experience | -0.13 | -0.02 | -0.19 | 0.18
      Number of projects | -0.11 | 0.37 | -0.28 | -0.025
      Bug trackers used | -0.16 | 0.35 | -0.26 | 0.05
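With 18 subjects and 17 degrees of freedom, the hypothesis test is consistent with a paired t-test on per-subject differences; that reading is an assumption, since the slides only name the t-test. The sketch below shows how such a statistic could be computed and compared with the tabulated value of 2.11. The per-subject measurements are hypothetical, and the use of the absolute value in the decision rule is also an assumption (the slide writes the rule as t0 > T).

```python
import math
import statistics

T_CRITICAL = 2.11  # T distribution value reported on the slide for 17 degrees of freedom


def paired_t(sample_a, sample_b):
    """Return (t0, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


def rejects_h0(t0, critical=T_CRITICAL):
    """Assumed two-sided rule: reject H0 only if |t0| exceeds the critical value."""
    return abs(t0) > critical


# Hypothetical per-subject times for 18 subjects, NOT the experiment's raw data.
bast = [4.1, 5.0, 3.2, 6.8, 4.4, 2.9, 5.5, 3.8, 4.7,
        6.1, 1.8, 4.0, 5.2, 3.6, 4.9, 6.3, 2.4, 5.8]
bugzilla = [3.9, 5.6, 2.5, 7.1, 4.0, 3.3, 5.1, 4.2, 4.1,
            9.5, 2.5, 3.7, 4.8, 3.9, 4.4, 5.0, 2.6, 4.6]

t0, dof = paired_t(bast, bugzilla)
print(f"hypothetical data: t0 = {t0:.4f}, degrees of freedom = {dof}")

# Applying the rule to the t0 values reported on the slide.
for label, reported_t0 in [("time spent", 0.6292), ("duplicates avoided", -1.2466)]:
    print(label, "-> H0 rejected" if rejects_h0(reported_t0) else "-> H0 not rejected")
```

Both reported statistics fall well inside the critical value, which matches the "H0: not rejected" entries in the table.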
  • 54. Qualitative analysis
      • BAST features. Seven (7) subjects used the filter features provided by the tool.
      • BAST usability. Only one subject mentioned some difficulty using the filters, and only one had a problem with the ordering features.
      • BAST usefulness. Fifteen (15) subjects believe that the way bug report details are presented in BAST is more useful for the analysis than in Bugzilla.
      Testimonials
        • “in fact, the way details are presented saves time to check them, since it is not necessary to open extra tabs or windows to see the details”
        • Another subject wrote: “it became easier to identify the duplicate bug reports and navigate among the details of them”.
  • 56. Validity Threats
      • Boredom
      • Lack of Historical Data
      • Environment
      • Subjects' Knowledge on bug reports
      • Errors re-descriptions and fictitious errors
      • Halo Effect
      • Internet Connection Constraints
  • 57. Outline: section 6, Related Work
  • 58. Related work
      Automated Support for Classifying Software Failure Reports (Podgurski et al., 2003)
        • Bug reports: Software failures automatically submitted
        • Technique: Supervised and unsupervised pattern classification and multivariate visualization
        • Testing: Batch runs
        • Dataset: GCC, Jikes, and JavaC
      Assisted Detection of Duplicate Bug Reports (Hiew, 2006)
        • Bug reports: Natural language bug reports
        • Technique: Organize similar bug reports into centroids using TF-IDF
        • Testing: Batch runs
        • Dataset: Firefox, Eclipse, Apache, and Fedora Core
        • Results: Precision of 29% and recall of 50%
  • 59. Related work [2]
      Detection of Duplicate Defect Reports Using Natural Language Processing (Runeson et al., 2007)
        • Bug reports: Natural language bug reports
        • Technique: Natural Language Processing (NLP)
        • Testing: Batch runs and a tool
        • Dataset: Sony Ericsson Mobile Communications
        • Results: Recall of 40%
      An Approach to Detecting Duplicate Bug Reports Using Natural Language and Execution Information (Wang et al., 2008)
        • Bug reports: Natural language bug reports
        • Technique: NLP and execution information
        • Testing: Batch runs
        • Dataset: Firefox and Eclipse
        • Results: Recall of 67%-93% at its best
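The related work is compared through precision and recall. As a reminder of what those figures mean, here is a minimal sketch using the standard set-based definitions; the cited studies may measure recall differently (for instance over a top-N suggestion list), which this sketch does not model. The report IDs are hypothetical.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for predicted versus true duplicate IDs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: a detector flags four reports, two of which are real duplicates,
# while the repository actually contains four duplicates.
flagged = ["bug11", "bug12", "bug13", "bug14"]
true_duplicates = ["bug11", "bug12", "bug21", "bug22"]

precision, recall = precision_recall(flagged, true_duplicates)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
```

A result such as "precision of 29% and recall of 50%" therefore means that fewer than a third of the flagged reports were real duplicates, while half of the real duplicates were found.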
  • 60. Outline: section 7, Conclusion
  • 61. Research contribution
      • A taxonomy for the bug repositories mining area
      • The state-of-the-art on mining bug repositories
      • A characterization of the bug report duplication problem
      • A tool to reduce the time spent on search and analysis of bug reports
      • A case study to evaluate the proposed tool
      • An experiment with 18 subjects to evaluate the tool
  • 62. Papers
      • Cavalcanti, Y. C., Martins, A. C., de Almeida, E. S., and de Lemos Meira, S. R. (2008a). Avoiding Duplicate CR reports in Open Source Software Projects. In The 9th International Free Software Forum (IFSF’08), Porto Alegre, Brazil.
      • Cavalcanti, Y. C., de Almeida, E. S., da Cunha, C. E. A., Pinto, E. R., and Meira, S. R. L. (2008b). The Bug Report Duplication Problem: A Characterization Study. Technical report, C.E.S.A.R and Federal University of Pernambuco.
      • Papers for the Case Study and for the Experiment
      • Two more journal papers are being written (characterization and thesis)
  • 63. Future Work
      • Evolve from prototype
      • Information visualization
      • Alternative integration methods
      • Provide integration with other tools
      • Search and ranking techniques
      • Comments of a bug report
      • Number of informal references
      • Experiment replications