Several studies have experimented with data mining algorithms to predict the fix-time of reported bugs. Unfortunately, fix-times as reported in typical open-source projects are heavily skewed: a significant number of reports register fix-times of less than a few minutes. Consequently, we propose an additional filtering step to improve the quality of the underlying data and thereby obtain better results. Using a small-scale replication of a previously published bug fix-time prediction experiment, we show that the additional filtering of reported bugs indeed improves the results.
1. Filtering Bug Reports for Fix-Time Analysis
Ahmed Lamkanfi, Serge Demeyer
Antwerp Systems and Software Modelling (Ansymo)
Proceedings of the 16th European Conference on Software Maintenance and Reengineering (CSMR)
2. Bug Report Fix-Time Prediction
✓ "Predicting Eclipse Bug Lifetimes" (Panjer et al.)
✓ "A Comparative Exploration of FreeBSD Bug Lifetimes" (Bougie et al.)
✓ "Predicting the Fix Time of Bugs" (Giger et al.)
5. History of all reported bugs
Bug Database:
✓ Uncover facts about the history
✓ Make predictions about the future
Fix-time of a bug?
✓ Time between opening and resolving a bug.
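As an illustration, the fix-time defined above can be computed directly from a report's timestamps. A minimal sketch in Python; the timestamp format is an assumption for the example, not taken from any particular bug tracker:

```python
from datetime import datetime

def fix_time_days(opened: str, resolved: str) -> float:
    """Fix-time of a bug: time between opening and resolving, in days."""
    fmt = "%Y-%m-%d %H:%M"  # assumed timestamp format for this sketch
    delta = datetime.strptime(resolved, fmt) - datetime.strptime(opened, fmt)
    return delta.total_seconds() / 86400.0

# A report resolved two minutes after opening yields a near-zero
# fix-time; such reports are the skewed cases the deck discusses.
print(fix_time_days("2011-03-01 10:00", "2011-03-01 10:02"))
```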
12. Ask a developer!
➡ "The developer already has the necessary code changes ready to fix a bug, then files a bug report to make sure it's getting tracked in the system."
14. Filtering out unreliable reports?
✓ How does this impact the accuracy?
Small experiment:
✓ Based on the experiment from "Predicting the Fix Time of Bugs" by Giger et al. (2010)
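The filtering step itself fits in a few lines. A hypothetical sketch, assuming fix-times are available as a plain list of day counts and using the threshold the deck settles on (half of the first quartile):

```python
import statistics

def filter_reports(fix_times, ratio=0.5):
    """Drop suspiciously fast fixes: keep only reports whose fix-time
    is at least `ratio` times the first quartile of all fix-times."""
    q1 = statistics.quantiles(fix_times, n=4)[0]  # first quartile
    threshold = ratio * q1
    return [t for t in fix_times if t >= threshold]

# A report "fixed" in 0.001 days (about 86 seconds) is filtered out.
print(filter_reports([0.001, 1, 2, 3, 4, 5, 6, 7]))
```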
15. Train from the history of bug reports
✓ Fields are extracted from the reports
➡ day opened, month opened, platform, reporter, severity, ...
✓ A Naïve Bayes classifier learns the characteristics from the reports
✓ 10-fold cross-validation
Bugs are grouped into two sets:
✓ Fast: fix-time ≤ median
✓ Slow: fix-time > median
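The grouping and cross-validation steps above can be sketched in plain Python (standing in for the actual toolchain, which the deck does not name):

```python
import statistics

def label_reports(fix_times):
    """Split bugs into two roughly balanced classes around the median
    fix-time: 'fast' if fix-time <= median, 'slow' otherwise."""
    median = statistics.median(fix_times)
    return ["fast" if t <= median else "slow" for t in fix_times]

def kfold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

print(label_reports([1, 2, 3, 4]))  # median is 2.5
```

Each of the 10 folds serves once as the test set while the classifier trains on the other nine, so every report is predicted exactly once.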
17. Evaluation
✓ Receiver Operating Characteristic (ROC) curve
✓ Area Under Curve (AUC): 0.5 is random prediction; 1.0 is perfect classification
Two-fold experiment:
✓ With and without the filtering of bug reports
✓ Threshold for filtering set to 1/2 of the first quartile
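The AUC measure can be computed without drawing the ROC curve at all, via the equivalent rank statistic; a self-contained sketch:

```python
def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    is scored higher than a randomly chosen negative one (ties count
    half). 0.5 means random guessing, 1.0 a perfect ranking."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # perfect ranking -> 1.0
```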
20. Conclusions
✓ More investigation needed when dealing with real-world data
✓ Some bugs are fixed conspicuously fast!
✓ More preprocessing/filtering may lead to improved results