SlideShare ist ein Scribd-Unternehmen logo
1 von 34
CHANGES and BUGS
Mining and Predicting Software Development Activities




                  Thomas Zimmermann
Software development



         Build
Collaboration




Comm.      Version      Bug
Archive    Archive    Database


  Mining Software Archives
MY THESIS                                                             .
additions analysis architecture archives aspects   bug cached calls
changes collaboration complexities component concerns cross-
cutting cvs data defects design development drawing dynamine
eclipse effort evolves failures fine-grained fix fix-inducing
graphs     hatari   history locate matching method mining
predicting program programmers report repositories
revision software support system taking transactions
version visualizing
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);
}

public void dispose() {
    ...
    getSite().getPage().removePartListener(partListener);
}
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);
}

public void dispose() {         co-added
    ...
    getSite().getPage().removePartListener(partListener);
}
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...                                                      close
    // add listener for editor page activation      open
    getSite().getPage().addPartListener(partListener);      println
}

public void dispose() {          co-added
    ...
    getSite().getPage().removePartListener(partListener);
}                                                             begin




           Co-added items = patterns
Fine-grained analysis
                  public static final native void _XFree(int address);
                  public static final void XFree(int /*long*/ address) {
                        lock.lock();
                        try {
                              _XFree(address);
                        } finally {
                              lock.unlock();
                        }
                  }

                                  D IN
                              N GE I O N S
                          CHA CAT
                         1284 LO


Crosscutting changes = aspect candidates
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).
Bugs! Bugs! Bugs!
Quality assurance is limited...

   ...by time...   ...and by money.
Spent resources on the
components that need it most,
  i.e., are most likely to fail.
Indicators of defects

Code complexity              Code churn
Complex Code is more         Changes are likely to
prone to defects.            introduce new defects.



History                      Dependencies
Code with past defects is    Using compiler packages
more likely to have future   is more difficult than using
defects,                     packages for UI.
2252 Binaries
28.3 MLOC
Hypotheses

Complexity of dependency graphs                             Sub
                                                          system
correlates with the number of post-release defects (H1)    level
can predict the number of post-release defects (H2)



Network measures on dependency graphs                     Binary
correlate with the number of post-release defects (H3)     level

can predict the number of post-release defects (H4)
can indicate critical “escrow” binaries (H5)
DATA.   .
Data collection
                      six months
 Release point for
                       to collect
Windows Server 2003
                        defects



  Dependencies

Network Measures

Complexity Metrics     Defects
Centrality




Degree                         Closeness                           Betweenness
Blue binary has dependencies   Blue binary is close to all other   Blue binary connects the left
to many other binaries         binaries (only two steps)           with the right graph (bridge)
Centrality
• Degreethe number dependencies
          centrality
   -
   counts

• Closeness centrality binaries into account
   -
   takes distance to all other
   - Closeness: How close are the other binaries?
   - Reach: How many binaries can be reached (weighted)?
   - Eigenvector: similar to Pagerank
• Betweenness centrality paths through a binary
   -
   counts the number of shortest
Complexity metrics
Group                  Metrics                                 Aggregation
Module metrics         # functions in B
for a binary B         # global variables in B
                       # executable lines in f()
                       # parameters in f()
Per-function metrics                                              Total
                       # functions calling f()
for a function f()                                                Max
                       # functions called by f()
                       McCabe’s cyclomatic complexity of f()
                       # methods in C
                       # subclasses of C
OO metrics                                                        Total
                       Depth of C in the inheritance tree
for a class C                                                     Max
                       Coupling between classes
                       Cyclic coupling between classes
RESULTS.   .
Prediction


Input metrics and measures   Model        Prediction
                               PCA
                             Regression
  Metrics                                     Classification
                 SNA

 Metrics+SNA                                   Ranking
Classification


Has a binary a defect or not?




            or
Ranking


Which binaries have the most defects?




    or                or ... or
Random splits




4×50×
Classification
 (logistic regression)
Classification
            (logistic regression)




SNA increases the recall by 0.10 (at p=0.01)
  while precision remains comparable.
Ranking
          (linear regression)




SNA+METRICS increases the correlation
    by 0.10 (significant at p=0.01)
FUTURE WORK                                                        .
                                         bug cached calls
                          bug changes collaboration
additions analysis architecture archives aspects
analysis archives aspects
changes collaboration complexities component concerns cross-
complexities component concerns cross-cutting cvs data defects
cutting cvs data defects design development drawing dynamine
design development drawing eclipse erose evolves factor
eclipse effort evolvesfix-inducing fine-grained fix fix-inducing
failures fine-grained fix
                          failures
                                   fm graphs guide hatari
graphs hatari history locate matching method mining
history human matching mining networking
predicting program programmers report repositories
predicting program programmers system report repositories
revision software support
                               quality
                                        taking transactions
revision social software support system taking version
version visualizing
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).

Weitere ähnliche Inhalte

Was ist angesagt?

Cascading behavior in the networks
Cascading behavior in the networksCascading behavior in the networks
Cascading behavior in the networksVani Kandhasamy
 
Data mining with differential privacy
Data mining with differential privacy Data mining with differential privacy
Data mining with differential privacy Wei-Yuan Chang
 
Being an Iranian Woman Today イラン人女性として現代に生きるということ
Being an Iranian Woman Today イラン人女性として現代に生きるということBeing an Iranian Woman Today イラン人女性として現代に生きるということ
Being an Iranian Woman Today イラン人女性として現代に生きるということParisa Mehran
 
Data mining on Social Media
Data mining on Social MediaData mining on Social Media
Data mining on Social Mediahome
 
PHISHING DETECTION
PHISHING DETECTIONPHISHING DETECTION
PHISHING DETECTIONumme ayesha
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Xiaohan Zeng
 
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22GiacomoBalloccu
 
Community detection in social networks
Community detection in social networksCommunity detection in social networks
Community detection in social networksFrancisco Restivo
 
5.5 graph mining
5.5 graph mining5.5 graph mining
5.5 graph miningKrish_ver2
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis WorkshopData Works MD
 
Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain BGA Cyber Security
 
PageRank and Markov Chain
PageRank and Markov ChainPageRank and Markov Chain
PageRank and Markov ChainGenioAladino
 
Community Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewCommunity Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewSatyaki Sikdar
 
Jpdcs1 data leakage detection
Jpdcs1 data leakage detectionJpdcs1 data leakage detection
Jpdcs1 data leakage detectionChaitanya Kn
 
Adversarial Attacks for Recommender Systems
Adversarial Attacks for Recommender SystemsAdversarial Attacks for Recommender Systems
Adversarial Attacks for Recommender SystemsWQ Fan
 

Was ist angesagt? (20)

Small world
Small worldSmall world
Small world
 
Cascading behavior in the networks
Cascading behavior in the networksCascading behavior in the networks
Cascading behavior in the networks
 
Data mining with differential privacy
Data mining with differential privacy Data mining with differential privacy
Data mining with differential privacy
 
Being an Iranian Woman Today イラン人女性として現代に生きるということ
Being an Iranian Woman Today イラン人女性として現代に生きるということBeing an Iranian Woman Today イラン人女性として現代に生きるということ
Being an Iranian Woman Today イラン人女性として現代に生きるということ
 
Data mining on Social Media
Data mining on Social MediaData mining on Social Media
Data mining on Social Media
 
PHISHING DETECTION
PHISHING DETECTIONPHISHING DETECTION
PHISHING DETECTION
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22
Hands on Explainable Recommender Systems with Knowledge Graphs @ RecSys22
 
Community detection in social networks
Community detection in social networksCommunity detection in social networks
Community detection in social networks
 
5.5 graph mining
5.5 graph mining5.5 graph mining
5.5 graph mining
 
Social Network Analysis Workshop
Social Network Analysis WorkshopSocial Network Analysis Workshop
Social Network Analysis Workshop
 
Link Analysis
Link AnalysisLink Analysis
Link Analysis
 
SPADE -
SPADE - SPADE -
SPADE -
 
Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain Machine Learning in Cyber Security Domain
Machine Learning in Cyber Security Domain
 
Apriori Algorithm
Apriori AlgorithmApriori Algorithm
Apriori Algorithm
 
PageRank and Markov Chain
PageRank and Markov ChainPageRank and Markov Chain
PageRank and Markov Chain
 
Community Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief OverviewCommunity Detection in Social Networks: A Brief Overview
Community Detection in Social Networks: A Brief Overview
 
Lecture #01
Lecture #01Lecture #01
Lecture #01
 
Jpdcs1 data leakage detection
Jpdcs1 data leakage detectionJpdcs1 data leakage detection
Jpdcs1 data leakage detection
 
Adversarial Attacks for Recommender Systems
Adversarial Attacks for Recommender SystemsAdversarial Attacks for Recommender Systems
Adversarial Attacks for Recommender Systems
 

Ähnlich wie Changes and Bugs: Mining and Predicting Development Activities

Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesThomas Zimmermann
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsThomas Zimmermann
 
A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software developmentMartin Pinzger
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...Stefano Dalla Palma
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your CodeNate Abele
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0Nate Abele
 
CMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureCMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureMasud Rahman
 
Measuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedMeasuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedDennis de Greef
 
Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Michel Wermelinger
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosaPharo
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningGuido A. Ciollaro
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz QuestionsGanesh Samarthyam
 
Linq To The Enterprise
Linq To The EnterpriseLinq To The Enterprise
Linq To The EnterpriseDaniel Egan
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability predictionJULIO GONZALEZ SANZ
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Martin Pinzger
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsRoberto Natella
 

Ähnlich wie Changes and Bugs: Mining and Predicting Development Activities (20)

Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency Graphs
 
A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software development
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
 
CMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureCMPT470-usask-guest-lecture
CMPT470-usask-guest-lecture
 
Measuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedMeasuring maintainability; software metrics explained
Measuring maintainability; software metrics explained
 
Of Bugs and Men
Of Bugs and MenOf Bugs and Men
Of Bugs and Men
 
Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)
 
CSMR06a.ppt
CSMR06a.pptCSMR06a.ppt
CSMR06a.ppt
 
MSR Asia Summit
MSR Asia SummitMSR Asia Summit
MSR Asia Summit
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine Learning
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz Questions
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz Questions
 
Linq To The Enterprise
Linq To The EnterpriseLinq To The Enterprise
Linq To The Enterprise
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability prediction
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software Bugs
 

Mehr von Thomas Zimmermann

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing InformationThomas Zimmermann
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsThomas Zimmermann
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development Thomas Zimmermann
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedThomas Zimmermann
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user researchThomas Zimmermann
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsThomas Zimmermann
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchThomas Zimmermann
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic modelsThomas Zimmermann
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software developmentThomas Zimmermann
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedThomas Zimmermann
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect predictionThomas Zimmermann
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceThomas Zimmermann
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Thomas Zimmermann
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringThomas Zimmermann
 
Mining Workspace Updates in CVS
Mining Workspace Updates in CVSMining Workspace Updates in CVS
Mining Workspace Updates in CVSThomas Zimmermann
 
Mining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentMining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentThomas Zimmermann
 

Mehr von Thomas Zimmermann (20)

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing Information
 
MSR 2013 Preview
MSR 2013 PreviewMSR 2013 Preview
MSR 2013 Preview
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode Operations
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get Reopened
 
Klingon Countdown Timer
Klingon Countdown TimerKlingon Countdown Timer
Klingon Countdown Timer
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user research
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignments
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft Research
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic models
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software development
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixed
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect prediction
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open Source
 
Meet Tom and his Fish
Meet Tom and his FishMeet Tom and his Fish
Meet Tom and his Fish
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software Engineering
 
Mining Workspace Updates in CVS
Mining Workspace Updates in CVSMining Workspace Updates in CVS
Mining Workspace Updates in CVS
 
Mining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentMining Software Archives to Support Software Development
Mining Software Archives to Support Software Development
 
Unit testing with JUnit
Unit testing with JUnitUnit testing with JUnit
Unit testing with JUnit
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Kürzlich hochgeladen (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Changes and Bugs: Mining and Predicting Development Activities

  • 1. CHANGES and BUGS Mining and Predicting Software Development Activities Thomas Zimmermann
  • 3. Collaboration Comm. Version Bug Archive Archive Database Mining Software Archives
  • 4. MY THESIS . additions analysis architecture archives aspects bug cached calls changes collaboration complexities component concerns cross- cutting cvs data defects design development drawing dynamine eclipse effort evolves failures fine-grained fix fix-inducing graphs hatari history locate matching method mining predicting program programmers report repositories revision software support system taking transactions version visualizing
  • 5. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).
  • 6. Fine-grained analysis public void createPartControl(Composite parent) { ... // add listener for editor page activation getSite().getPage().addPartListener(partListener); } public void dispose() { ... getSite().getPage().removePartListener(partListener); }
  • 7. Fine-grained analysis public void createPartControl(Composite parent) { ... // add listener for editor page activation getSite().getPage().addPartListener(partListener); } public void dispose() { co-added ... getSite().getPage().removePartListener(partListener); }
  • 8. Fine-grained analysis public void createPartControl(Composite parent) { ... close // add listener for editor page activation open getSite().getPage().addPartListener(partListener); println } public void dispose() { co-added ... getSite().getPage().removePartListener(partListener); } begin Co-added items = patterns
  • 9. Fine-grained analysis public static final native void _XFree(int address); public static final void XFree(int /*long*/ address) { lock.lock(); try { _XFree(address); } finally { lock.unlock(); } } D IN N GE I O N S CHA CAT 1284 LO Crosscutting changes = aspect candidates
  • 10. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).
  • 12. Quality assurance is limited... ...by time... ...and by money.
  • 13. Spent resources on the components that need it most, i.e., are most likely to fail.
  • 14. Indicators of defects Code complexity Code churn Complex Code is more Changes are likely to prone to defects. introduce new defects. History Dependencies Code with past defects is Using compiler packages more likely to have future is more difficult than using defects, packages for UI.
  • 16. Hypotheses Complexity of dependency graphs Sub system correlates with the number of post-release defects (H1) level can predict the number of post-release defects (H2) Network measures on dependency graphs Binary correlate with the number of post-release defects (H3) level can predict the number of post-release defects (H4) can indicate critical “escrow” binaries (H5)
  • 17. DATA. .
  • 18. Data collection six months Release point for to collect Windows Server 2003 defects Dependencies Network Measures Complexity Metrics Defects
  • 19. Centrality Degree Closeness Betweenness Blue binary has dependencies Blue binary is close to all other Blue binary connects the left to many other binaries binaries (only two steps) with the right graph (bridge)
  • 20. Centrality • Degreethe number dependencies centrality - counts • Closeness centrality binaries into account - takes distance to all other - Closeness: How close are the other binaries? - Reach: How many binaries can be reached (weighted)? - Eigenvector: similar to Pagerank • Betweenness centrality paths through a binary - counts the number of shortest
  • 21. Complexity metrics Group Metrics Aggregation Module metrics # functions in B for a binary B # global variables in B # executable lines in f() # parameters in f() Per-function metrics Total # functions calling f() for a function f() Max # functions called by f() McCabe’s cyclomatic complexity of f() # methods in C # subclasses of C OO metrics Total Depth of C in the inheritance tree for a class C Max Coupling between classes Cyclic coupling between classes
  • 22. RESULTS. .
  • 23. Prediction Input metrics and measures Model Prediction PCA Regression Metrics Classification SNA Metrics+SNA Ranking
  • 24. Classification Has a binary a defect or not? or
  • 25. Ranking Which binaries have the most defects? or or ... or
  • 28. Classification (logistic regression) SNA increases the recall by 0.10 (at p=0.01) while precision remains comparable.
  • 29. Ranking (linear regression) SNA+METRICS increases the correlation by 0.10 (significant at p=0.01)
  • 30. FUTURE WORK . bug cached calls bug changes collaboration additions analysis architecture archives aspects analysis archives aspects changes collaboration complexities component concerns cross- complexities component concerns cross-cutting cvs data defects cutting cvs data defects design development drawing dynamine design development drawing eclipse erose evolves factor eclipse effort evolvesfix-inducing fine-grained fix fix-inducing failures fine-grained fix failures fm graphs guide hatari graphs hatari history locate matching method mining history human matching mining networking predicting program programmers report repositories predicting program programmers system report repositories revision software support quality taking transactions revision social software support system taking version version visualizing
  • 31. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 32. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 33. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 34. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).