SlideShare ist ein Scribd-Unternehmen logo
1 von 34
CHANGES and BUGS
Mining and Predicting Software Development Activities




                  Thomas Zimmermann
Software development



         Build
Collaboration




Comm.      Version      Bug
Archive    Archive    Database


  Mining Software Archives
MY THESIS                                                             .
additions analysis architecture archives aspects   bug cached calls
changes collaboration complexities component concerns cross-
cutting cvs data defects design development drawing dynamine
eclipse effort evolves failures fine-grained fix fix-inducing
graphs     hatari   history locate matching method mining
predicting program programmers report repositories
revision software support system taking transactions
version visualizing
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);
}

public void dispose() {
    ...
    getSite().getPage().removePartListener(partListener);
}
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);
}

public void dispose() {         co-added
    ...
    getSite().getPage().removePartListener(partListener);
}
Fine-grained analysis

public void createPartControl(Composite parent) {
    ...                                                      close
    // add listener for editor page activation      open
    getSite().getPage().addPartListener(partListener);      println
}

public void dispose() {          co-added
    ...
    getSite().getPage().removePartListener(partListener);
}                                                             begin




           Co-added items = patterns
Fine-grained analysis
                  public static final native void _XFree(int address);
                  public static final void XFree(int /*long*/ address) {
                        lock.lock();
                        try {
                              _XFree(address);
                        } finally {
                              lock.unlock();
                        }
                  }

                                  D IN
                              N GE I O N S
                          CHA CAT
                         1284 LO


Crosscutting changes = aspect candidates
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).
Bugs! Bugs! Bugs!
Quality assurance is limited...

   ...by time...   ...and by money.
Spent resources on the
components that need it most,
  i.e., are most likely to fail.
Indicators of defects

Code complexity              Code churn
Complex Code is more         Changes are likely to
prone to defects.            introduce new defects.



History                      Dependencies
Code with past defects is    Using compiler packages
more likely to have future   is more difficult than using
defects,                     packages for UI.
2252 Binaries
28.3 MLOC
Hypotheses

Complexity of dependency graphs                             Sub
                                                          system
correlates with the number of post-release defects (H1)    level
can predict the number of post-release defects (H2)



Network measures on dependency graphs                     Binary
correlate with the number of post-release defects (H3)     level

can predict the number of post-release defects (H4)
can indicate critical “escrow” binaries (H5)
DATA.   .
Data collection
                      six months
 Release point for
                       to collect
Windows Server 2003
                        defects



  Dependencies

Network Measures

Complexity Metrics     Defects
Centrality




Degree                         Closeness                           Betweenness
Blue binary has dependencies   Blue binary is close to all other   Blue binary connects the left
to many other binaries         binaries (only two steps)           with the right graph (bridge)
Centrality
• Degreethe number dependencies
          centrality
   -
   counts

• Closeness centrality binaries into account
   -
   takes distance to all other
   - Closeness: How close are the other binaries?
   - Reach: How many binaries can be reached (weighted)?
   - Eigenvector: similar to Pagerank
• Betweenness centrality paths through a binary
   -
   counts the number of shortest
Complexity metrics
Group                  Metrics                                 Aggregation
Module metrics         # functions in B
for a binary B         # global variables in B
                       # executable lines in f()
                       # parameters in f()
Per-function metrics                                              Total
                       # functions calling f()
for a function f()                                                Max
                       # functions called by f()
                       McCabe’s cyclomatic complexity of f()
                       # methods in C
                       # subclasses of C
OO metrics                                                        Total
                       Depth of C in the inheritance tree
for a class C                                                     Max
                       Coupling between classes
                       Cyclic coupling between classes
RESULTS.   .
Prediction


Input metrics and measures   Model        Prediction
                               PCA
                             Regression
  Metrics                                     Classification
                 SNA

 Metrics+SNA                                   Ranking
Classification


Has a binary a defect or not?




            or
Ranking


Which binaries have the most defects?




    or                or ... or
Random splits




4×50×
Classification
 (logistic regression)
Classification
            (logistic regression)




SNA increases the recall by 0.10 (at p=0.01)
  while precision remains comparable.
Ranking
          (linear regression)




SNA+METRICS increases the correlation
    by 0.10 (significant at p=0.01)
FUTURE WORK                                                        .
                                         bug cached calls
                          bug changes collaboration
additions analysis architecture archives aspects
analysis archives aspects
changes collaboration complexities component concerns cross-
complexities component concerns cross-cutting cvs data defects
cutting cvs data defects design development drawing dynamine
design development drawing eclipse erose evolves factor
eclipse effort evolvesfix-inducing fine-grained fix fix-inducing
failures fine-grained fix
                          failures
                                   fm graphs guide hatari
graphs hatari history locate matching method mining
history human matching mining networking
predicting program programmers report repositories
predicting program programmers system report repositories
revision software support
                               quality
                                        taking transactions
revision social software support system taking version
version visualizing
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
"Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
Contributions of the thesis

Fine-grained analysis of version archives.              1
Project-specific usage patterns of methods (FSE 2005)
Identification of cross-cutting changes (ASE 2006)



Mining bug databases to predict defects.                2
Dependencies predict defects (ISSRE 2007, ICSE 2008)
Domino effect: depending on defect-prone binaries increases
the chances of having defects (Software Evolution 2008).

Weitere ähnliche Inhalte

Was ist angesagt?

Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Information retrieval 9 tf idf weights
Information retrieval 9 tf idf weightsInformation retrieval 9 tf idf weights
Information retrieval 9 tf idf weightsVaibhav Khanna
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to CypherNeo4j
 
3.1 clustering
3.1 clustering3.1 clustering
3.1 clusteringKrish_ver2
 
MongoDB Fundamentals
MongoDB FundamentalsMongoDB Fundamentals
MongoDB FundamentalsMongoDB
 
Apache Hadoop In Theory And Practice
Apache Hadoop In Theory And PracticeApache Hadoop In Theory And Practice
Apache Hadoop In Theory And PracticeAdam Kawa
 
Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)Jeet Das
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it WorksMike Dirolf
 
Cross-lingual Information Retrieval
Cross-lingual Information RetrievalCross-lingual Information Retrieval
Cross-lingual Information RetrievalShadi Saleh
 
Foundations of Machine Learning
Foundations of Machine LearningFoundations of Machine Learning
Foundations of Machine Learningmahutte
 
Neo4j in Production: A look at Neo4j in the Real World
Neo4j in Production: A look at Neo4j in the Real WorldNeo4j in Production: A look at Neo4j in the Real World
Neo4j in Production: A look at Neo4j in the Real WorldNeo4j
 
ntroducing to the Power of Graph Technology
ntroducing to the Power of Graph Technologyntroducing to the Power of Graph Technology
ntroducing to the Power of Graph TechnologyNeo4j
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Kyunghoon Kim
 
Module: InterPlanetary Linked Data (IPLD)
Module: InterPlanetary Linked Data (IPLD)Module: InterPlanetary Linked Data (IPLD)
Module: InterPlanetary Linked Data (IPLD)Ioannis Psaras
 
Community detection algorithms
Community detection algorithmsCommunity detection algorithms
Community detection algorithmsAlireza Andalib
 

Was ist angesagt? (20)

Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Hierachical clustering
Hierachical clusteringHierachical clustering
Hierachical clustering
 
Information retrieval 9 tf idf weights
Information retrieval 9 tf idf weightsInformation retrieval 9 tf idf weights
Information retrieval 9 tf idf weights
 
Intro to Cypher
Intro to CypherIntro to Cypher
Intro to Cypher
 
3.1 clustering
3.1 clustering3.1 clustering
3.1 clustering
 
MongoDB Fundamentals
MongoDB FundamentalsMongoDB Fundamentals
MongoDB Fundamentals
 
Apache Hadoop In Theory And Practice
Apache Hadoop In Theory And PracticeApache Hadoop In Theory And Practice
Apache Hadoop In Theory And Practice
 
Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)Information Retrieval-4(inverted index_&_query handling)
Information Retrieval-4(inverted index_&_query handling)
 
Chapter 12 outlier
Chapter 12 outlierChapter 12 outlier
Chapter 12 outlier
 
MongoDB: How it Works
MongoDB: How it WorksMongoDB: How it Works
MongoDB: How it Works
 
Cross-lingual Information Retrieval
Cross-lingual Information RetrievalCross-lingual Information Retrieval
Cross-lingual Information Retrieval
 
Foundations of Machine Learning
Foundations of Machine LearningFoundations of Machine Learning
Foundations of Machine Learning
 
Neo4j in Production: A look at Neo4j in the Real World
Neo4j in Production: A look at Neo4j in the Real WorldNeo4j in Production: A look at Neo4j in the Real World
Neo4j in Production: A look at Neo4j in the Real World
 
ntroducing to the Power of Graph Technology
ntroducing to the Power of Graph Technologyntroducing to the Power of Graph Technology
ntroducing to the Power of Graph Technology
 
Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용Link prediction 방법의 개념 및 활용
Link prediction 방법의 개념 및 활용
 
Module: InterPlanetary Linked Data (IPLD)
Module: InterPlanetary Linked Data (IPLD)Module: InterPlanetary Linked Data (IPLD)
Module: InterPlanetary Linked Data (IPLD)
 
APRIORI Algorithm
APRIORI AlgorithmAPRIORI Algorithm
APRIORI Algorithm
 
Dbscan algorithom
Dbscan algorithomDbscan algorithom
Dbscan algorithom
 
Community detection algorithms
Community detection algorithmsCommunity detection algorithms
Community detection algorithms
 

Ähnlich wie Changes and Bugs: Mining and Predicting Development Activities

Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesThomas Zimmermann
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsThomas Zimmermann
 
A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software developmentMartin Pinzger
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...Stefano Dalla Palma
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your CodeNate Abele
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0Nate Abele
 
CMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureCMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureMasud Rahman
 
Measuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedMeasuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedDennis de Greef
 
Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Michel Wermelinger
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosaPharo
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningGuido A. Ciollaro
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz QuestionsGanesh Samarthyam
 
Linq To The Enterprise
Linq To The EnterpriseLinq To The Enterprise
Linq To The EnterpriseDaniel Egan
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability predictionJULIO GONZALEZ SANZ
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Martin Pinzger
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsRoberto Natella
 

Ähnlich wie Changes and Bugs: Mining and Predicting Development Activities (20)

Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Predicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency GraphsPredicting Defects using Network Analysis on Dependency Graphs
Predicting Defects using Network Analysis on Dependency Graphs
 
A tale of bug prediction in software development
A tale of bug prediction in software developmentA tale of bug prediction in software development
A tale of bug prediction in software development
 
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
VCCFinder: Finding Potential Vulnerabilities in Open-Source Projects to Assis...
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
 
CMPT470-usask-guest-lecture
CMPT470-usask-guest-lectureCMPT470-usask-guest-lecture
CMPT470-usask-guest-lecture
 
Measuring maintainability; software metrics explained
Measuring maintainability; software metrics explainedMeasuring maintainability; software metrics explained
Measuring maintainability; software metrics explained
 
Of Bugs and Men
Of Bugs and MenOf Bugs and Men
Of Bugs and Men
 
Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)Of Bugs and Men (and Plugins too)
Of Bugs and Men (and Plugins too)
 
CSMR06a.ppt
CSMR06a.pptCSMR06a.ppt
CSMR06a.ppt
 
MSR Asia Summit
MSR Asia SummitMSR Asia Summit
MSR Asia Summit
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosa
 
Predicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine LearningPredicting Fault-Prone Files using Machine Learning
Predicting Fault-Prone Files using Machine Learning
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz Questions
 
Software Architecture - Quiz Questions
Software Architecture - Quiz QuestionsSoftware Architecture - Quiz Questions
Software Architecture - Quiz Questions
 
Linq To The Enterprise
Linq To The EnterpriseLinq To The Enterprise
Linq To The Enterprise
 
Bayesian network based software reliability prediction
Bayesian network based software reliability predictionBayesian network based software reliability prediction
Bayesian network based software reliability prediction
 
Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)Populating a Release History Database (ICSM 2013 MIP)
Populating a Release History Database (ICSM 2013 MIP)
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software Bugs
 

Mehr von Thomas Zimmermann

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing InformationThomas Zimmermann
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsThomas Zimmermann
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development Thomas Zimmermann
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedThomas Zimmermann
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user researchThomas Zimmermann
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsThomas Zimmermann
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchThomas Zimmermann
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic modelsThomas Zimmermann
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software developmentThomas Zimmermann
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedThomas Zimmermann
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect predictionThomas Zimmermann
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceThomas Zimmermann
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Thomas Zimmermann
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringThomas Zimmermann
 
Mining Workspace Updates in CVS
Mining Workspace Updates in CVSMining Workspace Updates in CVS
Mining Workspace Updates in CVSThomas Zimmermann
 
Mining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentMining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentThomas Zimmermann
 

Mehr von Thomas Zimmermann (20)

Software Analytics = Sharing Information
Software Analytics = Sharing InformationSoftware Analytics = Sharing Information
Software Analytics = Sharing Information
 
MSR 2013 Preview
MSR 2013 PreviewMSR 2013 Preview
MSR 2013 Preview
 
Predicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode OperationsPredicting Method Crashes with Bytecode Operations
Predicting Method Crashes with Bytecode Operations
 
Analytics for smarter software development
Analytics for smarter software development Analytics for smarter software development
Analytics for smarter software development
 
Characterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get ReopenedCharacterizing and Predicting Which Bugs Get Reopened
Characterizing and Predicting Which Bugs Get Reopened
 
Klingon Countdown Timer
Klingon Countdown TimerKlingon Countdown Timer
Klingon Countdown Timer
 
Data driven games user research
Data driven games user researchData driven games user research
Data driven games user research
 
Not my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignmentsNot my bug! Reasons for software bug report reassignments
Not my bug! Reasons for software bug report reassignments
 
Empirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft ResearchEmpirical Software Engineering at Microsoft Research
Empirical Software Engineering at Microsoft Research
 
Security trend analysis with CVE topic models
Security trend analysis with CVE topic modelsSecurity trend analysis with CVE topic models
Security trend analysis with CVE topic models
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software development
 
Characterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixedCharacterizing and predicting which bugs get fixed
Characterizing and predicting which bugs get fixed
 
Cross-project defect prediction
Cross-project defect predictionCross-project defect prediction
Cross-project defect prediction
 
Quality of Bug Reports in Open Source
Quality of Bug Reports in Open SourceQuality of Bug Reports in Open Source
Quality of Bug Reports in Open Source
 
Meet Tom and his Fish
Meet Tom and his FishMeet Tom and his Fish
Meet Tom and his Fish
 
Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities Predicting Subsystem Defects using Dependency Graph Complexities
Predicting Subsystem Defects using Dependency Graph Complexities
 
Got Myth? Myths in Software Engineering
Got Myth? Myths in Software EngineeringGot Myth? Myths in Software Engineering
Got Myth? Myths in Software Engineering
 
Mining Workspace Updates in CVS
Mining Workspace Updates in CVSMining Workspace Updates in CVS
Mining Workspace Updates in CVS
 
Mining Software Archives to Support Software Development
Mining Software Archives to Support Software DevelopmentMining Software Archives to Support Software Development
Mining Software Archives to Support Software Development
 
Unit testing with JUnit
Unit testing with JUnitUnit testing with JUnit
Unit testing with JUnit
 

Kürzlich hochgeladen

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 

Kürzlich hochgeladen (20)

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Changes and Bugs: Mining and Predicting Development Activities

  • 1. CHANGES and BUGS Mining and Predicting Software Development Activities Thomas Zimmermann
  • 3. Collaboration Comm. Version Bug Archive Archive Database Mining Software Archives
  • 4. MY THESIS . additions analysis architecture archives aspects bug cached calls changes collaboration complexities component concerns cross- cutting cvs data defects design development drawing dynamine eclipse effort evolves failures fine-grained fix fix-inducing graphs hatari history locate matching method mining predicting program programmers report repositories revision software support system taking transactions version visualizing
  • 5. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).
  • 6. Fine-grained analysis public void createPartControl(Composite parent) { ... // add listener for editor page activation getSite().getPage().addPartListener(partListener); } public void dispose() { ... getSite().getPage().removePartListener(partListener); }
  • 7. Fine-grained analysis public void createPartControl(Composite parent) { ... // add listener for editor page activation getSite().getPage().addPartListener(partListener); } public void dispose() { co-added ... getSite().getPage().removePartListener(partListener); }
  • 8. Fine-grained analysis public void createPartControl(Composite parent) { ... close // add listener for editor page activation open getSite().getPage().addPartListener(partListener); println } public void dispose() { co-added ... getSite().getPage().removePartListener(partListener); } begin Co-added items = patterns
  • 9. Fine-grained analysis public static final native void _XFree(int address); public static final void XFree(int /*long*/ address) { lock.lock(); try { _XFree(address); } finally { lock.unlock(); } } D IN N GE I O N S CHA CAT 1284 LO Crosscutting changes = aspect candidates
  • 10. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).
  • 12. Quality assurance is limited... ...by time... ...and by money.
  • 13. Spent resources on the components that need it most, i.e., are most likely to fail.
  • 14. Indicators of defects Code complexity Code churn Complex Code is more Changes are likely to prone to defects. introduce new defects. History Dependencies Code with past defects is Using compiler packages more likely to have future is more difficult than using defects, packages for UI.
  • 16. Hypotheses Complexity of dependency graphs Sub system correlates with the number of post-release defects (H1) level can predict the number of post-release defects (H2) Network measures on dependency graphs Binary correlate with the number of post-release defects (H3) level can predict the number of post-release defects (H4) can indicate critical “escrow” binaries (H5)
  • 17. DATA. .
  • 18. Data collection six months Release point for to collect Windows Server 2003 defects Dependencies Network Measures Complexity Metrics Defects
  • 19. Centrality Degree Closeness Betweenness Blue binary has dependencies Blue binary is close to all other Blue binary connects the left to many other binaries binaries (only two steps) with the right graph (bridge)
  • 20. Centrality • Degreethe number dependencies centrality - counts • Closeness centrality binaries into account - takes distance to all other - Closeness: How close are the other binaries? - Reach: How many binaries can be reached (weighted)? - Eigenvector: similar to Pagerank • Betweenness centrality paths through a binary - counts the number of shortest
  • 21. Complexity metrics Group Metrics Aggregation Module metrics # functions in B for a binary B # global variables in B # executable lines in f() # parameters in f() Per-function metrics Total # functions calling f() for a function f() Max # functions called by f() McCabe’s cyclomatic complexity of f() # methods in C # subclasses of C OO metrics Total Depth of C in the inheritance tree for a class C Max Coupling between classes Cyclic coupling between classes
  • 22. RESULTS. .
  • 23. Prediction Input metrics and measures Model Prediction PCA Regression Metrics Classification SNA Metrics+SNA Ranking
  • 24. Classification Has a binary a defect or not? or
  • 25. Ranking Which binaries have the most defects? or or ... or
  • 28. Classification (logistic regression) SNA increases the recall by 0.10 (at p=0.01) while precision remains comparable.
  • 29. Ranking (linear regression) SNA+METRICS increases the correlation by 0.10 (significant at p=0.01)
  • 30. FUTURE WORK . bug cached calls bug changes collaboration additions analysis architecture archives aspects analysis archives aspects changes collaboration complexities component concerns cross- complexities component concerns cross-cutting cvs data defects cutting cvs data defects design development drawing dynamine design development drawing eclipse erose evolves factor eclipse effort evolvesfix-inducing fine-grained fix fix-inducing failures fine-grained fix failures fm graphs guide hatari graphs hatari history locate matching method mining history human matching mining networking predicting program programmers report repositories predicting program programmers system report repositories revision software support quality taking transactions revision social software support system taking version version visualizing
  • 31. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 32. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 33. "Piled Higher and Deeper" by Jorge Cham. www.phdcomics.com
  • 34. Contributions of the thesis Fine-grained analysis of version archives. 1 Project-specific usage patterns of methods (FSE 2005) Identification of cross-cutting changes (ASE 2006) Mining bug databases to predict defects. 2 Dependencies predict defects (ISSRE 2007, ICSE 2008) Domino effect: depending on defect-prone binaries increases the chances of having defects (Software Evolution 2008).