Will my system run (correctly)
after the upgrade?


Martin Pinzger
Assistant Professor
Delft University of Technology
Martin’s upgrades

[Figure: career stations: Pfunds, PhD, Postdoc, Assistant Professor]
My Experience with Software Upgrades




Bugs on upgrades get reported




Hmm, wait a minute

Can’t we learn “something” from that data?




Software repository mining for
preventing upgrade failures


Goal of software repository mining

Making the information stored in software repositories
available to software developers
  Quality analysis and defect prediction
  Recommender systems
  ...




Software repositories




Examples from my mining research

Predicting failure-prone source files using changes (MSR 2011)

The relationship between developer contributions and failures
(FSE 2008)




There are many more studies
  MSR 2012 http://2012.msrconf.org/
  A survey and taxonomy of approaches for mining software repositories in
  the context of software evolution, Kagdi et al. 2007

Using Fine-Grained Source
Code Changes for Bug
Prediction

Joint work with Emanuel Giger, Harald Gall
University of Zurich
Bug prediction

Goal
  Train models to predict the bug-prone source files of the next release


How
  Using product measures, process measures, organizational measures with
  machine learning techniques


Many existing studies on building prediction models
  Moser et al., Nagappan et al., Zimmermann et al., Hassan et al., etc.
  Process measures performed particularly well




Classical change measures

Number of file revisions

Code Churn aka lines added/deleted/changed
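
As an illustration of the churn measure, the following is a minimal sketch that counts added and deleted lines between two revisions of a file. The function name `code_churn` and the two sample revisions are assumptions made for this example; the studies cited here obtained churn from the version control history, not from this code.

```python
import difflib


def code_churn(old_lines, new_lines):
    """Return (added, deleted) line counts between two versions of a file."""
    added = deleted = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # A "replace" counts as deleted old lines plus added new lines.
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, deleted


# Hypothetical revisions of a small Java fragment.
old = [
    "if (balance > 0) {",
    "    withDraw(amount);",
    "}",
]
new = [
    "if (balance > 0 && amount <= balance) {",
    "    withDraw(amount);",
    "} else {",
    "    notify();",
    "}",
]

added, deleted = code_churn(old, new)
print("lines added:", added, "lines deleted:", deleted)
```

Line counts like these say how much text changed, but not what kind of change it was.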




Research question of this study: Can we further improve these
models?




Revisions are coarse grained

What exactly changed in a revision?




Code Churn can be imprecise

Extra changes not relevant for locating bugs




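Fine-Grained Source Code Changes (SCC)

[Figure: abstract syntax trees of Account.java 1.5 and Account.java 1.6.
 In 1.5, the IF node with condition "balance > 0" has a THEN branch containing
 the method invocation (MI) "withDraw(amount);".
 In 1.6, the condition is "balance > 0 && amount <= balance", the THEN branch
 is unchanged, and a new ELSE branch contains the method invocation "notify();".]

3 SCC: 1x condition change, 1x else-part insert, 1x invocation statement insert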
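
The Account.java example can be replayed with a toy tree comparison. This is only a sketch under strong assumptions: the node ids (`if`, `then`, `mi1`, `else`, `mi2`) are hypothetical and are assumed to be already matched between the two versions, and moves are ignored. It is not the change extraction algorithm used in the study.

```python
from typing import Dict, List, Optional, Tuple

# node id -> (node kind, label, parent id)
# Assumption: ids are already matched across versions; matching is not shown.
Tree = Dict[str, Tuple[str, str, Optional[str]]]


def classify_changes(old: Tree, new: Tree) -> List[str]:
    """List inserts, updates, and deletes between two matched trees."""
    changes = []
    for node_id, (kind, label, _parent) in new.items():
        if node_id not in old:
            changes.append(f"insert {kind} {label!r}")
        elif old[node_id][1] != label:
            changes.append(f"update {kind}: {old[node_id][1]!r} -> {label!r}")
    for node_id, (kind, label, _parent) in old.items():
        if node_id not in new:
            changes.append(f"delete {kind} {label!r}")
    return changes


account_1_5: Tree = {
    "if": ("IF", "balance > 0", None),
    "then": ("THEN", "", "if"),
    "mi1": ("MI", "withDraw(amount);", "then"),
}
account_1_6: Tree = {
    "if": ("IF", "balance > 0 && amount <= balance", None),
    "then": ("THEN", "", "if"),
    "mi1": ("MI", "withDraw(amount);", "then"),
    "else": ("ELSE", "", "if"),
    "mi2": ("MI", "notify();", "else"),
}

scc = classify_changes(account_1_5, account_1_6)
for change in scc:
    print(change)
print("number of SCC:", len(scc))
```

The three reported changes correspond to the condition change, the else-part insert, and the invocation statement insert named on the slide.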
Research hypotheses

H1   SCC is correlated with the number of
     bugs in source files

H2   SCC is a predictor for bug-prone source
     files (and outperforms LM)

H3   SCC is a predictor for the number of bugs
     in source files (and outperforms LM)



15 Eclipse plug-ins

Data
 >850’000 fine-grained source code changes (SCC)
 >10’000 files
 >9’700’000 lines modified (LM)
 >9 years of development history
 ..... and a lot of bugs referenced in commit messages




H1: SCC is correlated with #bugs

Non-parametric Spearman rank correlation of #bugs with LM and SCC.

    Eclipse Project    LM     SCC
    Compare           0.68    0.76
    jFace             0.74    0.71
    JDT Debug         0.62    0.8
    Resource          0.75    0.86
    Runtime           0.66    0.79
    Team Core         0.15    0.66
    CVS Core          0.60    0.79
    Debug Core        0.63    0.78
    jFace Text        0.75    0.74
    Update Core       0.43    0.62
    Debug UI          0.56    0.81
    JDT Debug UI      0.80    0.81
    Help              0.54    0.48
    JDT Core          0.70    0.74
    OSGI              0.70    0.77
    Median            0.66    0.77

+/-0.5 substantial
+/-0.7 strong
* significant correlation at 0.01
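
For readers unfamiliar with the measure, here is a minimal pure-Python sketch of the Spearman rank correlation: it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The helper names and the sample numbers are assumptions for illustration and are not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-file values: number of SCC and number of bugs.
scc_per_file = [12, 40, 7, 95, 23]
bugs_per_file = [1, 3, 0, 9, 4]
print(round(spearman(scc_per_file, bugs_per_file), 2))
```

Because only ranks matter, the measure does not assume a linear relationship between changes and bugs.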
Predicting bug-prone files

Bug-prone vs. not bug-prone

[...] calculate and assign a probability to a file if it is bug-prone or
not bug-prone.

For each Eclipse project we binned files into bug-prone and not bug-prone
using the median of the number of bugs per file (#bugs):

  bugClass = not bug-prone  if #bugs <= median
             bug-prone      if #bugs > median
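
The median-based labeling can be sketched in a few lines. The file names and bug counts below are hypothetical, and `label_files` is a name chosen for this example only.

```python
from statistics import median


def label_files(bugs_per_file):
    """Label each file relative to the project's median number of bugs."""
    cut_point = median(bugs_per_file.values())
    return {
        name: "bug-prone" if bugs > cut_point else "not bug-prone"
        for name, bugs in bugs_per_file.items()
    }


# Hypothetical bug counts for one project.
project = {"A.java": 0, "B.java": 1, "C.java": 4, "D.java": 9}
labels = label_files(project)
for name, label in labels.items():
    print(name, label)
```

With these counts the median is 2.5, so two files fall into each class.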

When using the median as cut point, the labeling of a file is
relative to how many bugs other files in the project have. There
are several other ways of binning files. They mainly differ in
the prior probabilities they produce: for instance,
Zimmermann et al. [40] and Bernstein et al. [4] labeled files as
bug-prone if they had at least one bug. With heavily
skewed distributions, this approach may lead to a high prior
probability for one class. Nagappan et al. [28] used a [...]
H2: SCC can predict bug-prone files

AUC values of E 1 using logistic regression with LM and SCC as predictors
for bug-prone and not bug-prone files.

   Eclipse Project   AUC LM   AUC SCC
   Compare            0.84    0.85
   jFace              0.90    0.90
   JDT Debug          0.83    0.95
   Resource           0.87    0.93
   Runtime            0.83    0.91
   Team Core          0.62    0.87
   CVS Core           0.80    0.90
   Debug Core         0.86    0.94
   jFace Text         0.87    0.87
   Update Core        0.78    0.85
   Debug UI           0.85    0.93
   JDT Debug UI       0.90    0.91
   Help               0.75    0.70
   JDT Core           0.86    0.87
   OSGI               0.88    0.88
   Median             0.85    0.90
   Overall            0.85    0.89

SCC outperforms LM
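
The AUC values can be read as the probability that a randomly chosen bug-prone file receives a higher predicted score than a randomly chosen not bug-prone file. A minimal sketch of that computation follows; the labels and scores are made up for illustration, and the function name `auc` is an assumption, not part of the study's tooling.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier output: 1 = bug-prone, 0 = not bug-prone.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking.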
Predicting the number of bugs

Non-linear regression with asymptotic model:

  f(#Bugs) = a1 + b2 * e^(b3 * SCC)

[Figure: scatter plot for Team Core with fitted curve; x-axis #SCC (0 to 4000), y-axis #Bugs (0 to 60)]
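
To make the shape of the asymptotic model concrete, the sketch below evaluates it for hypothetical coefficients and computes R^2 for a set of predictions. The coefficient values (a1 = 60, b2 = -60, b3 = -0.001) are invented so that the curve starts at 0 and levels off near 60; they are not the fitted values from the study, and the fitting step itself is not shown.

```python
import math


def asymptotic_model(scc, a1, b2, b3):
    """Evaluate a1 + b2 * e^(b3 * SCC)."""
    return a1 + b2 * math.exp(b3 * scc)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical coefficients; with b2 < 0 and b3 < 0 the curve saturates at a1.
A1, B2, B3 = 60.0, -60.0, -0.001

scc_values = [0, 500, 1000, 2000, 4000]
predicted = [asymptotic_model(s, A1, B2, B3) for s in scc_values]
for s, p in zip(scc_values, predicted):
    print(s, round(p, 1))
```

The predicted number of bugs grows quickly for small #SCC and flattens for large #SCC, which is the behavior an asymptotic model is meant to capture.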
H3: SCC can predict the number of bugs

Table 8: Results of the nonlinear regression in terms of R^2
and Spearman correlation using LM and SCC as predictors.

 Project        R^2 LM   R^2 SCC   Spearman LM   Spearman SCC
 Compare         0.84     0.88        0.68          0.76
 jFace           0.74     0.79        0.74          0.71
 JDT Debug       0.69     0.68        0.62          0.8
 Resource        0.81     0.85        0.75          0.86
 Runtime         0.69     0.72        0.66          0.79
 Team Core       0.26     0.53        0.15          0.66
 CVS Core        0.76     0.83        0.62          0.79
 Debug Core      0.88     0.92        0.63          0.78
 jFace Text      0.83     0.89        0.75          0.74
 Update Core     0.41     0.48        0.43          0.62
 Debug UI        0.7      0.79        0.56          0.81
 JDT Debug UI    0.82     0.82        0.8           0.81
 Help            0.66     0.67        0.54          0.84
 JDT Core        0.69     0.77        0.7           0.74
 OSGI            0.51     0.8         0.74          0.77
 Median          0.7      0.79        0.66          0.77
 Overall         0.65     0.72        0.62          0.74

[Figure: plot of normalized residuals]

SCC outperforms LM
Summary of results

SCC performs significantly better than LM
  Advanced learners are not always better
  Change types do not yield extra discriminatory power


Predicting the number of bugs is “possible”

More information
  “Comparing Fine-Grained Source Code Changes And Code Churn For Bug
  Prediction”, MSR 2011




What is next?

Analysis of the effect(s) of changes
  What is the effect on the design?
  What is the effect on the quality?


Ease understanding of changes

Recommender techniques
  Models that can provide feedback on the effects




Can developer-module
networks predict failures?


Joint work with Nachi Nagappan, Brendan Murphy
Microsoft Research
Research question

Are binaries with fragmented contributions from many
developers more likely to have post-release failures?
  Should developers focus on one thing?




Study with MS Vista project

Data
 Released in January, 2007
 > 4 years of development
 Several thousand developers
 Several thousand binaries (*.exe, *.dll)
 Several millions of commits




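Approach in a nutshell

[Figure: change logs are turned into a contribution network with weighted
 edges between developers (Fu, Alice, Eric, Bob, Dan, Hin, Go) and binaries
 (a, b, c); bug data provides the number of bugs per binary]

            Binary    #bugs       #centrality
              a           12         0.9
              b           7          0.5
              c           3          0.2

Regression Analysis
Validation with data splitting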
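
The first step of the approach, turning change logs into a contribution network, might look roughly like the sketch below. The commit list is hypothetical, and using the number of commits as the edge weight is an assumption of this example.

```python
from collections import defaultdict

# Hypothetical change log: one (developer, binary) pair per commit.
commits = [
    ("Alice", "a"), ("Alice", "a"), ("Bob", "a"),
    ("Bob", "b"), ("Eric", "b"), ("Hin", "c"),
]


def contribution_network(commit_log):
    """Map each (developer, binary) edge to its number of commits."""
    weights = defaultdict(int)
    for developer, binary in commit_log:
        weights[(developer, binary)] += 1
    return dict(weights)


def basic_measures(network):
    """Compute nrAuthors and nrCommits per binary from the weighted edges."""
    authors = defaultdict(set)
    n_commits = defaultdict(int)
    for (developer, binary), weight in network.items():
        authors[binary].add(developer)
        n_commits[binary] += weight
    return {
        binary: {"nrAuthors": len(authors[binary]), "nrCommits": n_commits[binary]}
        for binary in authors
    }


network = contribution_network(commits)
measures = basic_measures(network)
for binary, values in sorted(measures.items()):
    print(binary, values)
```

The per-binary measures can then be joined with bug counts to form the input table for the regression analysis.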
Contribution network


                                 Windows binary (*.dll)
                                 Developer




Which binary is failure-prone?
Measuring fragmentation

  Freeman degree
  Closeness
  Bonacich’s power
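
Two of these measures are simple enough to sketch directly. The example below computes the Freeman degree and the closeness of a node in a small, unweighted, connected network; the graph is hypothetical, edge weights are ignored, and Bonacich's power is not covered here.

```python
from collections import deque

# Hypothetical undirected network: developers Alice, Bob, Eric and binaries a, b.
graph = {
    "Alice": ["a"],
    "a": ["Alice", "Bob"],
    "Bob": ["a", "b"],
    "b": ["Bob", "Eric"],
    "Eric": ["b"],
}


def degree(network, node):
    """Freeman degree: number of direct neighbors."""
    return len(network[node])


def closeness(network, node):
    """Closeness: (reachable nodes) / (sum of shortest-path distances)."""
    distance = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search yields shortest paths
        current = queue.popleft()
        for neighbor in network[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


for binary in ("a", "b"):
    print(binary, degree(graph, binary), round(closeness(graph, binary), 3))
```

A binary that is close to many developers and other binaries in the network gets a high closeness value.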
Research hypotheses

H1   Binaries with fragmented contributions
     are failure-prone

H2   Fragmentation correlates positively with
     the number of post-release failures

H3   Advanced fragmentation measures
     improve failure estimation
Correlation analysis

Spearman rank correlation

            nrCommits nrAuthors   Power   dPower   Closeness   Reach   Betweenness

 Failures    0.700      0.699     0.692   0.740     0.747      0.746     0.503
nrCommits               0.704     0.996   0.773     0.748      0.732     0.466
nrAuthors                         0.683   0.981     0.914      0.944     0.830
 Power                                    0.756     0.732      0.714     0.439
 dPower                                             0.943      0.964     0.772
Closeness                                                      0.990     0.738
  Reach                                                                  0.773

All correlations are significant at the 0.01 level (2-tailed)
H1: Predicting failure-prone binaries

Binary logistic regression of 50 random splits
  4 principal components from 7 centrality measures

[Figure: three plots showing Precision, Recall, and AUC across the random splits; y-axis 0.50 to 1.00]
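
The evaluation scheme of repeated random splits with precision and recall can be sketched as follows. The split fraction of 0.5, the helper names, and the sample data are assumptions of this example; the slide does not state the split ratio, and the logistic regression step is not shown.

```python
import random


def random_split(items, train_fraction, rng):
    """Shuffle a copy of items and cut it into a training and a test part."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def precision_recall(actual, predicted):
    """Precision and recall for binary labels (1 = failure-prone)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical set of binaries, split 50 times with a fixed seed.
binaries = list(range(10))
rng = random.Random(42)
splits = [random_split(binaries, 0.5, rng) for _ in range(50)]
print(len(splits), "splits; first test set:", sorted(splits[0][1]))

# Hypothetical predictions for one test set.
print(precision_recall([1, 1, 0, 0], [1, 0, 1, 0]))
```

Repeating the split many times shows how stable precision and recall are across different training and test sets.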
H2: Predicting the number of failures

Linear regression of 50 random splits
  #Failures = b0 + b1*nCloseness + b2*nrAuthors + b3*nrCommits

[Figure: three plots showing R-Square, Pearson, and Spearman correlation across the random splits; y-axis 0.50 to 1.00]

All correlations are significant at the 0.01 level (2-tailed)
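
A linear model of this form can be fitted with ordinary least squares. The sketch below solves the normal equations in pure Python; the data rows and the "true" coefficients used to generate them are invented for illustration, and the helper names are assumptions of this example rather than the study's tooling.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(rows, targets):
    """Least-squares fit; returns [b0, b1, ...] with b0 as the intercept."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows: (nCloseness, nrAuthors, nrCommits) per binary.
rows = [(0.1, 1, 5), (0.9, 1, 5), (0.1, 9, 5), (0.1, 1, 50), (0.5, 5, 20)]
# Hypothetical coefficients used only to generate consistent sample targets.
true_b = [1.0, 10.0, 0.5, 0.1]
failures = [true_b[0] + true_b[1] * c + true_b[2] * a + true_b[3] * n for c, a, n in rows]

coefficients = fit_linear_model(rows, failures)
print([round(b, 3) for b in coefficients])
```

Since the sample targets were generated without noise, the fit recovers the generating coefficients; with real data the residuals would determine R-Square.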
H3: Basic vs. advanced measures

[Figure: R-Square and Spearman correlation across the random splits for two
 models, one with nrAuthors and nrCommits, and one with nCloseness, nrAuthors,
 and nrCommits; y-axis 0.30 to 1.00]
Summary of results

Centrality measures can predict more than 83% of failure-prone
Vista binaries

Closeness, nrAuthors, and nrCommits can predict the number
of post-release failures

Closeness or Reach can improve prediction of the number of
post-release failures by 32%

More information
  Can Developer-Module Networks Predict Failures?, FSE 2008




What can we learn from that?

[Figure: contribution network with weighted edges]

Increase testing effort for central binaries? - yes

Re-factor central binaries? - maybe

Re-organize contributions? - maybe
What is next?

Analysis of the contributions of a developer
  Who is working on which parts of the system?
  What exactly is the contribution of a developer?
  Who is introducing bugs/smells and how can we avoid them?


Globally distributed software engineering
  What are the contributions of teams, which smells do they introduce, and how can we avoid them?
  Can we empirically prove Conway’s Law?

Expert recommendation
  Whom to ask for advice on a piece of code?



Ideas for software upgrade research

1. Mining software repositories to identify the upgrade-critical
components
  What are the characteristics of such components?
    Product and process measures
  What are the characteristics of the target environments?
    Hardware, operating system, configuration
  Train a model with these characteristics and reported bugs




Further ideas for research

Who is upgrading which applications when?
  Study upgrade behavior of users?


What is the environment of the users when they upgrade?
  Where did it work, where did it fail?
  Collect crash reports for software upgrades?


Upgrades in distributed applications?
  Finding the optimal time when to upgrade which component?




Conclusions

[Figures: Team Core scatter plot of #SCC vs. #Bugs; contribution network with weighted edges]

Questions?
Martin Pinzger
m.pinzger@tudelft.nl

HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 

Keynote HotSWUp 2012

  • 1. Will my system run (correctly) after the upgrade? Martin Pinzger, Assistant Professor, Delft University of Technology
  • 2. Martin’s upgrades: Assistant Professor, PhD, Postdoc, Pfunds
  • 3. My Experience with Software Upgrades
  • 6. Bugs on upgrades get reported
  • 7. Hmm, wait a minute: Can’t we learn “something” from that data?
  • 8. Software repository mining for preventing upgrade failures. Martin Pinzger, Assistant Professor, Delft University of Technology
  • 9. Goal of software repository mining: Making the information stored in software repositories available to software developers: quality analysis and defect prediction, recommender systems, ...
  • 11. Examples from my mining research: Predicting failure-prone source files using changes (MSR 2011); The relationship between developer contributions and failures (FSE 2008). There are many more studies: MSR 2012 http://2012.msrconf.org/; A survey and taxonomy of approaches for mining software repositories in the context of software evolution, Kagdi et al. 2007
  • 12. Using Fine-Grained Source Code Changes for Bug Prediction. Joint work with Emanuel Giger, Harald Gall, University of Zurich
  • 13. Bug prediction. Goal: Train models to predict the bug-prone source files of the next release. How: Using product measures, process measures, organizational measures with machine learning techniques. Many existing studies on building prediction models: Moser et al., Nagappan et al., Zimmermann et al., Hassan et al., etc. Process measures performed particularly well.
  • 14. Classical change measures: Number of file revisions; Code Churn, aka lines added/deleted/changed. Research question of this study: Can we further improve these models?
  • 15. Revisions are coarse grained: What did change in a revision?
  • 16. Code Churn can be imprecise: Extra changes not relevant for locating bugs
  • 17. Fine-Grained Source Code Changes (SCC). Account.java 1.5: IF "balance > 0" THEN MI "withDraw(amount);". Account.java 1.6: IF "balance > 0 && amount <= balance" THEN MI "withDraw(amount);" ELSE MI notify();. 3 SCC: 1x condition change, 1x else-part insert, 1x invocation statement insert
  • 18. Research hypotheses. H1: SCC is correlated with the number of bugs in source files. H2: SCC is a predictor for bug-prone source files (and outperforms LM). H3: SCC is a predictor for the number of bugs in source files (and outperforms LM).
  • 19. Data: 15 Eclipse plug-ins; >850’000 fine-grained source code changes (SCC); >10’000 files; >9’700’000 lines modified (LM); >9 years of development history; ... and a lot of bugs referenced in commit messages
  • 20. H1: SCC is correlated with #bugs. Table 5: Non-parametric Spearman rank correlation of LM and SCC with #bugs. * marks significant correlations at the 0.01 level; larger values are printed bold. Interpretation: +/-0.5 substantial, +/-0.7 strong.
        Eclipse Project   LM     SCC
        Compare           0.68   0.76
        jFace             0.74   0.71
        JDT Debug         0.62   0.8
        Resource          0.75   0.86
        Runtime           0.66   0.79
        Team Core         0.15   0.66
        CVS Core          0.60   0.79
        Debug Core        0.63   0.78
        jFace Text        0.75   0.74
        Update Core       0.43   0.62
        Debug UI          0.56   0.81
        JDT Debug UI      0.80   0.81
        Help              0.54   0.48
        JDT Core          0.70   0.74
        OSGI              0.70   0.77
        Median            0.66   0.77
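The rank correlation reported on this slide can be illustrated with a minimal sketch. It assumes the usual definition of Spearman's coefficient (Pearson correlation of average ranks); the function names and the per-file numbers are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-file data: number of SCC and number of bugs.
scc_per_file = [3, 10, 25, 40]
bugs_per_file = [0, 1, 4, 9]
rho = spearman(scc_per_file, bugs_per_file)
```

Because only ranks matter, any monotonically increasing relation between the two columns gives a coefficient of 1, regardless of how skewed the raw counts are.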
  • 21. Predicting bug-prone files. ... calculate and assign a probability to a file if it is bug-prone or not bug-prone. For each Eclipse project we binned files into bug-prone and not bug-prone using the median of the number of bugs per file: bugClass = not bug-prone if #bugs <= median; bug-prone if #bugs > median. When using the median as cut point, the labeling of a file is relative to how many bugs other files have in a project. There exist several ways of binning files beforehand. They mainly vary in that they result in different prior probabilities: for instance, Zimmermann et al. [40] and Bernstein et al. [4] labeled files as bug-prone if they had at least one bug. With heavily skewed distributions, this approach may lead to a high prior probability towards one class. Nagappan et al. [28] used a ...
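The median cut point described on this slide can be sketched in a few lines. The file names and bug counts below are made up for illustration; only the labeling rule (#bugs > median means bug-prone) comes from the slide.

```python
from statistics import median


def label_files(bugs_per_file):
    """Label each file relative to the project's median number of bugs per file."""
    cut = median(bugs_per_file.values())
    return {
        name: "bug-prone" if bugs > cut else "not bug-prone"
        for name, bugs in bugs_per_file.items()
    }


# Hypothetical project with four files.
example = {"A.java": 0, "B.java": 1, "C.java": 5, "D.java": 2}
labels = label_files(example)
```

Files that sit exactly on the median fall into the "not bug-prone" class, which follows the `<=` in the rule above.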
  • 22. H2: SCC can predict bug-prone files. AUC values of E 1 using logistic regression with LM and SCC as predictors for bug-prone and not bug-prone files. Larger values are printed in bold. SCC outperforms LM.
        Eclipse Project   AUC LM   AUC SCC
        Compare           0.84     0.85
        jFace             0.90     0.90
        JDT Debug         0.83     0.95
        Resource          0.87     0.93
        Runtime           0.83     0.91
        Team Core         0.62     0.87
        CVS Core          0.80     0.90
        Debug Core        0.86     0.94
        jFace Text        0.87     0.87
        Update Core       0.78     0.85
        Debug UI          0.85     0.93
        JDT Debug UI      0.90     0.91
        Help              0.75     0.70
        JDT Core          0.86     0.87
        OSGI              0.88     0.88
        Median            0.85     0.90
        Overall           0.85     0.89
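For readers unfamiliar with AUC, the following sketch computes it by pair counting: the share of (bug-prone, not bug-prone) file pairs in which the bug-prone file receives the higher predicted probability, with ties counted as half. This is the standard rank-based definition, not necessarily the tooling used in the study, and the scores are invented.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, is_prone in zip(scores, labels) if is_prone]
    negatives = [s for s, is_prone in zip(scores, labels) if not is_prone]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one file of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and true classes (True = bug-prone).
perfect = auc([0.9, 0.8, 0.3, 0.1], [True, True, False, False])
partial = auc([0.9, 0.3, 0.8, 0.1], [True, True, False, False])
```

A value of 0.5 corresponds to random ranking, so the gap between a model's AUC and 0.5 is what matters when comparing the LM and SCC columns.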
  • 23. Predicting the number of bugs. Nonlinear regression with asymptotic model: f(#Bugs) = a1 + b2 * e^(b3 * SCC). [Figure: #Bugs (0 to 60) plotted against #SCC (0 to 4000) for Team Core]
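To make the shape of the asymptotic model concrete, here is a small sketch that evaluates a1 + b2 * e^(b3 * SCC). The coefficient values are purely illustrative assumptions chosen so that the curve starts at 0 and levels off at 60; they are not the fitted values from the study.

```python
import math


def predicted_bugs(scc, a1, b2, b3):
    """Evaluate the asymptotic model a1 + b2 * exp(b3 * scc)."""
    return a1 + b2 * math.exp(b3 * scc)


# Illustrative (not fitted) coefficients: with b3 < 0 the exponential term
# vanishes for large SCC, so the prediction approaches the asymptote a1.
A1, B2, B3 = 60.0, -60.0, -0.001

curve = [predicted_bugs(scc, A1, B2, B3) for scc in (0, 1000, 2000, 4000)]
```

With a negative b3 and a negative b2, each additional change adds fewer predicted bugs than the previous one, which is the saturation effect an asymptotic model captures.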
  • 24. H3: SCC can predict the number of bugs. Table 8: Results of the nonlinear regression in terms of R2 and Spearman correlation using LM and SCC as predictors. SCC outperforms LM. [Figure: normalized residuals plot]
        Project           R2 LM   R2 SCC   Spearman LM   Spearman SCC
        Compare           0.84    0.88     0.68          0.76
        jFace             0.74    0.79     0.74          0.71
        JDT Debug         0.69    0.68     0.62          0.8
        Resource          0.81    0.85     0.75          0.86
        Runtime           0.69    0.72     0.66          0.79
        Team Core         0.26    0.53     0.15          0.66
        CVS Core          0.76    0.83     0.62          0.79
        Debug Core        0.88    0.92     0.63          0.78
        Jface Text        0.83    0.89     0.75          0.74
        Update Core       0.41    0.48     0.43          0.62
        Debug UI          0.7     0.79     0.56          0.81
        JDT Debug UI      0.82    0.82     0.8           0.81
        Help              0.66    0.67     0.54          0.84
        JDT Core          0.69    0.77     0.7           0.74
        OSGI              0.51    0.8      0.74          0.77
        Median            0.7     0.79     0.66          0.77
        Overall           0.65    0.72     0.62          0.74
  • 25. Summary of results: SCC performs significantly better than LM; advanced learners are not always better; change types do not yield extra discriminatory power; predicting the number of bugs is “possible”. More information: “Comparing Fine-Grained Source Code Changes And Code Churn For Bug Prediction”, MSR 2011
  • 26. What is next? Analysis of the effect(s) of changes: What is the effect on the design? What is the effect on the quality? Ease understanding of changes. Recommender techniques: Models that can provide feedback on the effects
  • 28. Can developer-module networks predict failures? Joint work with Nachi Nagappan, Brendan Murphy, Microsoft Research
  • 29. Research question: Are binaries with fragmented contributions from many developers more likely to have post-release failures? Should developers focus on one thing?
  • 30. Study with MS Vista project. Data: Released in January, 2007; > 4 years of development; several thousand developers; several thousand binaries (*.exe, *.dll); several millions of commits
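  • 31. Approach in a nutshell: Change logs yield a contribution network [Figure: developers Alice, Bob, Dan, Eric, Fu, Go, Hin connected to binaries a, b, c by weighted edges]. Combined with bugs, this gives one row per binary (Binary, #bugs, #centrality): a, 12, 0.9; b, 7, 0.5; c, 3, 0.2. Regression analysis. Validation with data splitting.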
  • 32. Contribution network: nodes are Windows binaries (*.dll) and developers. Which binary is failure-prone?
  • 33. Measuring fragmentation: Freeman degree, Closeness, Bonacich’s power
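Two of the measures named on this slide can be sketched on a toy contribution network. The sketch assumes an undirected, unweighted developer-binary graph, uses the raw neighbor count for Freeman degree, and uses the common closeness definition (number of other reachable nodes divided by the sum of shortest-path distances). The study may normalize or weight these differently, and the developer and binary names below are hypothetical.

```python
from collections import defaultdict, deque


def build_graph(contributions):
    """Build an undirected graph from (developer, binary) contribution pairs."""
    graph = defaultdict(set)
    for developer, binary in contributions:
        graph[developer].add(binary)
        graph[binary].add(developer)
    return graph


def degree(graph, node):
    """Freeman degree (unnormalized): number of direct neighbors."""
    return len(graph[node])


def closeness(graph, node):
    """Closeness: reachable nodes divided by the sum of their BFS distances."""
    distance = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Hypothetical network: three developers touch binary "a", two touch "b", one touches "c".
network = build_graph([
    ("Alice", "a"), ("Bob", "a"), ("Dan", "a"),
    ("Bob", "b"), ("Eric", "b"),
    ("Eric", "c"),
])
centrality = {b: (degree(network, b), closeness(network, b)) for b in ("a", "b", "c")}
```

In this toy network, binary "a" has the most contributors and binary "c", touched by a single developer at the edge of the graph, has the lowest closeness.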
  • 34. Research hypotheses. H1: Binaries with fragmented contributions are failure-prone. H2: Fragmentation correlates positively with the number of post-release failures. H3: Advanced fragmentation measures improve failure estimation.
  • 35. Correlation analysis: Spearman rank correlation. All correlations are significant at the 0.01 level (2-tailed).
                    nrCommits  nrAuthors  Power  dPower  Closeness  Reach  Betweenness
        Failures    0.700      0.699      0.692  0.740   0.747      0.746  0.503
        nrCommits              0.704      0.996  0.773   0.748      0.732  0.466
        nrAuthors                         0.683  0.981   0.914      0.944  0.830
        Power                                    0.756   0.732      0.714  0.439
        dPower                                           0.943      0.964  0.772
        Closeness                                                   0.990  0.738
        Reach                                                              0.773
  • 36. H1: Predicting failure-prone binaries. Binary logistic regression of 50 random splits; 4 principal components from 7 centrality measures. [Figure: three panels showing Precision, Recall, and AUC; y-axis 0.50 to 1.00]
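The evaluation scheme on this slide (repeated random splits, scored with precision and recall) can be sketched as follows. The split ratio of two thirds for training, the fixed seed, and all names are assumptions made for illustration; the slide only states that 50 random splits were used.

```python
import random


def random_splits(items, n_splits=50, train_fraction=2 / 3, seed=0):
    """Return n_splits (train, test) partitions of items, each shuffled independently."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cut = round(len(items) * train_fraction)
    splits = []
    for _ in range(n_splits):
        shuffled = list(items)
        rng.shuffle(shuffled)
        splits.append((shuffled[:cut], shuffled[cut:]))
    return splits


def precision_recall(predicted, actual):
    """Precision and recall for boolean predictions (True = failure-prone)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical binaries and one hypothetical set of predictions.
binaries = [f"binary_{i}.dll" for i in range(9)]
splits = random_splits(binaries)
scores = precision_recall(
    [True, True, False, False, True],
    [True, False, False, True, True],
)
```

In a full evaluation, a model would be trained on each `train` part and scored on the matching `test` part, giving one precision and one recall value per split.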
  • 37. H2: Predicting the number of failures. Linear regression of 50 random splits: #Failures = b0 + b1*nCloseness + b2*nrAuthors + b3*nrCommits. [Figure: three panels showing R-Square, Pearson, and Spearman; y-axis 0.50 to 1.00] All correlations are significant at the 0.01 level (2-tailed).
  • 38. H3: Basic vs. advanced measures. Model with nrAuthors, nrCommits vs. model with nCloseness, nrAuthors, nrCommits. [Figure: R-Square and Spearman panels for each model; y-axis 0.30 to 1.00]
  • 39. Summary of results: Centrality measures can predict more than 83% of failure-prone Vista binaries; Closeness, nrAuthors, and nrCommits can predict the number of post-release failures; Closeness or Reach can improve prediction of the number of post-release failures by 32%. More information: Can Developer-Module Networks Predict Failures?, FSE 2008
  • 40. What can we learn from that? [Figure: contribution network] Increase testing effort for central binaries? - yes. Re-factor central binaries? - maybe. Re-organize contributions? - maybe.
  • 41. What is next? Analysis of the contributions of a developer: Who is working on which parts of the system? What exactly is the contribution of a developer? Who is introducing bugs/smells and how can we avoid it? Global distributed software engineering: What are the contributions of teams, smells and how to avoid it? Can we empirically prove Conway’s Law? Expert recommendation: Whom to ask for advice on a piece of code?
  • 42. Ideas for software upgrade research. 1. Mining software repositories to identify the upgrade-critical components. What are the characteristics of such components? Product and process measures. What are the characteristics of the target environments? Hardware, operating system, configuration. Train a model with these characteristics and reported bugs.
  • 43. Further ideas for research. Who is upgrading which applications when? Study upgrade behavior of users? What is the environment of the users when they upgrade? Where did it work, where did it fail? Collect crash reports for software upgrades? Upgrades in distributed applications? Finding the optimal time when to upgrade which component?
  • 44. Conclusions. [Figures: #Bugs vs. #SCC for Team Core; contribution network] Questions? Martin Pinzger, m.pinzger@tudelft.nl