
Bug Prediction and Analysis

Published in: Education, Technology, Entertainment & Humor


  1. 1. Bug Prediction & Analysis Marco D’Ambros
  2. 2. As users, we are used to bugs...
  3. 3. ... and also as developers
  4. 4. But the perception in reverse engineering is different
  5. 5. But the perception in reverse engineering is different There are thousands of bugs
  6. 6. Prediction
  7. 7. Theory: prove correlations with software metrics. Practice: focus resources on bug-prone components; rank components according to their bug-proneness.
  8. 8. Bug prediction on release x. Classification: Class A will/won't have bugs. Ranking: Class A will have more bugs than class B.
  9. 9. Bug prediction on release x. Classification: Class A will/won't have bugs. Ranking: Class A will have more bugs than class B. Correct?
  10. 10. Release x Bug prediction
  11. 11. Release x: bug prediction produces a list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database produces a list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. [Figure: per-class metric tables with columns mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc]
  12. 12. Release x: bug prediction produces a list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database produces a list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. Both releases are checked out from the Svn / Cvs repository. [Figure: per-class metric tables with columns mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc]
  13. 13. Release x: bug prediction produces a list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database produces a list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. Both releases are checked out from the Svn / Cvs repository. [Figure: per-class metric tables with columns mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc]
  14. 14. System release: check out from the Svn / Cvs repository, then parsing into FAMIX classes and attributes. Versioning system logs: parsing of the commit comments. Bugzilla database: query and parsing of the bug reports. Inferred link: a bug reference in the commit comment links a bug to a class / file.
  15. 15. Classification: precision & recall. Ranking: Spearman correlation coefficient.
  16. 16. Classification: precision & recall. Ranking: Spearman correlation coefficient. [Figure: the set of buggy classes and the set of classes predicted as buggy]
  17. 17. Classification: precision & recall. Ranking: Spearman correlation coefficient. [Figure: buggy classes and classes predicted as buggy overlap; FN = buggy but not predicted, TP = both, FP = predicted but not buggy]
  18. 18. Classification: precision & recall. Precision: how small FP is. Recall: how small FN is. Ranking: Spearman correlation coefficient. [Figure: buggy classes and classes predicted as buggy, with FN, TP, FP]
  19. 19. Classification: precision & recall. Precision: how small FP is. Recall: how small FN is. Ranking: Spearman correlation coefficient, comparing the predicted order of classes (Class D, Class A, Class E, ...) with the observed order (Class E, Class A, Class D, ...).
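The two evaluation measures on slides 15 to 19 can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the talk: the function names are hypothetical, classes are represented by plain strings, and the Spearman formula used here assumes both rankings contain the same items with no ties.

```python
def precision_recall(buggy, predicted):
    """Return (precision, recall) for two sets of class names."""
    tp = len(buggy & predicted)   # buggy and predicted as buggy
    fp = len(predicted - buggy)   # predicted as buggy, but not buggy
    fn = len(buggy - predicted)   # buggy, but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def spearman(predicted_ranking, observed_ranking):
    """Spearman coefficient of two rankings of the same items (no ties, n >= 2)."""
    n = len(predicted_ranking)
    observed_pos = {name: pos for pos, name in enumerate(observed_ranking)}
    d2 = sum((pos - observed_pos[name]) ** 2
             for pos, name in enumerate(predicted_ranking))
    return 1 - 6 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    # Hypothetical class names, chosen only for the example.
    print(precision_recall({"A", "B", "C"}, {"B", "C", "D"}))
    # Predicted order D, A, E against observed order E, A, D (as on slide 19).
    print(spearman(["D", "A", "E"], ["E", "A", "D"]))  # -1.0
```

A small FP count pushes precision towards 1 and a small FN count pushes recall towards 1. The ranking example is a full reversal, so the coefficient is -1.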
  20. 20. Approaches are based on: History Metrics
  21. 21. Predicting Defects for Eclipse Thomas Zimmermann Rahul Premraj Andreas Zeller  Saarland University
  22. 22. Experimental settings. Release 2.0: 6740 files, 376 packages. Release 2.1: 7900 files, 433 packages. Release 3.0: 6614 files, 429 packages. Pre-release and post-release defects: 6 months before/after release.
  23. 23. Classification of classes using logistic regression models: max precision 0.68, max recall 0.38. [Figure: buggy classes and classes predicted as buggy, with FN, TP, FP]
  24. 24. Ranking classes (values on an axis from 0 to 1.00): McCabe complexity 0.401, Method LOC 0.405, Total LOC 0.42, Linear regression model 0.416.
  25. 25. Ranking classes (values on an axis from 0 to 1.00): McCabe complexity 0.401, Method LOC 0.405, Total LOC 0.42, Linear regression model 0.416, Pre-release defects 0.907.
  26. 26. Conclusion: Past defects are the predictor for future defects.
  27. 27. Conclusion: Software metrics correlate with defects but are not usable in practice. Past defects are the predictor for future defects.
  28. 28. Mining metrics to predict component failures Nachiappan Nagappan Thomas Ball Microsoft Research Andreas Zeller  Saarland University
  29. 29. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC), DirectX (306 KLOC), Process messaging component (147 KLOC), NetMeeting (109 KLOC), IIS Core (37 KLOC). Granularity level: module.
  30. 30. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC), DirectX (306 KLOC), Process messaging component (147 KLOC), NetMeeting (109 KLOC), IIS Core (37 KLOC). Granularity level: module (a binary file within Windows).
  31. 31. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC), DirectX (306 KLOC), Process messaging component (147 KLOC), NetMeeting (109 KLOC), IIS Core (37 KLOC). Granularity level: module (a binary file within Windows), i.e. a set of classes.
  32. 32. Q1 Do complexity metrics correlate with defects?
  33. 33. Q1 Do complexity metrics correlate with defects? [Chart: maximum correlation and percentage of correlated metrics for projects A to E; axis 0 to 1.00]
  34. 34. Q2 Is there a unique set of metrics that predicts defects in all projects?
  35. 35. Q3 Can we combine metrics to predict defects?
  36. 36. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics.
  37. 37. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics, principal component analysis.
  38. 38. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics, principal component analysis, linear/logistic regression model.
  39. 39. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics, principal component analysis, linear/logistic regression model. [Chart: Spearman/Pearson correlation and percentage of splits which correlate for projects A to E; axis 0 to 1.00]
  40. 40. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics, principal component analysis, linear/logistic regression model. [Chart: Spearman/Pearson correlation and percentage of splits which correlate for projects A to E; axis 0 to 1.00; annotation: too few samples]
  41. 41. Q4 Are predictors obtained from one project applicable to other projects?
  42. 42. Conclusion Metrics can be used to predict defects
  43. 43. Conclusion Metrics can be used to predict defects but
  44. 44. Conclusion Metrics can be used to predict defects but they must be validated on the history
  45. 45. Improving Defect Prediction Using Temporal Features and Non Linear Models Abraham Bernstein Jayalath Ekanayake Martin Pinzger University of Zurich
  46. 46. Experimental settings. Plugin (years, files): updateui (7, 757), updatecore (7, 459), search (6.5, 540), pdeui (6.5, 1621), pdebuild (6, 198), compare (6.5, 315). Non-linear models based on 21 historical metrics + LOC.
  47. 47. Classification of files using decision tree learners. All files: A. Correctly classified files: CC. Accuracy = Size(CC) / Size(A).
  48. 48. Classification of files using decision tree learners. All files: A. Correctly classified files: CC. Accuracy = Size(CC) / Size(A). Best predictor (7 metrics): accuracy 99.16%.
  49. 49. Ranking of files using the m5 tree regression algorithm. Spearman correlation: predictor based on 7 metrics 0.966; Zimmermann’s pre-release defects 0.907.
  50. 50. Conclusion Defect prediction can be improved with: Historical information Non-linear function
  51. 51. Predicting Faults Using the Complexity of Code Changes Ahmed E. Hassan Queen’s University
  52. 52. Complexity = Entropy. The more changes are distributed, the higher the entropy. The intuition is that one change affecting one file only is simpler than one affecting many different files, as the developer who has to perform the change has to keep track of all of them. Hassan proposed to use Shannon entropy, defined as $H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k$ (1), where $p_k$ is the probability that file $k$ changes during the considered time interval. Figure 4 shows an example with three files (File A, File B, File C) and three time intervals (t1, t2, t3, 2 weeks each).
  53. 53. Worked example: $H_n(P) =$
  54. 54. Worked example, 4 changes in the interval: $H_n(P) =$
  55. 55. Worked example, File A has 2 of the 4 changes: $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4}$
  56. 56. Worked example, File B has 1 of the 4 changes: $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4}$
  57. 57. Worked example, File C has 1 of the 4 changes: $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4} - \frac{1}{4} \log_2 \frac{1}{4}$
  58. 58. Worked example: $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4} - \frac{1}{4} \log_2 \frac{1}{4} = 0.5 + 0.5 + 0.5 = 1.5$
  59. 59. Complexity = Entropy. $H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k$ (1). [Figure: File A, File B, File C over the intervals t1, t2, t3 (2 weeks each); annotations: H = 1, H > 1?]
  60. 60. Complexity = Entropy. $H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k$ (1). [Figure: File A, File B, File C over the intervals t1, t2, t3 (2 weeks each); annotations: H = 1, H > 1?]
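The entropy formula (1) can be checked numerically with a short Python sketch. The function name and the list-of-file-names input are assumptions made for this illustration; $p_k$ is estimated as the share of the interval's changes that touch file $k$, as in the worked example with probabilities 2/4, 1/4 and 1/4.

```python
import math
from collections import Counter


def change_entropy(changed_files):
    """Shannon entropy (in bits) of the changes in one time interval.

    changed_files lists the file touched by each change, e.g. ["A", "A", "B"].
    """
    counts = Counter(changed_files)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Four changes: two to File A, one to File B, one to File C.
    print(change_entropy(["A", "A", "B", "C"]))  # 1.5
    # All changes in a single file: no scattering, entropy 0.
    print(change_entropy(["A", "A", "A"]))
```

Each of the three terms contributes 0.5 bits for the 2/4, 1/4, 1/4 distribution, so the sum is 1.5, while an interval in which only one file changes has entropy 0.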
  61. 61. To use the entropy of code changes as a bug predictor, Hassan defined the History of Complexity Metric (HCM) of a file $j$ as $HCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} HCPF_i(j)$ (3), where $\{a,..,b\}$ is a set of evolution periods and $HCPF_i(j)$ is defined as $HCPF_i(j) = c_{ij} \cdot H_i$ if $j \in F_i$, and $0$ otherwise (4), where $i$ is a period with entropy $H_i$, $F_i$ is the set of files modified in period $i$, and $j$ is a file belonging to $F_i$. According to the definition of $c_{ij}$, there are three types of HCM. (1) $c_{ij} = 1$: every file modified in the considered period $i$ gets the entropy of the system in the considered time interval; this approach defines HCM. (2) $c_{ij} = p_j$: each modified file gets the entropy of the system weighted with the probability of the file being modified (WHCM).
  62. 62. HCM with decay factors. In EDHCM (Exponentially Decayed HCM), entropies for earlier periods of time, i.e., earlier modifications, have their contribution reduced exponentially over time, modelling an exponential decay model. EDHCM was introduced by Hassan. Similarly, LDHCM (Linearly Decayed) and LGDHCM (LoGarithmically Decayed) have their contributions reduced over time in a linear and logarithmic fashion, respectively. Both are novel. The definitions of the variants follow: $EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 (|\{a,..,b\}| - i)}}$, $LDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_2 (|\{a,..,b\}| + 1 - i)}$, $LGDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_3 \ln(|\{a,..,b\}| + 1.01 - i)}$, where $\phi_1$, $\phi_2$ and $\phi_3$ are the decay factors.
  63. 63. HCM with decay factors. Highlighted: the exponentially decayed variant, $EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 (|\{a,..,b\}| - i)}}$.
  64. 64. HCM with decay factors. Highlighted: the exponentially decayed variant and its exponential factor $e^{\phi_1 (|\{a,..,b\}| - i)}$.
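The HCM definitions on slides 61 to 64 translate into a few short functions. The sketch below is an illustration under stated assumptions, not Hassan's or the presenter's implementation: periods are passed as a list of `(entropy, modified_files)` pairs, the period index $i$ runs from 1 to $n = |\{a,..,b\}|$, and only the case $c_{ij} = 1$ is covered. All function and parameter names are hypothetical.

```python
import math


def hcpf(entropy, modified_files, file_name):
    """HCPF_i(j) with c_ij = 1: the period entropy if the file was modified, else 0."""
    return entropy if file_name in modified_files else 0.0


def hcm(periods, file_name):
    """Plain HCM: sum of HCPF over all periods."""
    return sum(hcpf(h, files, file_name) for h, files in periods)


def edhcm(periods, file_name, phi1):
    """Exponentially decayed HCM; assumes periods are indexed 1..n, oldest first."""
    n = len(periods)
    return sum(hcpf(h, files, file_name) / math.exp(phi1 * (n - i))
               for i, (h, files) in enumerate(periods, start=1))


def ldhcm(periods, file_name, phi2):
    """Linearly decayed HCM."""
    n = len(periods)
    return sum(hcpf(h, files, file_name) / (phi2 * (n + 1 - i))
               for i, (h, files) in enumerate(periods, start=1))


def lgdhcm(periods, file_name, phi3):
    """Logarithmically decayed HCM."""
    n = len(periods)
    return sum(hcpf(h, files, file_name) / (phi3 * math.log(n + 1.01 - i))
               for i, (h, files) in enumerate(periods, start=1))


if __name__ == "__main__":
    # Two hypothetical periods: entropy 1.5 (A, B, C modified), then entropy 1.0 (A, B).
    periods = [(1.5, {"A", "B", "C"}), (1.0, {"A", "B"})]
    print(hcm(periods, "A"))          # 2.5
    print(ldhcm(periods, "A", 1.0))   # 1.75
    print(edhcm(periods, "C", 1.0))
```

In all three decayed variants the most recent period has the smallest denominator, so recent modifications contribute more than old ones.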
  65. 65. Experimental settings. System (start date, #subsystems): NetBSD (March 1993, 235), FreeBSD (June 1993, 152), OpenBSD (Oct 1995, 265), Postgres (July 1996, 280), KDE (April 1997, 108), KOffice (April 1998, 158). Predictors at subsystem level: entropy metrics, number of past modifications, number of past defects.
  66. 66. Models fitting in terms of $R^2$. [Chart: past defects, past changes, HCM, WHCM, EDHCM for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice; axis 0 to 0.6]
  67. 67. Prediction error: number of past changes vs entropy. [Chart: #Changes - WHCM (%) and #Changes - EDHCM (%) for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice; axis 0 to 37.5]
  68. 68. Prediction error: number of past defects vs entropy. [Chart: #Defects - WHCM (%) and #Defects - EDHCM (%) for NetBSD, FreeBSD, OpenBSD, Postgres, KDE, KOffice; axis -20.0 to 40.0]
  69. 69. Conclusion: Models based on the entropy of changes are better defect predictors than the number of past changes or defects.
  70. 70. Conclusion: Models based on the entropy of changes are better defect predictors than the number of past changes or defects. A complex code change process negatively affects its product, the software system.
  71. 71. Epilogue
  72. 72. Epilogue Defect prediction research has been active for several years. A large number of scientific papers have been published.
  73. 73. Epilogue We can predict defects, but the results still have limited practical usability.
  74. 74. Epilogue Predicting bugs is very difficult because developing code is a human activity
  75. 75. Epilogue A human activity influenced by too many factors How complex was the piece of code? How tested? How experienced was the developer?
  76. 76. Epilogue A human activity influenced by too many factors How complex was the piece of code? How tested? How experienced was the developer? How tired was the developer? How integrated was the developer in the team? Did he like his job?
  77. 77. Epilogue A human activity influenced by too many factors. FOCUS: How complex was the piece of code? How tested? How experienced was the developer? How tired was the developer? How integrated was the developer in the team? Did he like his job?
  78. 78. Epilogue A human activity influenced by too many factors. FOCUS: How complex was the piece of code? How tested? How experienced was the developer? No data yet: How tired was the developer? How integrated was the developer in the team? Did he like his job?
  79. 79. Analysis
  80. 80. Detect the critical bugs (properties of bugs) and the critical components (number of bugs).
  81. 81. Detect the critical bugs (properties of bugs) and the critical components (number of bugs).
  82. 82. Detect the critical components (number of bugs) and the critical bugs (properties of bugs).
  83. 83. bugzero, bugzilla, census, customerfirst, defect-agent, extraview-bug-tracker, fast-bugtrack, fogbugz, gnats, ibm-rational-clearquest, ictracker, issue-organizer, issuenet-intercept, issueview, jira, legendsoft-spots, mantis, new-fire, omnitracker, pointinsight, pr-tracker, problemtracker, quickbugs, radar, razor, rmtrack-bug-tracking
  84. 84. 4 facts about bugs
  85. 85. Bugs are differently harmful Blocker Critical Major Normal Minor Trivial Enhancement
  86. 86. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement. Bugzilla is used to report bugs and change requests.
  87. 87. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement. Bugzilla is used to report bugs and change requests.
  88. 88. Bugs are graphs
  89. 89. Bugs evolve
  90. 90. An ideal bug life cycle Unconfirmed
  91. 91. An ideal bug life cycle Unconfirmed Verified New Resolved Closed Assigned
  92. 92. A bit less ideal Unconfirmed Verified New Resolved Closed Assigned
  93. 93. A bit less ideal Unconfirmed Verified New Resolved Closed Assigned Reopened
  94. 94. The reality Unconfirmed Verified New Resolved Closed Assigned Reopened
  95. 95. The reality Unconfirmed Verified New Resolved Closed Assigned Reopened
  96. 96. All bug properties can change over time. Bug: Problem (id, description, product, component), Criticality (severity, priority), Involved people (assignedTo, reporter, qa), State (status, resolution), ...
  97. 97. All bug properties can change over time. Bug: Problem (id, description, product, component), Criticality (severity, priority), Involved people (assignedTo, reporter, qa), State (status, resolution), ... [Figure: two versions of the same bug linked by an Activity on the AssignedTo field; names shown: steve, mike, john]
  98. 98. All bug properties can change over time. Bug: Problem (id, description, product, component), Criticality (severity, priority), Involved people (assignedTo, reporter, qa), State (status, resolution), ... [Figure: two versions of the same bug linked by an Activity on the AssignedTo field; names shown: steve, mike, john] Bug history: the sequence of versions of the bug.
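The idea of a bug history as a sequence of bug versions connected by activities can be sketched with immutable records. This is an assumed data model for illustration only, not Bugzilla's schema or the authors' tool: the `Bug` and `Activity` classes, their field names, and the sample values are hypothetical, loosely following the property names on the slide.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Bug:
    id: int
    severity: str
    priority: str
    assigned_to: str
    reporter: str
    status: str
    resolution: str = ""


@dataclass(frozen=True)
class Activity:
    who: str        # person performing the change
    field: str      # name of the Bug attribute that changes
    new_value: str


def bug_history(initial, activities):
    """Replay activities in order and return every version of the bug."""
    versions = [initial]
    for act in activities:
        # replace() copies the previous version with one field changed.
        versions.append(replace(versions[-1], **{act.field: act.new_value}))
    return versions


if __name__ == "__main__":
    first = Bug(1, "normal", "P3", "steve", "john", "NEW")
    history = bug_history(first, [
        Activity("john", "assigned_to", "mike"),
        Activity("mike", "status", "ASSIGNED"),
    ])
    for version in history:
        print(version.assigned_to, version.status)
```

Keeping every version, instead of overwriting fields in place, is what makes questions such as "how many times was the assignee changed?" answerable later.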
  99. 99. Are there many activities? How long do they live?
  100. 100. Are there many activities? How long do they live? Time period: Sep 1998 - Apr 2003. #Bugs: 255,302. #Activities: 2,706,201.
  101. 101. Number of activities. [Histogram: y axis 0% to 30%; bins 0, 1-3, 4-5, 6-10, 11-20, 21-30, > 30]
  102. 102. Lifetime (reported - last activity). [Histogram: y axis 0% to 40%; bins 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More]
  103. 103. Lifetime (reported - last activity). [Histogram: y axis 0% to 40%; bins 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More; annotation: > 50%]
  104. 104. Bugs have long and intense lives
  105. 105. 4 facts about bugs: they are differently harmful, they are graphs, they evolve, and they have long and intense lives.
  106. 106. There is a need to analyze bug repositories, treating bugs as evolving entities.
  107. 107. “A Bug’s Life” Visualizing a Bug Database Marco D’Ambros Michele Lanza Martin Pinzger
  108. 108. System radiography view “Where (in the system and in its history) are the open bugs located?”
  109. 109. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis, time on the x axis. [Figure: Product A with Component 1 and Component 2, Product B, along a time axis]
  110. 110. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis; (x, y) : (time interval, component); color: # open bugs. [Figure: Product A with Component 1 and Component 2, Product B, along a time axis]
  111. 111. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis; (x, y) : (time interval, component); color: # open bugs. [Figure: Product A with Component 1 and Component 2, Product B, along a time axis]
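The data behind the radiography view is a grid with one row per Product :: Component and one column per time interval, where each cell holds the number of open bugs that would be mapped to a color. The sketch below shows one way to compute such a grid; the function name, the tuple layout of a bug, numeric time stamps, and the rule "a bug counts as open in every interval its lifetime overlaps" are all assumptions, not the tool's actual logic.

```python
from collections import defaultdict


def open_bug_matrix(bugs, interval_starts, interval_length):
    """Count open bugs per (product, component) row and time-interval column.

    bugs: iterable of (product, component, opened, closed); closed is None
    while the bug is still open. Times are plain numbers (e.g. days).
    """
    matrix = defaultdict(lambda: [0] * len(interval_starts))
    for product, component, opened, closed in bugs:
        for x, start in enumerate(interval_starts):
            end = start + interval_length
            # Open in this interval if its lifetime overlaps [start, end).
            if opened < end and (closed is None or closed >= start):
                matrix[(product, component)][x] += 1
    return dict(matrix)


if __name__ == "__main__":
    # Hypothetical sample data; "Composer" is an invented component name.
    bugs = [
        ("Browser", "Networking", 0, 15),
        ("Browser", "Networking", 12, None),
        ("Mailnews", "Composer", 25, None),
    ]
    grid = open_bug_matrix(bugs, interval_starts=[0, 10, 20], interval_length=10)
    for row, counts in grid.items():
        print(row, counts)
```

For the sample data the Browser :: Networking row is `[1, 2, 1]` and the Mailnews :: Composer row is `[0, 0, 1]`; a plotting layer would then map each count to a color.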
  112. 112. Mozilla example [Sep ‘98 - Apr ‘03]
  113. 113. Mozilla example [Sep ‘98 - Apr ‘03] Browser
  114. 114. Mozilla example [Sep ‘98 - Apr ‘03] Browser, Mailnews
  115. 115. Mozilla example [Sep ‘98 - Apr ‘03] Browser, Mailnews
  116. 116. The Bug Watch View “How are bugs characterized with respect to their history?”
  117. 117. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001).
  118. 118. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers.
  119. 119. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status.
  120. 120. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  121. 121. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  122. 122. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  123. 123. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  124. 124. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  125. 125. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status, Activity. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  126. 126. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a time axis from the beginning (10/19/1999) to the end (10/16/2001); 3 layers: Status, Activity, Severity. Status (from, to): Assigned (10/19/99, 12/21/99), Resolved (12/21/99, 1/31/00), Reopened (1/31/00, 2/6/00), New (2/6/00, 6/5/00), ...
  127. 127. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]
  128. 128. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]. Reopened 4 times. The developer in charge of fixing it changed 6 times. Many people added in the CC.
  129. 129. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]
  130. 130. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]. One status but many activities (addition of CC).
  131. 131. Conclusion: Analyzing a bug database provides useful insights into a software system and helps in detecting the most harmful bugs.
  132. 132. Epilogue
  133. 133. Epilogue We are just touching the surface. The analysis of bug repositories is still a very open field.
