4. MY THESIS
Word cloud: additions, analysis, architecture, archives, aspects, bug, cached, calls,
changes, collaboration, complexities, component, concerns, cross-cutting, cvs, data,
defects, design, development, drawing, dynamine, eclipse, effort, evolves, failures,
fine-grained, fix, fix-inducing, graphs, hatari, history, locate, matching, method,
mining, predicting, program, programmers, report, repositories, revision, software,
support, system, taking, transactions, version, visualizing
5. Contributions of the thesis
(1) Fine-grained analysis of version archives.
    Project-specific usage patterns of methods (FSE 2005)
    Identification of cross-cutting changes (ASE 2006)
(2) Mining bug databases to predict defects.
    Dependencies predict defects (ISSRE 2007, ICSE 2008)
    Domino effect: depending on defect-prone binaries increases
    the chances of having defects (Software Evolution 2008).
6. Fine-grained analysis
public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);
}
public void dispose() {
    ...
    getSite().getPage().removePartListener(partListener);
}
7. Fine-grained analysis
public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);    // co-added
}
public void dispose() {
    ...
    getSite().getPage().removePartListener(partListener); // co-added
}
8. Fine-grained analysis
public void createPartControl(Composite parent) {
    ...
    // add listener for editor page activation
    getSite().getPage().addPartListener(partListener);    // co-added
}
public void dispose() {
    ...
    getSite().getPage().removePartListener(partListener); // co-added
}
Other labels on the slide: close, open, println, begin
Co-added items = patterns
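The idea that co-added calls form patterns can be sketched as a simple pair count over check-ins. This is only an illustrative sketch, not the thesis implementation: the class `CoAddedPairs`, its methods, and the `minSupport` threshold are hypothetical names, and each transaction is assumed to be already reduced to the set of method names whose calls were added in it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: counts how often two method calls were added in the same transaction.
final class CoAddedPairs {

    /** Each transaction is the set of method names whose calls were added in one check-in. */
    static Map<String, Integer> countPairs(List<Set<String>> transactions) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> added : transactions) {
            // Sort the names so that every unordered pair gets one canonical key.
            List<String> names = new ArrayList<>(new TreeSet<>(added));
            for (int i = 0; i < names.size(); i++) {
                for (int j = i + 1; j < names.size(); j++) {
                    counts.merge(names.get(i) + "," + names.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    /** Keeps only the pairs that were co-added in at least minSupport transactions. */
    static Map<String, Integer> frequentPairs(List<Set<String>> transactions, int minSupport) {
        Map<String, Integer> frequent = new HashMap<>();
        for (Map.Entry<String, Integer> entry : countPairs(transactions).entrySet()) {
            if (entry.getValue() >= minSupport) {
                frequent.put(entry.getKey(), entry.getValue());
            }
        }
        return frequent;
    }
}
```

With transactions that repeatedly add `addPartListener` together with `removePartListener`, that pair would survive the threshold while incidental pairs drop out.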
9. Fine-grained analysis
public static final native void _XFree(int address);
public static final void XFree(int /*long*/ address) {
    lock.lock();
    try {
        _XFree(address);
    } finally {
        lock.unlock();
    }
}
CHANGED IN 1284 LOCATIONS
Crosscutting changes = aspect candidates
13. Spend resources on the
components that need them most,
i.e., those most likely to fail.
14. Indicators of defects
Code complexity: Complex code is more prone to defects.
Code churn: Changes are likely to introduce new defects.
History: Code with past defects is more likely to have future defects.
Dependencies: Using compiler packages is more difficult than using packages for UI.
16. Hypotheses
Complexity of dependency graphs (subsystem level)
- correlates with the number of post-release defects (H1)
- can predict the number of post-release defects (H2)
Network measures on dependency graphs (binary level)
- correlate with the number of post-release defects (H3)
- can predict the number of post-release defects (H4)
- can indicate critical “escrow” binaries (H5)
18. Data collection
Release point for Windows Server 2003, followed by six months to collect defects.
Data collected: dependencies, network measures, complexity metrics, defects.
19. Centrality
Degree: Blue binary has dependencies to many other binaries.
Closeness: Blue binary is close to all other binaries (only two steps).
Betweenness: Blue binary connects the left with the right graph (bridge).
20. Centrality
• Degree centrality
- counts the number of dependencies
• Closeness centrality
- takes distance to all other binaries into account
- Closeness: How close are the other binaries?
- Reach: How many binaries can be reached (weighted)?
- Eigenvector: similar to PageRank
• Betweenness centrality
- counts the number of shortest paths through a binary
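A minimal sketch of two of these measures, assuming the dependency graph is treated as undirected and stored as an adjacency map from binary name to neighbouring binaries. The class `Centrality` and its method names are hypothetical, and closeness is computed here as the number of reachable binaries divided by the sum of their shortest-path distances, which is one common variant and not necessarily the exact definition used in the study. Betweenness is left out because it needs an all-pairs shortest-path pass.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical helper for degree and closeness centrality on a dependency graph.
final class Centrality {

    /** Degree centrality: number of direct dependencies of a binary. */
    static int degree(Map<String, Set<String>> graph, String binary) {
        return graph.getOrDefault(binary, Collections.<String>emptySet()).size();
    }

    /** Breadth-first search: shortest-path distance from start to every reachable binary. */
    static Map<String, Integer> distances(Map<String, Set<String>> graph, String start) {
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        dist.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String next : graph.getOrDefault(current, Collections.<String>emptySet())) {
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(current) + 1);
                    queue.add(next);
                }
            }
        }
        return dist;
    }

    /** Closeness (assumed variant): reachable binaries divided by the sum of their distances. */
    static double closeness(Map<String, Set<String>> graph, String binary) {
        int total = 0;
        int reached = 0;
        for (int d : distances(graph, binary).values()) {
            if (d > 0) {
                total += d;
                reached++;
            }
        }
        return total == 0 ? 0.0 : (double) reached / total;
    }
}
```

In a star-shaped graph the hub gets the highest degree and a closeness of 1.0, while each leaf scores lower on both.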
21. Complexity metrics
Group                    Metrics                                  Aggregation
Module metrics           # functions in B
for a binary B           # global variables in B
Per-function metrics     # executable lines in f()                Total, Max
for a function f()       # parameters in f()
                         # functions calling f()
                         # functions called by f()
                         McCabe’s cyclomatic complexity of f()
OO metrics               # methods in C                           Total, Max
for a class C            # subclasses of C
                         Depth of C in the inheritance tree
                         Coupling between classes
                         Cyclic coupling between classes
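The Total and Max aggregation lifts a per-function (or per-class) metric to the level of a binary. A minimal sketch, assuming the metric values for all functions of one binary are already available as an array; the class name `MetricAggregation` is hypothetical.

```java
// Hypothetical helper: aggregates a per-function metric to one value per binary.
final class MetricAggregation {

    /** Total: sum of the metric over all functions of the binary. */
    static int total(int[] valuesPerFunction) {
        int sum = 0;
        for (int value : valuesPerFunction) {
            sum += value;
        }
        return sum;
    }

    /** Max: largest value of the metric; 0 for a binary without functions (an assumption). */
    static int max(int[] valuesPerFunction) {
        int best = 0; // the metrics are counts, so 0 is a safe lower bound
        for (int value : valuesPerFunction) {
            best = Math.max(best, value);
        }
        return best;
    }
}
```

For example, cyclomatic complexities of 3, 10 and 2 for the functions of one binary would aggregate to a Total of 15 and a Max of 10.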
28. Classification
(logistic regression)
SNA increases the recall by 0.10 (at p=0.01)
while precision remains comparable.
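For reference, precision and recall of a defect classifier can be computed as in the sketch below. It assumes one boolean prediction and one boolean ground-truth label (defect-prone or not) per binary; the class `PrecisionRecall` is a hypothetical name and the code is independent of how the logistic regression model itself was built.

```java
// Hypothetical helper: precision and recall for a binary defect classification.
final class PrecisionRecall {

    /** Counts the binaries whose prediction equals p and whose actual label equals a. */
    private static int count(boolean[] predicted, boolean[] actual, boolean p, boolean a) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int n = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == p && actual[i] == a) {
                n++;
            }
        }
        return n;
    }

    /** Precision: share of binaries predicted as defect-prone that really are defect-prone. */
    static double precision(boolean[] predicted, boolean[] actual) {
        int tp = count(predicted, actual, true, true);
        int fp = count(predicted, actual, true, false);
        return tp + fp == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Recall: share of defect-prone binaries that the classifier found. */
    static double recall(boolean[] predicted, boolean[] actual) {
        int tp = count(predicted, actual, true, true);
        int fn = count(predicted, actual, false, true);
        return tp + fn == 0 ? 0.0 : (double) tp / (tp + fn);
    }
}
```

A gain in recall at comparable precision means that more of the defect-prone binaries are found without adding many false alarms.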
29. Ranking
(linear regression)
SNA+METRICS increases the correlation
by 0.10 (significant at p=0.01)
30. FUTURE WORK
Word cloud: additions, analysis, architecture, archives, aspects, bug, cached, calls,
changes, collaboration, complexities, component, concerns, cross-cutting, cvs, data,
defects, design, development, drawing, dynamine, eclipse, effort, erose, evolves,
factor, failures, fine-grained, fix, fix-inducing, fm, graphs, guide, hatari, history,
human, locate, matching, method, mining, networking, predicting, program, programmers,
quality, report, repositories, revision, social, software, support, system, taking,
transactions, version, visualizing