Of Bugs and Men
1. Of Bugs and Men
(and Plugins too)
Michel Wermelinger, Yijun Yu
The Open University, UK
Markus Strohmaier
Technical University Graz, Austria
2. Plugins
Working Conf. on Mining Softw. Repositories 2008
Int’l Conf. on Softw. Maintenance 2008
3. Motivation & Method
What is the validity, generality and usefulness
of design principles?
Study long-term evolution
Study architectural evolution
Study complex systems
Case study: Eclipse
modern component-based system (CBS) with reusable, extensible components
4. Eclipse
Static dependency: X depends on Y
Dynamic dependency:
X uses extension points provided by Y
Self-cycles possible
We analysed whole Eclipse SDK (JDT, PDE, etc)
5. Eclipse releases
Various types of releases
Major (e.g. 3.1) and maintenance releases (e.g. 3.1.1)
Milestones (3.2M1) and Release candidates (3.2RC1)
Maintenance of current major release in parallel with milestones
and release candidates of next one
We analysed
20 major and maintenance releases over 6 years (1.0 to 3.3.1.1)
27 milestone and release candidates over 2 years (3.1 to 3.3)
grouped in 2 sequences: 1.0 – 3.1 and 3.1 – 3.3.1.1
7. Some Research Questions
Is there continuous growth (Lehman’s 6th
law)?
Is there any pattern (e.g. superlinear growth)?
Does complexity increase (Lehman’s 2nd
law)?
Is there any effort to reduce it?
Does coupling decrease?
Does cohesion increase?
8. Modules
A simple structural model
Module = directed graph
Elements = internal or external
Arcs = internal or external relations
External elements and arcs show context
For Eclipse SDK module
elements = plugins or external components
arcs = static and/or dynamic dependencies
9. Module measures
Size = # internal elements
NIP = number of internal plugins
Complexity = # internal arcs
NISD/NIDD = number of internal static/dynamic
dependencies
Cohesion = complexity / size
Coupling = # external arcs
NESD (NEDD is always zero)
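The measures above can be sketched in a few lines. This is a minimal illustration, not the authors' tool: the function name `module_measures`, the representation of arcs as `(source, target)` pairs, and the toy plugin names are all assumptions made for the example.

```python
def module_measures(internal, arcs):
    """Compute size, complexity, cohesion and coupling of a module.

    internal: set of internal element names (e.g. plugins)
    arcs: set of (source, target) dependency pairs
    An arc is internal when both ends are internal elements (assumption).
    """
    internal_arcs = {(s, t) for s, t in arcs if s in internal and t in internal}
    external_arcs = set(arcs) - internal_arcs
    size = len(internal)
    complexity = len(internal_arcs)
    cohesion = complexity / size if size else 0.0
    coupling = len(external_arcs)
    return {"size": size, "complexity": complexity,
            "cohesion": cohesion, "coupling": coupling}


# Hypothetical toy module, not Eclipse data.
plugins = {"ui", "core", "debug"}
deps = {("ui", "core"), ("debug", "core"), ("ui", "ui"), ("core", "xerces")}
measures = module_measures(plugins, deps)
print(measures)
```

Here the self-cycle `("ui", "ui")` counts as an internal arc and the arc to the external element `xerces` counts towards coupling.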
10. Size Evolution (1)
Number of plugins kept, added, deleted w.r.t. previous release
Number kept since initial release → stable architectural core
Segmented growth
Overall 4- to 5-fold growth, but not superlinear
Many changes in 3.0; few deletions overall
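The kept/added/deleted counts and the stable core can be sketched with plain set operations. This assumes each release is given as a set of plugin names; the helper names and the toy releases are hypothetical.

```python
def release_diff(previous, current):
    """Plugins kept, added and deleted w.r.t. the previous release."""
    return {
        "kept": current & previous,
        "added": current - previous,
        "deleted": previous - current,
    }


def stable_core(releases):
    """Plugins present in every release since the initial one."""
    core = set(releases[0])
    for release in releases[1:]:
        core &= release
    return core


# Hypothetical toy history of three releases, not Eclipse data.
history = [
    {"core", "ui", "help"},
    {"core", "ui", "help", "search"},
    {"core", "ui", "search", "debug"},
]
diff = release_diff(history[1], history[2])
core = stable_core(history)
print(diff, core)
```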
11. Size Evolution (2)
Long equilibrium and short punctuation periods
Equilibrium: changes accommodated within current architecture
Punctuation: changes require architectural revisions
mostly in milestones
some in release candidates
hardly in maintenance
12. Architectural core
jdt.ui
jdt.launching
jdt.doc.isv jdt.doc.user pde.doc.user platform.doc.isv platform.doc.user help.ui pde.runtime ant.ui search compare pde.core debug.ui jdt.debug
help pde ui ant.core jdt.core debug.core
core.runtime swt core.resources
core with static and dynamic dependencies
self-cycles point to reuse of extension points
layered architecture
core is >40% of release 1.0 and ca. 10% of 3.3.1.1
13. Complexity Evolution
Charts show NISD (left) and NIDD (right)
Release 3.1 is major restructuring
Static dependencies decreased by 19%
Plugins increased by 57%
More deletions, i.e. effort to reduce complexity
14. Cohesion evolution (1)
Size (left) and complexity (right) grow in step
Two exceptions
Release 3.0 maintains size
Release 3.1 reduces complexity
15. Cohesion evolution (2)
Result: cohesion slightly decreases over time
Except for major increase during 3.0.* releases
Independently of static, dynamic, or both dependencies
Low cohesion: <3 (incoming or outgoing) dependencies per plugin
explicit effort to keep architecture loosely cohesive?
16. Coupling Evolution
Charts show NESD
Refactoring in 3.0:
All existing external dependencies removed via new internal
proxies
External component org.apache.xerces was removed
Overall, coupling is small compared to size and
complexity
17. Acyclic Dependency Principle
Dependency graph should be acyclic [Martin 96 and others]
decreases change propagation
eases release management and work allocation
Measured cycle length over joint dependency graph
Graph shows segmented growth of harmless self-cycles (length 1)
Single cycle with length > 1 was broken apart in release 3.0
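A simplified check for the two kinds of cycle can be sketched as follows. It does not measure cycle length as the study did; it only separates self-cycles from elements lying on some longer cycle. The function names and the toy arcs are assumptions.

```python
def reachable(graph, start):
    """Elements reachable from start via one or more arcs, ignoring self-cycles."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt != node and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def cycle_report(arcs):
    """Return (elements with a self-cycle, elements on a cycle of length > 1)."""
    arcs = set(arcs)
    graph = {}
    for source, target in arcs:
        graph.setdefault(source, set()).add(target)
    self_cycles = sorted(s for s, t in arcs if s == t)
    # An element is on a longer cycle if it can reach itself again.
    on_longer_cycle = sorted(n for n in graph if n in reachable(graph, n))
    return self_cycles, on_longer_cycle


# Hypothetical toy dependency graph, not Eclipse data.
toy_arcs = {("a", "a"), ("a", "b"), ("b", "c"), ("c", "b"), ("c", "d")}
self_cycles, on_longer_cycle = cycle_report(toy_arcs)
print(self_cycles, on_longer_cycle)
```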
18. Stable Dependency Principle
dependencies should be in direction of stability
[Martin 97]
changes propagate opposite to dependencies
if A depends on B, A can’t be harder to change than B
instability of element = fanout / (fanin + fanout)
irresponsible: fanin = 0, instability = 1, may change
independent: fanout = 0, instability = 0, no reason to
change
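The instability formula can be sketched directly from fanin and fanout counts. Ignoring self-cycles when counting is an assumption of this example, as are the function name and the toy arcs; elements with no arcs at all have no defined instability and are simply absent from the result.

```python
def instability(arcs):
    """Map each element to fanout / (fanin + fanout).

    arcs: iterable of (source, target) pairs, meaning source depends on target.
    Self-cycles are skipped (assumption made for this sketch).
    """
    fanin, fanout = {}, {}
    for source, target in set(arcs):
        if source == target:
            continue
        fanout[source] = fanout.get(source, 0) + 1
        fanin[target] = fanin.get(target, 0) + 1
    result = {}
    for element in set(fanin) | set(fanout):
        i, o = fanin.get(element, 0), fanout.get(element, 0)
        result[element] = o / (i + o)
    return result


# Hypothetical toy graph: A depends on B and C, B depends on C.
scores = instability([("A", "B"), ("A", "C"), ("B", "C")])
print(scores)
```

In the toy graph, A is irresponsible (fanin 0, instability 1) and C is independent (fanout 0, instability 0).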
19. SDP Evolution
Charts show number of SDP violations
Absolute (left) and relative (right)
static, dynamic and both dependencies
Numbers kept low, with ratio tending to decrease
1-5% violations for static dependencies, 9-17% for dynamic
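One plausible way to count SDP violations is sketched below. It assumes a dependency from X to Y violates the principle when X is strictly more stable (lower instability) than Y; the slides do not spell out the exact criterion, so this reading, the function names and the toy arcs are all assumptions.

```python
def instability(arcs):
    """fanout / (fanin + fanout) per element, skipping self-cycles."""
    fanin, fanout = {}, {}
    for source, target in arcs:
        if source == target:
            continue
        fanout[source] = fanout.get(source, 0) + 1
        fanin[target] = fanin.get(target, 0) + 1
    return {e: fanout.get(e, 0) / (fanin.get(e, 0) + fanout.get(e, 0))
            for e in set(fanin) | set(fanout)}


def sdp_violations(arcs):
    """Return (violating arcs, ratio of violations to non-self arcs)."""
    arcs = {(s, t) for s, t in arcs if s != t}
    scores = instability(arcs)
    # Assumed criterion: the depending element is more stable than its target.
    violations = sorted((s, t) for s, t in arcs if scores[s] < scores[t])
    ratio = len(violations) / len(arcs) if arcs else 0.0
    return violations, ratio


# Hypothetical toy graph, not Eclipse data.
toy_arcs = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("D", "F")]
violations, ratio = sdp_violations(toy_arcs)
print(violations, ratio)
```

In the toy graph, C (instability 1/3) depends on D (instability 2/3), which is the only violation among five arcs.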
20. Changeability measures
slight adaptation of [van Belle 04]
likelihood of changing an element
# of actual changes / max possible #
impact of an element’s changes
avg # of elements changed with it
acuteness = impact / likelihood
high for interfaces, low for method bodies
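The three measures can be sketched under some stated assumptions, since the slides give only their outline: the history is taken as one set of changed elements per release transition, the "max possible #" is taken as the number of transitions, and an element's impact is taken as the average number of other elements changed in the same transitions. The function name and the toy history are hypothetical.

```python
def changeability(history, element):
    """Return (likelihood, impact, acuteness) of an element.

    history: list of sets, each holding the elements changed in one transition.
    Returns None for impact and acuteness if the element never changed.
    """
    co_changes = [len(changed - {element}) for changed in history
                  if element in changed]
    likelihood = len(co_changes) / len(history)
    if not co_changes:
        return likelihood, None, None
    impact = sum(co_changes) / len(co_changes)
    acuteness = impact / likelihood
    return likelihood, impact, acuteness


# Hypothetical toy history of four release transitions.
toy_history = [{"a", "b", "c"}, {"a"}, {"b", "c"}, {"a", "b"}]
result_a = changeability(toy_history, "a")
result_c = changeability(toy_history, "c")
print(result_a, result_c)
```

In the toy history, `c` changes less often than `a` but always together with other elements, so its acuteness is higher.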
21. Changes and Stability (1)
changes and stability are related
responsible elements: high change impact
independent elements: low change likelihood
stable elements: high change acuteness
van Belle: correlational linkage
implicit, from co-change observation
takes change propagation closure into account
Martin: causal linkage
must be given explicitly
only looks at immediate neighbours
22. Changes and Stability (2)
measured fanin/fanout of the 69 plugins in release 2.0
measured impact/likelihood of same plugins over next 45
releases
normalised measures, ordered plugins by fanin and fanout
lower fanin ⇒ less responsible ⇒ lower impact: not quite so
lower fanout ⇒ less dependent ⇒ lower likelihood: somewhat
23. Changes and Stability (3)
measured instability when defined (52 plugins in 2.0)
All but one irresponsible and independent plugins remained so over time
higher instability ⇒ lower acuteness: mixed
some trend but many exceptions
likelihood vs independence is better than impact vs responsibility
static causal linkage can’t predict future correlational linkage
former only accounts for internal drives, latter includes external drives
24. Conclusions (1)
Successful evolution of Eclipse due to…?
systematic architectural change process
segmented growth of size and complexity
cohesion kept low; cycles removed
SDP violations and coupling reduced
significant stable layered architectural core
Some consistency between causal and
correlational changeability measures
25. Conclusions (2)
many design principles/guidelines proposed, but…
no empirical evidence of usefulness for maintenance
selected representative case study
large, complex, successful, component-based system
accurate architectural information + enough evolution history
generic and lightweight approach
no reverse engineering, no static code analysis
modules and changeability measures
flexible scripting tool manipulating text files with relational data
potential practical implications of findings
confirmed some laws and principles; observed some patterns
investigated static and historic changeability measures
26. Bugs and Men
New Ideas and Emerging Results track
of Int’l Conf. on Software Eng. 2009
27. Motivation
Software engineering is a socio-technical activity
Global and open source software development led to
increased interest in and relevance of social aspects
Need for representing socio-technical relations
Bipartite graphs of software artefacts and people
Ad-hoc arc semantics, depending on relation
Ad-hoc flat layout, often hard to read
Relevant relations lost among many nodes and arcs
Sought improvements:
More compact, intuitive, and explicit representation
Distinguish ‘hierarchical’ importance of artefacts, people
and their relations.
28. General Approach
Obtain a bipartite socio-technical network
Compute socio-technical concept lattice
Apply formal concept analysis (FCA) theory
Use free tool ConExp (Concept Explorer)
Concept: clusters all artefacts associated to same
people
Hierarchy: partial ordering of clusters
Study different and evolving socio-technical
relations
Repeat for various relations and system releases
29. Case study
Requirements:
Should have non-trivial social and technical
structure
Should not have fluid social structure
Should provide different data sources (not just
code)
Eclipse
Has IBM lead and Bugzilla repository
30. The socio-technical network (1)
Build PBC network
P nodes: 16,025 people
B nodes: 101,966 Eclipse SDK bug reports
C nodes: 16 Eclipse SDK components
p-b arc: p reported/assigned to/discussed b
b-c arc: b is reported for c
Repeat for various releases and roles
31. The socio-technical network (2)
Build the PC network
Folding of PBC, i.e. p-c arc with weight w
person p is associated to w reports for component c
w = number of paths from p to c
Build the PC(k) network
Remove all arcs with weight < k
Remove all weight information
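The folding and thresholding steps can be sketched as follows. The function names, the representation of arcs as pairs, and the toy people, reports and components are assumptions for illustration only.

```python
from collections import Counter


def fold(pb_arcs, bc_arcs):
    """Fold a PBC network into weighted p-c arcs.

    The weight of (p, c) is the number of paths p -> b -> c, i.e. the number
    of reports for component c that person p is associated to.
    """
    components_of = {}
    for report, component in bc_arcs:
        components_of.setdefault(report, set()).add(component)
    weights = Counter()
    for person, report in set(pb_arcs):
        for component in components_of.get(report, ()):
            weights[(person, component)] += 1
    return weights


def pc_k(weights, k):
    """PC(k): drop arcs with weight < k, then drop the weights themselves."""
    return {arc for arc, weight in weights.items() if weight >= k}


# Hypothetical toy data, not taken from the Eclipse Bugzilla repository.
pb = [("ann", "b1"), ("ann", "b2"), ("ann", "b3"), ("bob", "b2"), ("bob", "b3")]
bc = [("b1", "ui"), ("b2", "ui"), ("b3", "core")]
weights = fold(pb, bc)
network = pc_k(weights, 2)
print(dict(weights), network)
```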
32. Formal Concept Analysis
Given objects O, attributes A and a relation R ⊆ O × A
e.g. O = components, A = assignees
Concept c = (o ⊆ O, a ⊆ A)
each object in o has all attributes in a, and both o and a are maximal
o is the extent and a is the intent of the concept
Hierarchy: (o, a) ≤ (o’, a’) if o ⊆ o’ (or a’ ⊆ a)
From top to bottom: extent decreases, intent increases
Socio-technical concept lattice
Usually, people at level n (bottom=0) associated to n components
‘specialists’ at lower, ‘generalists’ at upper levels
Each node includes all its ancestors’ people and all its
descendants’ components
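The definitions above can be illustrated with a naive concept enumeration. This is not how ConExp works internally (that is not described here); it is a small sketch that closes the objects' attribute sets under intersection. The function name and the toy context, with components as objects and assignees as attributes, are assumptions.

```python
def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    context: dict mapping each object to its set of attributes.
    Intents are the full attribute set plus all intersections of the
    objects' attribute sets; each intent determines its extent.
    """
    all_attributes = set().union(*context.values())
    intents = {frozenset(all_attributes)}
    for attributes in context.values():
        intents |= {frozenset(attributes) & intent for intent in intents}
    result = []
    for intent in intents:
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        result.append((extent, intent))
    # Order from bottom (smallest extent) to top (largest extent).
    return sorted(result, key=lambda c: (len(c[0]), sorted(c[0])))


# Hypothetical toy context: components and the people assigned to them.
toy_context = {
    "ui": {"ann", "bob"},
    "core": {"bob"},
    "debug": {"bob", "cid"},
}
lattice = concepts(toy_context)
for extent, intent in lattice:
    print(sorted(extent), sorted(intent))
```

In the toy context the top concept has all three components as extent and `bob` as its only intent, which makes `bob` the single 'generalist'.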
33. Release 1.0, assignees, k=10
USA coordinating
2 Canadian teams?
only 4 ‘generalists’
(2 components each)
the French team
only 1 developer associated:
what if they leave project?
the Swiss team
most developers associated:
is this largest or most
complex component?
34. Release 3.0, assignees, k=100
only 2 ‘generalists’
(3 components each)
Common developers:
highly dependent components?
Used higher k because bug reports accumulate over time
Geographical and workload distribution like release 1.0
35. Release 3.0, discussants, k=100
Developers discuss more
components than they
are assigned to: due to
dependencies?
Developers don’t discuss all reports they are assigned to
36. Conclusions
Novel application of Formal Concept Analysis
Clustering and ordering of socio-technical relations
General tool-supported approach
Some advantages over bi-partite graphs
More scalable: not one node per person and artefact
More explicit: related people & artefacts in same node
More intuitive: uniform vertical layout & arc semantics
Helps spot expertise and potential problems
Generalist and specialist people
Artefacts with too many or too few people associated
Undesired or absent communication/coordination
37. Concluding conclusions
Software engineering is an inherently socio-technical
endeavour
Availability of FLOSS projects allows the study of
historical heterogeneous data
Used process and artefact data to present different
views on same case study
Evolution of architecture
Hierarchy of maintainers
Impact of dependencies
Opportunities for many studies, mining and
visualisation techniques that can help academics,
developers and managers