3. There are two ways of constructing
a software design.
“
4. There are two ways of constructing
a software design.
“
One way is to make it
so simple that there are
obviously no deficiencies
5. There are two ways of constructing
a software design.
“
One way is to make it
so simple that there are
obviously no deficiencies
and the other way is to make it so
complicated that there are no
obvious deficiencies.
6. “
The first method is far more difficult.
One way is to make it
so
obviously
7. “One way is to make it
so
obviously
The first method is far more difficult.
It demands the same skill, devotion, insight,
and even inspiration as the discovery of the
simple physical laws which underlie the
complex phenomena of nature.
8. “One way is to make it
so
obviously
The
It demands the same skill, devotion, insight,
and even inspiration as the discovery of the
simple physical laws which underlie the
complex phenomena of nature.
12. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 4
City Metaphor
13. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
The city metaphor
3
domain mapping
class building
package district
system city
City Metaphor
domain mapping
class building
14. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
The city metaphor
3
domain mapping
class building
package district
system city
City Metaphor
domain mapping
class building
package district
15. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 3
domain mapping
class building
package district
system city
City Metaphor
domain mapping
class building
package district
system city
16. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 3
domain mapping
class building
package district
system city
City Metaphor
domain mapping
class building
package district
system city
number of methods (NOM) height
number of attributes (NOA) base size
17. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 3
domain mapping
class building
package district
system city
City Metaphor
domain mapping
class building
package district
system city
number of methods (NOM) height
number of attributes (NOA) base size
nesting level color
18. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
ArgoUML
19. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
ArgoUML
Skyscrapers
20. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
ArgoUML
Skyscrapers
Parking
Lots
21. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
ArgoUML
Skyscrapers
Parking
Lots Office
Buildings
22. Richard Wettel and Michele Lanza Visual Exploration of Larg
ArgoUM
JHotDr
iText
Jmol
JDK 1.5
Moose
Jun
CodeC
ScumV
JDK 1.5
23. MooseRichard Wettel and Michele Lanza Visual Exploration of L
iText
Jmo
JDK
Moo
Jun
Cod
Scum
24.
25. MooseRichard Wettel and Michele Lanza Visual Exploration of L
iText
Jmo
JDK
Moo
Jun
Cod
Scum
26. Software Systems as Cities: A Controlled Experiment
Richard Wettel and Michele Lanza
REVEAL @ Faculty of Informatics
University of Lugano
{richard.wettel,michele.lanza}@usi.ch
Romain Robbes
PLEIAD @ DCC
University of Chile
rrobbes@dcc.uchile.cl
ABSTRACT
Software visualization is a popular program comprehension tech-
nique used in the context of software maintenance, reverse engineer-
ing, and software evolution analysis. While there is a broad range
of software visualization approaches, only few have been empiri-
cally evaluated. This is detrimental to the acceptance of software
visualization in both the academic and the industrial world.
We present a controlled experiment for the empirical evaluation
of a 3D software visualization approach based on a city metaphor
and implemented in a tool called CodeCity. The goal is to provide
experimental evidence of the viability of our approach in the context
of program comprehension by having subjects perform tasks related
to program comprehension. We designed our experiment based on
lessons extracted from the current body of research. We conducted
the experiment in four locations across three countries, involving
41 participants from both academia and industry. The experiment
shows that CodeCity leads to a statistically significant increase in
terms of task correctness and decrease in task completion time. We
detail the experiment we performed, discuss its results and reflect
on the many lessons learned.
It has earned a reputation as an effective program comprehension
technique, and is widely used in the context of software maintenance,
reverse engineering and reengineering [2].
The last decade has witnessed a proliferation of software visual-
ization approaches, aimed at supporting a broad range of software
engineering activities. As a consequence, there is a growing need for
objective assessments of visualization approaches to demonstrate
their effectiveness. Unfortunately, only few software visualization
approaches have been empirically validated so far, which is detri-
mental to the development of the field [2].
One of the reasons behind the shortage of empirical evaluation lies
in the difficulty of performing a controlled experiment for software
visualization. On the one hand, the variety of software visualization
approaches makes it almost impossible to reuse the design of an
experiment from the current body of research. On the other hand, or-
ganizing and conducting a proper controlled experiment is, in itself,
a difficult endeavor, which can fail in many ways: data which does
not support a true hypothesis, conclusions which are not statistically
significant due to insufficient data points, etc.
We present a controlled experiment for the empirical evaluation
of a software visualization approach based on a city metaphor [16].
The aim is to show that our approach, implemented in a tool called
27. Software Systems as Cities: A Controlled Experiment
Richard Wettel and Michele Lanza
REVEAL @ Faculty of Informatics
University of Lugano
{richard.wettel,michele.lanza}@usi.ch
Romain Robbes
PLEIAD @ DCC
University of Chile
rrobbes@dcc.uchile.cl
ABSTRACT
Software visualization is a popular program comprehension tech-
nique used in the context of software maintenance, reverse engineer-
ing, and software evolution analysis. While there is a broad range
of software visualization approaches, only few have been empiri-
cally evaluated. This is detrimental to the acceptance of software
visualization in both the academic and the industrial world.
We present a controlled experiment for the empirical evaluation
of a 3D software visualization approach based on a city metaphor
and implemented in a tool called CodeCity. The goal is to provide
experimental evidence of the viability of our approach in the context
of program comprehension by having subjects perform tasks related
to program comprehension. We designed our experiment based on
lessons extracted from the current body of research. We conducted
the experiment in four locations across three countries, involving
41 participants from both academia and industry. The experiment
shows that CodeCity leads to a statistically significant increase in
terms of task correctness and decrease in task completion time. We
detail the experiment we performed, discuss its results and reflect
on the many lessons learned.
It has earned a reputation as an effective program comprehension
technique, and is widely used in the context of software maintenance,
reverse engineering and reengineering [2].
The last decade has witnessed a proliferation of software visual-
ization approaches, aimed at supporting a broad range of software
engineering activities. As a consequence, there is a growing need for
objective assessments of visualization approaches to demonstrate
their effectiveness. Unfortunately, only few software visualization
approaches have been empirically validated so far, which is detri-
mental to the development of the field [2].
One of the reasons behind the shortage of empirical evaluation lies
in the difficulty of performing a controlled experiment for software
visualization. On the one hand, the variety of software visualization
approaches makes it almost impossible to reuse the design of an
experiment from the current body of research. On the other hand, or-
ganizing and conducting a proper controlled experiment is, in itself,
a difficult endeavor, which can fail in many ways: data which does
not support a true hypothesis, conclusions which are not statistically
significant due to insufficient data points, etc.
We present a controlled experiment for the empirical evaluation
of a software visualization approach based on a city metaphor [16].
The aim is to show that our approach, implemented in a tool called
solutions to the same task shows that the systems are quite different.
Experimenter effect. One of the experimenters is also the author
of the approach and of the tool. This may have influenced any
subjective aspect of the experiment. One instance of this threat
is that the task solutions may not have been graded correctly. To
mitigate this threat, the three authors built a model of the answers
and a grading scheme and then reached consensus. Moreover, the
grading was performed in a similar manner and two of the three
experimenters graded the solutions blinded, i.e., without knowing
the treatments (e.g., tool) used to obtain the solutions. Even if we
tried to mitigate this threat extensively, we cannot exclude all the
possible influences of this factor on the results of the experiment.
11. CONCLUSION
We presented a controlled experiment aimed at evaluating our
software visualization approach based on a city metaphor. We
designed our experiment from a list of desiderata built during an
extensive study of the current body of research. We recruited large
samples of our target population, with subjects from both academia
and industry with a broad range of experience.
The main result of the experiment is that our approach leads
to an improvement, in both correctness (+24%) and completion
time (-12%), over the state-of-the-practice exploration tools. This
result is statistically significant. We believe this is due to both the
visualization as such, as well as the metaphor used by CodeCity, but
we can not measure the exact contribution of each factor.
Apart from an aggregated analysis, we performed a detailed anal-
28.
29. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 12
Age Maps
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Glimpse in the past
entities colored according to age
Principle of age map
11
1N
age (number of survived versions)
new-bornveteran
30. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Refining the granularity
13
NOM=7
NOA
=
2
NOA = 2
class C
class C
m4
m1
m3
m5
m7
m6
m2
old
new
Granularity
31. Granularity
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 14
library packages:
java
javax
junit
org.w3c.dom
(classes) AllTests
CH.ifa.draw.standard
CH.ifa.draw.framework
CH.ifa.draw.figures
CH.ifa.draw.test
class DrawApplication
in CH.ifa.draw.application
class StandardDrawingView
in CH.ifa.draw.standard.
32. Granularity
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution 15
age: 1 2 3 4 5 6 7 8
stable
youngoldvery old
rarely updated
highly unstable
very old
updated often,
rather unstable
46. Fig. 6. View of the city with all the activities
SC+ST
SC+IE
IE+ST
SC+ST+IE
PresentPast
SC
IE
ST
V. TELLING EVOLUTIONARY STORIES
To assess the usefulness of our approach we inspected the
recent history of the PHARO system. In the following we
present four stories that narrate the evolution of the system
supported by our blended view.
Multiple Concerns
72. Dominant Tracks
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
73. Dominant Tracks
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
- Single Track
- Multi Track
- Fragmented Track
74. Dominant Tracks
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
- Single Track
- Multi Track
- Fragmented Track
75. Dominant Tracks
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
- Single Track
- Multi Track
- Fragmented Track
76. Dominant Tracks Track Flow
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
Describes the way the
developer alternates from
different window tracks
- Single Track
- Multi Track
- Fragmented Track
77. Dominant Tracks Track Flow
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
Describes the way the
developer alternates from
different window tracks
- Sequential Flow
- Ping-Pong
- Single Track
- Multi Track
- Fragmented Track
78. Dominant Tracks Track Flow
A track with a privileged
role in the development.
i.e., with predominant focus time
and concentration of edit events.
Describes the way the
developer alternates from
different window tracks
- Sequential Flow
- Ping-Pong
- Single Track
- Multi Track
- Fragmented Track
82. >50% sessions
are Single-Track
>65% sessions
are Fragmented
All prefer
Sequential flow
Has 43% Ping-
Pong sessions
Dominant Tracks Track Flow
83. Program understanding:
Challenge for the 1990s
In the Program Understanding Project at IBM's Re-search Division, work began in late 1986 on toolswhich could help programmers in two key areas: staticanalysis (reading the code) and dynamic analysis (run-ning the code). The work is reported in the companionpapers by Cleveland and by Pazel in this issue. Thehistory and background which motivated and whichled to the start of this research on tools to assistprogrammers in understanding existing program codeis reported here.
" If the poor workman hates his tools, the goodworkman hates poor tools. The work of theworkingman is, in a sense, defined by his tool-witness the way in which the tool is so often taken tosymbolize the worker:the tri-squarefor the carpenter,the trowelfor the mason, the transit for the surveyor,the camera for the photographer, the hammerfor thelaborer, and the sickle for the farmer.
"Working with defective or poorly designed tools,even the finest craftsman is reduced to producinginferior work, and is thereby reduced to being aninferior craftsman. No craftsman, ifhe aspires to thehighest work in his profession, will accept such tools;and no employer, if he appreciates the quality ofwork, will ask the craftsman to accept them. "I
Today a variety of motivators are causing corpora-tions to invest in software tools to increase softwareproductivity, including: (1) increased demand forsoftware, (2) limited supply of software engineers, (3)rising expectations of support from software engi-neers, and (4) reduced hardware costs.' A key moti-
294 CORBI
by T. A. Corbi
vator for software tools in the 1990swill be the resultof having software evolve over the previous decadesfrom several-thousand-line, sequential programmingsystems into multimillion-line, multitasking "busi-ness-critical" systems. As the programming systemswritten in the 1960s and 1970s continue to mature,the focus for software tools will shift from tools thathelp develop new programming systems to tools thathelp us understand and enhance aging programmingsystems.
In the 1970s, the work of Belady and Lehman"?strongly suggested that all large programs wouldundergo significant change during the in-servicephase of their life cycle, regardless of the a prioriintentions of the organization. Clearly, they wereright. As an industry, we have continued to growand change our large software systems to:
• Remove defects
• Address new requirements
• Improve design and/or performance
• Interface to new programs
• Adjust to changes in data structures or formats• Exploit new hardware and software features
As we extended the lifetimes of our systems bycontinuing to modify and enhance them, we also
e Copyright 1989by International BusinessMachines Corporation.Copying in printed form for private use is permitted withoutpayment of royalty provided that (I) each reproduction is donewithout alteration and (2)the Journalreference and IBM copyrightnotice are included on the first page. The title and abstract, but noother portions, of this paper may be copied or distributed royaltyfree without further permission by computer-based and otherinformation-service systems. Permission to republish any otherportion of this paper must be obtained from the Editor.
…up to 50%
84. write
read
navigate
user interface
program
understanding
workman hates his tools, the good
hates poor tools. The work of the
in a sense, defined by his tool-
in which the tool is so often taken to
orker:the tri-squarefor the carpenter,
e mason, the transit for the surveyor,
he photographer, the hammerfor the
sickle for the farmer.
defective or poorly designed tools,
craftsman is reduced to producing
nd is thereby reduced to being an
n. No craftsman, ifhe aspires to the
is profession, will accept such tools;
r, if he appreciates the quality of
e craftsman to accept them. "I
of motivators are causing corpora-
software tools to increase software
uding: (1) increased demand for
ed supply of software engineers, (3)
s of support from software engi-
uced hardware costs.' A key moti-
hat all large programsundergo significant change during the inphase of their life cycle, regardless of the aintentions of the organization. Clearly, theright. As an industry, we have continued tand change our large software systems to:
• Remove defects
• Address new requirements
• Improve design and/or performance
• Interface to new programs
• Adjust to changes in data structures or form• Exploit new hardware and software feature
As we extended the lifetimes of our systecontinuing to modify and enhance them, w
e Copyright 1989by International BusinessMachines CorpoCopying in printed form for private use is permittedpayment of royalty provided that (I) each reproductionwithout alteration and (2)the Journalreference and IBM conotice are included on the first page. The title and abstract,other portions, of this paper may be copied or distributedfree without further permission by computer-based andinformation-service systems. Permission to republish anyportion of this paper must be obtained from the Editor.
85. I Know What You Did Last Summer
An Investigation of How Developers Spend Their Time
Roberto Minelli, Andrea Mocci and Michele Lanza
REVEAL @ Faculty of Informatics — University of Lugano, Switzerland
Abstract—Developing software is a complex mental activity,
requiring extensive technical knowledge and abstraction capa-
bilities. The tangible part of development is the use of tools to
read, inspect, edit, and manipulate source code, usually through
an IDE (integrated development environment). Common claims
about software development include that program comprehension
takes up half of the time of a developer, or that certain UI
(user interface) paradigms of IDEs offer insufficient support to
developers. Such claims are often based on anecdotal evidence,
throwing up the question of whether they can be corroborated
on more solid grounds.
We present an in-depth analysis of how developers spend their
time, based on a fine-grained IDE interaction dataset consisting
of ca. 740 development sessions by 18 developers, amounting to
200 hours of development time and 5 million of IDE events. We
propose an inference model of development activities to precisely
measure the time spent in editing, navigating and searching for
artifacts, interacting with the UI of the IDE, and performing
corollary activities, such as inspection and debugging. We report
several interesting findings which in part confirm and reinforce
some common claims, but also disconfirm other beliefs about
software development.
I. INTRODUCTION
Software development is a complex activity that requires
both technical knowledge and extensive abstraction capabili-
ties [1]. Even if the outcome of software development is code,
the development process is far from being just code writing
[2]. In fact, software systems are so large and complex [3]
While interacting with IDEs, developers generate a large
amount of data [13], [14]. These interactions happen at differ-
ent levels of abstraction: they can be conceptual events, like
adding a method to a class, or low level events like pressing
keys or moving the mouse to navigate between entities. While
researchers believe that this data is valuable [14], [15], most
of these interactions are not captured.
Our aim is to record interaction data and measure the time
effectively devoted to different activities. With this data, we
provide insights on the distribution of development activities
and we quantitatively answer questions like:
1) What is the effective time spent in program comprehen-
sion/understanding? What about the other activities like
editing, inspecting and navigating through code entities?
2) How much time do developers spend in fiddling with the
UI of an IDE?
3) What is the impact of the fragmentation of the develop-
ment flow?
We present an in-depth analysis on how developers spend
their time. We collected fine-grained interaction data using
DFLOW [16], our non-intrusive profiler for the PHARO IDE1
.
Our dataset consists of 740 development sessions of 18 devel-
opers. DFLOW collected about 200 hours of development time,
amounting to more than 5 million of IDE events. We propose
96. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Fine-grained age map: JHotDraw
14
library packages:
java
javax
junit
org.w3c.dom
(classes) AllTests
CH.ifa.draw.standard
CH.ifa.draw.framework
CH.ifa.draw.figures
CH.ifa.draw.test
class DrawApplication
in CH.ifa.draw.application
class StandardDrawingView
in CH.ifa.draw.standard.
97. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Fine-grained age map: JHotDraw
14
library packages:
java
javax
junit
org.w3c.dom
(classes) AllTests
CH.ifa.draw.standard
CH.ifa.draw.framework
CH.ifa.draw.figures
CH.ifa.draw.test
class DrawApplication
in CH.ifa.draw.application
class StandardDrawingView
in CH.ifa.draw.standard.
10:20 20:12
3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33
Fig. 6. View of the city with all the activities
SC+ST
SC+IE
IE+ST
SC+ST+IE
PresentPast
SC
IE
ST
Fig. 5. Aging Process: Example in the Timeline
In the “present” interval (i.e., the one selected by the user),
colors are at their default saturation. In the “past” interval,
instead, the color saturation fades. At the end of this interval,
the nodes have the default color, i.e., light gray.
V. TELLING EVOLUTIONARY STORIES
To assess the usefulness of our approach we inspected the
recent history of the PHARO system. In the following we
present four stories that narrate the evolution of the system
supported by our blended view.
A. Those Awkward Neighbors
By selecting the full available timespan of the data we obtain
a visualization that displays all the activities that involved the
PHARO system over a period of five months. This enables
to obtain a comprehensive view of the system evolution and
derive long-term considerations and properties. Figure 6 shows
the overall view of the available data.
One interesting example is represented by what we call the
awkward neighbors, i.e.,, big but silent packages that have
little or no activity.
98. Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Decoding a city: ArgoUML
4
Richard Wettel and Michele Lanza Visual Exploration of Large-Scale System Evolution
Fine-grained age map: JHotDraw
14
library packages:
java
javax
junit
org.w3c.dom
(classes) AllTests
CH.ifa.draw.standard
CH.ifa.draw.framework
CH.ifa.draw.figures
CH.ifa.draw.test
class DrawApplication
in CH.ifa.draw.application
class StandardDrawingView
in CH.ifa.draw.standard.
10:20 20:12
3:00 6:00 18:35 21:00 23:00 45:43 48:00 51:00 54:00 56:33
@andreamocci
R AE E LV
@michelelanza
@richardwettel
@yuriy_tymchuk
@robertominelli
Fig. 6. View of the city with all the activities
SC+ST
SC+IE
IE+ST
SC+ST+IE
PresentPast
SC
IE
ST
Fig. 5. Aging Process: Example in the Timeline
In the “present” interval (i.e., the one selected by the user),
colors are at their default saturation. In the “past” interval,
instead, the color saturation fades. At the end of this interval,
the nodes have the default color, i.e., light gray.
V. TELLING EVOLUTIONARY STORIES
To assess the usefulness of our approach we inspected the
recent history of the PHARO system. In the following we
present four stories that narrate the evolution of the system
supported by our blended view.
A. Those Awkward Neighbors
By selecting the full available timespan of the data we obtain
a visualization that displays all the activities that involved the
PHARO system over a period of five months. This enables
to obtain a comprehensive view of the system evolution and
derive long-term considerations and properties. Figure 6 shows
the overall view of the available data.
One interesting example is represented by what we call the
awkward neighbors, i.e.,, big but silent packages that have
little or no activity.