SlideShare a Scribd company logo
1 of 11
Download to read offline
Does Distributed Development Affect Software Quality?
                        An Empirical Case Study of Windows Vista

    Christian Bird1 , Nachiappan Nagappan2 , Premkumar Devanbu1 , Harald Gall3 , Brendan Murphy2
                                       1
                                           University of California, Davis, USA
                                                  2
                                                    Microsoft Research
                                           3
                                             University of Zurich, Switzerland
          {cabird,ptdevanbu}@ucdavis.edu {nachin,bmurphy}@microsoft.com gall@ifi.uzh.ch



                        Abstract                                   ent in collocated teams such as delayed feedback, restricted
                                                                   communication, less shared project awareness, difficulty of
    It is widely believed that distributed software develop-       synchronous communication, inconsistent development and
ment is riskier and more challenging than collocated de-           build environments, lack of trust and confidence between
velopment. Prior literature on distributed development in          sites, etc. [22]. While there are studies that have examined
software engineering and other fields discuss various chal-         the delay associated with distributed development and the
lenges, including cultural barriers, expertise transfer dif-       direct causes for them [12], there is a dearth of empirical
ficulties, and communication and coordination overhead.             studies that focus on the effect of distributed development
We evaluate this conventional belief by examining the over-        on software quality in terms of post-release failures.
all development of Windows Vista and comparing the post-               In this paper, we use historical development data from
release failures of components that were developed in a dis-       the implementation of Windows Vista along with post-
tributed fashion with those that were developed by collo-          release failure information to empirically evaluate the hy-
cated teams. We found a negligible difference in failures.         pothesis that globally distributed software development
This difference becomes even less significant when control-         leads to more failures. We focus on post-release failures
ling for the number of developers working on a binary.             at the level of individual executables and libraries (which
We also examine component characteristics such as code             we refer to as binaries) shipped as part of the operating sys-
churn, complexity, dependency information, and test code           tem and use the IEEE definition of a failure as “the inability
coverage and find very little difference between distributed        of a system of component to perform its required functions
and collocated components. Further, we examine the soft-           within specified performance requirements” [16].
ware process and phenomena that occurred during the Vista              Using geographical and commit data for the developers
development cycle and present ways in which the develop-           that worked on Vista, we divide the binaries produced into
ment process utilized may be insensitive to geography by           those developed by distributed and collocated teams and ex-
mitigating the difficulties introduced in prior work in this        amine the distribution of post-release failures in both popu-
area.                                                              lations. Binaries are classified as developed in a distributed
                                                                   manner if at least 25% of the commits came from locations
                                                                   other than where binary’s owner resides. We find that there
1. Introduction                                                    is a small increase in the number of failures of binaries writ-
                                                                   ten by distributed teams (hereafter referred to as distributed
    Globally distributed software development is an increas-       binaries) over those written by collocated teams (collocated
ingly common strategic response to issues such as skill            binaries). However, when controlling for team size, the dif-
set availability, acquisitions, government restrictions, in-       ference becomes negligible. In order to see if only smaller,
creased code size, cost and complexity, and other resource         less complex, or less critical binaries are chosen for dis-
constraints [5, 10]. In this paper, we examine develop-            tributed development (which could explain why distributed
ment that is globally distributed, but completely within Mi-       binaries have approximately the same number of failures),
crosoft. This style of global development within a single          we examined many properties, but found no difference be-
company is to be contrasted with outsourcing which in-             tween distributed and collocated binaries. We present our
volves multiple companies. It is widely believed that dis-         methods and findings in this paper.
tributed collaboration imposes many challenges not inher-              In section 2 we discuss the motivation and background


                                                               1
of this work and present a theory of distributed development           3,000 developers.
in the context of the Windows Vista development process.            3. We examine many complexity and maintenance char-
Section 3 summarizes related work, including prior empiri-             acteristics of the distributed and collocated binaries
cal quantitative, qualitative, and case studies as well as the-        to check for inherent differences that might influence
oretical papers. An explanation of the data and analysis as            post-release quality.
well as the quantitative results of this analysis is presented      4. Our study examines a project in which all sites in-
in section 4. We discuss these results, compare with prior             volved are part of the same company and have been
work, and give possible explanations for them in section 5.            using the same process and tools for years.
Finally, we present the threats to the validity of our study
in section 6 and conclude our paper with further avenues of          There is a large body of theory describing the difficulties
study in section 7.                                               inherent in distributed development. We summarize them
                                                                  here.
2. Motivation and Contributions                                   Communication suffers due to a lack of unintended and in-
                                                                  formal meetings [11]. Engineers do not get to know each
    Distributed software development is a general concept         other on a personal basis. Synchronous communication be-
that can be operationalized in various ways. Development          comes less common due to time zone and language barriers.
may be distributed along many types of dimensions and             Even when communication is synchronous, the communi-
have various distinctive characteristics [9]. There are key       cation channels, such as conference calls or instant mes-
questions that should be clarified when discussing a dis-          saging are less rich than face to face and collocated group
tributed software project. Who or what is distributed and         meetings. Developers may take longer to solve problems
at what level? Are people or the artifacts distributed? Are       because they lack the ability to step into a neighboring of-
people dispersed individually or dispersed in groups?             fice to ask for help. They may not even know the correct
    In addition, it is important to consider the way that de-     person to contact at a remote site.
velopers and other entities are distributed. The distribu-        Coordination breakdowns occur due to this lack of commu-
tion can be across geographical, organizational, temporal,        nication and lower levels of group awareness [4, 1]. When
and stakeholder boundaries [15]. A scenario involving one         managers must manage across large distances, it becomes
company outsourcing work to another will certainly dif-           more difficult to stay aware of each person’s task and how
fer from multiple teams working within the same company.          they are interrelated. Different sites often use different tools
A recent special issue of IEEE Software focused on glob-          and processes which can also make coordinating between
ally distributed development, but the majority of the papers      sites difficult.
dealt with offshoring relationships between separate com-
panies and outsourcing, which is likely very different from       Diversity in operating environments may cause manage-
distributed sites within the same company [2, 6, 7]. Even         ment problems [1]. Often there are relationships between
within a company, the development may or may not span             the organization doing development and external entities
organizational structure at different levels. Do geographi-       such as governments and third party vendors. In a geo-
cal locations span the globe, including multiple time zones,      graphically dispersed project, these entities will be different
languages, and cultures or are they simply in different cities    based on location (e.g. national policies on labor practices
of the same state or nation?                                      may differ between the U.S. and India).
    We are interested in studying the effect of globally dis-     Distance can reduce team cohesion in groups collaborating
tributed software development within the same company             remotely [22]. Eating, sharing an office, or working late
because there are many issues involved in outsourcing that        together to meet a deadline all contribute to a feeling of be-
are independent of geographical distribution (e.g. expertise      ing part of a team. These opportunities are diminished by
finding, different process and an asymmetric relationship).        distance.
Our main motivation is in confirming or refuting the no-
                                                                  Organizational and national cultural barriers may com-
tion that global software development leads to more failures
                                                                  plicate globally distributed work [5]. Coworkers must be
within the context of our setting.
                                                                  aware of cultural differences in communication behaviors.
    To our knowledge, this is the first large scale distributed
                                                                  One example of a cultural difference within Microsoft be-
development study of its kind. This study augments the cur-
                                                                  came apparent when a large company meeting was origi-
rent body of knowledge and differs from prior studies by
                                                                  nally (and unknowingly) planned on a major national holi-
making the following contributions:
                                                                  day for one of the sites involved.
 1. We examine distributed development at multiple levels            Based on these prior observations and an examination of
    of separation (building, campus, continent, etc.).            the hurdles involved in globally distributed development we
 2. We examine a very large scale software development            expect that difficulties in communication and coordination
    effort, composed of over 4,000 binaries and nearly            will lead to an increase in the number of failures in code pro-
duce by distributed teams over code from collocated teams.         nication via instant messaging and group chat rather than
We formulate our testable hypothesis formally.                     telephones, better ways of identifying experts to ask for in-
    H1: Binaries that are developed by teams of engineers          formation, and shared calendars and other presence aware-
that are distributed will have more post-release failures          ness tools to facilitate contact with remote colleagues.
than those developed by collocated engineers.                          Panjer et al [23] observed a portion of a geographically
    We are also interested to see if the binaries that are dis-    distributed team of developers for two months in a com-
tributed differ from the collocated counterparts in any sig-       mercial setting and interviewed seven developers from the
nificant ways. It is possible that managers, aware of the           team. Developers reported difficulty coordinating with dis-
difficulties mentioned above, may choose to develop sim-            tributed coworkers and sometimes assigned MRs based on
pler, less frequently changing, or less critical software in       location. They also created explicit relationships between
a distributed fashion. We therefore present our second hy-         MRs to capture and manage technical information.
pothesis.                                                              Thanh et al [21] examined the effect of distributed devel-
    H2: Binaries that are distributed will be less complex,        opment on delay between communications and time to res-
experience less code churn, and have fewer dependencies            olution of work items in the IBM’s Jazz project, which was
than collocated binaries.                                          developed by developers at five globally distributed sites.
                                                                   They categorized work items based on the number of dis-
3. Related Work                                                    tributed sites that contributed to their resolution and exam-
                                                                   ined the mean and median time to resolution and time be-
                                                                   tween comments on each work item. While Kruskal-Wallis
    There is a wealth of literature in the area of globally dis-
                                                                   tests showed a statistically significant difference in the times
tributed software development. It has been the focus of mul-
                                                                   for items that were more distributed, the Kendall Tau corre-
tiple special issues of IEEE Software, workshops at ICSE
                                                                   lations of time to resolution and time between comments
and the International Conference on Global Software Engi-
                                                                   with number of sites was extremely low (below 0.1 in both
neering. Here we survey important work in the area, includ-
                                                                   cases). This indicates that the effect of distributed collabo-
ing both studies and theory of globally distributed work in
                                                                   ration does not have a strong effect.
software development.
    There have been a number of experience reports for                 In [13], Herbsleb and Mockus formulate an empirical
globally distributed software development projects at var-         theory of coordination in software engineering and test hy-
ious companies including Siemens [14], Alcatel [8], Mo-            potheses based on this theory. They precisely define soft-
torola [1], Lucent [11], and Philips [17].                         ware engineering as requiring a sequence of choices for all
                                                                   of the decisions associated with a project. Each decision
Effects on bug resolution                                          constrains the project and future decisions in some way un-
   In an empirical study of globally distributed software de-      til all choices have been made and the final product does
velopment [12], Herbsleb and Mockus examined the time to           or does not satisfy the requirements. It is therefore impor-
resolution of Modification Requests (MRs) in two depart-            tant that only feasible decisions (those which will lead to
ments of Lucent working on distinct network elements for           a project that does satisfy the requirements) be made. The
a telecommunication system. They found that when an en-            presented theory is used to develop testable hypotheses re-
gineer was blocked on a task due to information needs, the         garding productivity, measured as number of MRs resolved
average delay was .9 days for same site information transfer       per unit time. They find that people who are assigned work
and 2.4 days if the need crossed site-boundaries. An MR is         from many sources have lower productivity and that MRs
classified as ”single-site” if all of contributors were resided     that require work in multiple modules have a longer cycle
at one site and ”distributed” otherwise. The average time          time than those which require changes to just one.
needed to complete an ”single-site” MR was 5 days versus               Unlike the above papers, our study focuses on the ef-
12.7 for ”distributed”. When controlling for other factors         fect of distributed development on defect occurrence, rather
such as number of people working on an MR, how diffused            than on defect resolution time.
the changes are across the code base, size of the change, and
                                                                   Effects on quality and productivity
severity, the effect of being distributed was no longer sig-
nificant. They hypothesize that large and/or multi-module               Diomidis Spinellis examined the effect of distributed de-
changes are both more time consuming and more likely to            velopment on productivity, code style, and defect density in
involve multiple sites. In addition, these changes require         the FreeBSD code base [25]. He measured the geographi-
more people, which introduce delay. They conclude that             cal distance between developers, the number of defects per
distributed development indirectly introduces delay due to         source file, as well as productivity in terms of number of
correlated factors such as team size and breadth of changes        lines committed per month. A correlation analysis showed
required.                                                          that there is little, if any, relationship between geographic
   They also introduce practices that may mitigate the risks       distance of developers and productivity and defect density.
of distributed development. These include better commu-            It should be noted that this is a study of open source soft-
ware which is, by its very nature, distributed and has a very     Development Locations          Code Size       Defect Density
different process model from commercial software.                                              (KLOC C/C++)     ( defects/KLOC)
    Cusick and Prasad [6] examined the practices used by          Beijing, China                           57                0.7
                                                                  Arlington Heights, US                    54                0.8
Wolters Kluwer Corporate Legal Services when outsourc-
                                                                  Arlington Heights, US                    74                1.3
ing software development tasks and present their model for        Tokyo, Japan                             56                0.2
deciding if a project is offshorable and how to manage it ef-     Bangalore, India                       165                 0.5
fectively. Their strategies include keeping communication         Singapore                                45                0.1
channels open, using consistent development environments          Adelaide, Australia                      60                0.5
and machine configurations, bringing offshore project leads
onsite for meetings, developing and using necessary infras-         Table 1. Locations, code size, and defect density from
tructure and tools, and managing where the control and do-          Motorola’s 3G trial project for each site
main expertise lies. They also point out that there are some
drawbacks that are difficult to overcome and should be ex-          • Reduce national and organizational cultural distance
pected such as the need for more documentation, more plan-         • Reduce temporal distance
ning for meetings, higher levels of management overhead,
and cultural differences. This work was based on an off-             Nagappan et al investigated the influence of organiza-
shoring relationship with a separate vendor and not collab-      tional structure on software quality in Windows Vista [20].
oration between two entities within the same company. We         They found a strong relationship between how development
expect that the challenges faced in distributed development      is distributed across the organizational structure and number
may differ based on the type of relationship between dis-        of post-release failures in binaries shipped with the operat-
tributed sites.                                                  ing system. Along with other organizational measures, they
    Our study examines distributed development in the con-       measured the level of code ownership by the organization
text of one commercial entity, which differs greatly from        that the binary owner belonged to, the number of organiza-
both open source projects and outsourcing relationships.         tions that contributed at least 10% to the binary, and the or-
                                                                 ganizational level of the person whose reporting engineers
Issues and solutions                                             perform more than 75% of the edits. Our paper comple-
   In his book on global software teams [4], Carmel cate-        ments this study by examining geographically, rather than
gorizes project risk factors into four categories that act as    organizationally distributed development.
centrifugal forces that pull global projects apart. These are:
                                                                 4. Methods and Analysis
  •   Loss of communication richness
  •   Coordination breakdowns
                                                                    In this section we describe our methods of gathering data
  •   Geographic dispersion                                      for our study and the analysis methods used to evaluate our
  •   Cultural Differences                                       hypotheses regarding distributed development in Windows
                                                                 Vista.
   In 2001, Battin et al [1] discuss the challenges and their
solutions relative to each of Carmel’s categories in a large
                                                                 4.1. Data Collection
scale project implementing the 3G Trial (Third Generation
Cellular System) at Motorola. By addressing these chal-
lenges in this project, they found that globally distributed        Windows Vista is a large commercial software project in-
software development did not increase defect density, and        volving nearly 3,000 developers. It comprises over 3,300+
in fact, had lower defect density than the industrial aver-      unique binaries (defined as individual files containing ma-
age. Table 1 lists the various locations, the size of the code   chine code such as executables or a libraries) with a source
developed at those locations, and their defect density. They     code base of over 60 MLOC. Developers were distributed
summarize the key actions necessary for success with global      across 59 buildings and 21 campuses in Asia, Europe and
development in order of importance:                              North America. Finally, it was developed completely in-
                                                                 house without any outsourced elements.
  • Use Liaisons                                                    Our data is focused on 3 properties: code quality, ge-
  • Distribute entire things for entire lifecycle                ographical location, and code ownership. Our measure of
  • Plan to accommodate time and distance                        code quality is post-release failures, since these matter most
                                                                 to end-users. These failures are recorded for the six months
   Carmel and Agarwal [5] present three tactics for allevi-      following the release of Vista at the binary level as shown
ating distance in global software development, each with         in figure 1.
examples, possible solutions, and caveats:                          Geographical location information for each software de-
                                                                 veloper at Microsoft is obtained from the people manage-
  • Reduce intensive collaboration                               ment software at the time of release to manufacturing of
Windows Vista

                                        Collect product metrics                      6 months to
        Software


                                 (commits, code churn, complexity, etc.)            collect failures



                   Windows Server 2003 –                              Vista Release
                           SP1


                                                                 Time
                                               Figure 1. Data collection timeline

Vista. This data includes the building, campus, region,            tinction is that it is easy to travel between buildings on the
country, and continent information. While some develop-            same campus by foot while travel between campuses even
ers occasionally move, it is standard practice at Microsoft        in the same city requires a vehicle.
to keep a software engineer at one location during an entire       Locality: We use localities to represent groups of campuses
product cycle. Most of the 2,757 developers of Vista didn’t        that are geographically close to each other. For instance,
move during the observation period.                                the Seattle locality contains all of the campuses in western
    Finally we gathered the number of commits made by              Washington. It’s possible to travel within a locality by car
each engineer to each binary. We remove build engineers            on day trips, but travel between localities often requires air
from the analysis because their changes are broad, but not         travel and multi-day trips. Also, all sites in a particular lo-
substantive. Many files have fields that need to be updated          cality operate in the same time zone, making coordination
prior to a build, but the actual source code is not modified.       and communication within a locality easier than between
By combining this data with developer geographical data,           localities.
we determine the level of distribution of each binary and          Continent: All of the locations on a given continent fall
categorize these levels into a hierarchy. Microsoft practices      into this category. We choose to group at the continent level
a strong code ownership development process. We found              rather than the country level because Microsoft has offices
that on average, 49% of the commits for a particular binary        in Vancouver Canada and we wanted those to be grouped
can be attributed to one engineer. Although we are basing          together with other west coast sites (Seattle to Vancouver is
our analysis on data from the development phase, in most           less than 3 hours by road). If developers are located in the
cases, this is indicative of the distribution that was present     same continent, but not the same region, then it is likely that
during the design phase as well.                                   cultural similarities exists, but they operate in different time
    We categorized the distribution of binaries into the fol-      zones and rarely speak face to face.
lowing geographic levels. Our reasoning behind this classi-        World: Binaries developed by engineers on different conti-
fication is explained below.                                        nents are placed in this category. This level of geographical
                                                                   distribution means that face to face meetings are rare and
Building: Developers who work in the same building (and
                                                                   synchronous communication such as phone calls or online
often the same floor) will enjoy more face to face and infor-
                                                                   chats are hindered by time differences. Also, cultural and
mal contact. A binary classified at the building level may
                                                                   language differences are more likely.
have been worked on by developers on different floors of
the same building.                                                     For every level of geographical dispersion there are more
Cafeteria: Several buildings share a cafeteria. One cafete-        than two entities from the lower level within that level. That
ria services between one and five nearby buildings. Devel-          is, Vista was developed in more that three continents, local-
opers in different, but nearby buildings, may ”share meals”        ities, etc. Each binary is assigned the lowest level in the
together or meet by chance during meal times. In addi-             hierarchy from which at least 75% of the commits were
tion, the typically shorter geographical distance facilitates      made. Thus, if engineers residing in one region make at
impromptu meetings.                                                least 75% of the commits for a binary, but there is no cam-
Campus: A campus represents a group of buildings in one            pus that accounts for 75%, then the binary is categorized at
location. For instance, in the US, there are multiple cam-         the region level. This threshold was chosen based on results
puses. Some campuses reside in the same city, but the dis-         of prior work on development distributed across organiza-
North America
             Redmond        Charlotte                                                                                             Asia
                                                                       cmroute.dll
             Bldg 40                                                                                                      Hyderabad
            John 67                        Bldg AP2
                                           Sal  2                                                                           Bldg 1
            Tom 10                                               cmroute.cpp
                                                                                                                           Joe 16
            Zach 3
                                                                                                                           Bob 10
                                                                       cmroute.h                                           Sarah 9
             Bldg 26                                                                                                       Cindy 8
            Jack 2                                                                                                         Fred 7
                                       Silicon Valley                         etc.
            Lucy 2                                                                                                         Erin 6
                                             Bldg 5                                                                        Jeff 2
             Bldg 30                       Lynn 2
            Matt 2



   Figure 2. Commits to the library cmroute.dll. For clarity, location of anonymized developers is shown only in terms of continents,
   regions, and buildings.


tional boundaries that is standardized across Windows [20].                                           Levels of distribution
Figure 2 illustrates the geographic distribution of commits                               A               B        C              D               E
to an actual binary (with names anonymized). To assess the
sensitivity of our results to this selection and address any                   Building       Cafeteria       Campus   Locality       Continent       World
threats to validity we performed the analysis using thresh-
olds of 60%, 75%, 90%, and 100% with consistently similar
                                                                               Figure 4. distribution levels in Windows Vista
results.
   Note that whether a binary is distributed or not is orthog-             Figure 3 illustrates the hierarchy and shows the propor-
onal to the actual location where it was developed. For in-            tion of binaries that fall into each category. Note that the
stance, some binaries that are classified at the building level         majority of binaries have over 75% of their commits com-
were developed entirely in a building in Hyderabad, India              ing from just one building. The reason that so few binaries
while others were owned in Redmond, Washington.                        fall into the continent level is that the Unites States is the
                                                                       only country which contains multiple regions. Although the
                                                                       proportion of binaries categorized above the campus level is
                                                                       barely 10%, this still represents a sample of over 380 bina-
                                                                       ries; enough for a strong level of statistical power.
                             World 5.9%                                    We initially examined the number of binaries and distri-
                                                                       bution of failures for each level of our hierarchy. In addi-
                         Continent 0.2%                                tion, we divided the binaries into ”distributed” and ”col-
                                                                       located” categories in five different ways using splits A
                             Locality 5.6%
                                                                       through E as shown in figure 4 (e.g. split B categorizes
                                                                       building and cafeteria level binaries as collocated and the
                                                                       rest as distributed). This was performed to see if there is
                             Campus 17%
                                                                       a level of distribution above which there is a noticeable in-
                             Cafeteria 2.3%                            crease in failures.
                                                                           These categorizations are used to determine if there is
                                                                       a level of distributedness above which there is a significant
                                Building                               increase in the number of failures. The results from analysis
                                 68%                                   of these dichotomized data sets were consistent in nearly all
                                                                       respects. We therefore present the results of the first data set
                                                                       and point out deviations between the data sets where they
                                                                       occurred.

                                                                       4.2. Experimental Analysis

                                                                          In order to test our hypothesis about the difference in
   Figure 3. Hierarchy of distribution levels in Windows               code quality between distributed and collocated develop-
   Vista
ment, we examined the distribution of the number of post-         Model 2 F Statistic = 720.74, p < .0005
release failures per binary in both populations. Figure 5         Variable          % increase        Std Err.      Significance
shows histograms of the number of bugs for distributed            (Constant)                              0.25        p < .0005
and non-distributed binaries. Absolute numbers are omit-          distributed             4.6%            0.25         p = .056
ted from the histograms for confidentiality, but the horizon-      numdevs                                 0.00        p < .0005
tal and vertical scales are the same for both histograms. A
visual inspection indicates that although the mass is differ-
ent, with more binaries categorized as collocated than dis-       efficient for all models were below 17%, and dropped even
tributed, the distribution of failures are very similar.          further to below 9% when controlling for number of devel-
    A Mann-Whitney test was used to quantitatively measure        opers (many were below this value, but the numbers cited
the difference in means because the number of failures was        are upper bounds). In addition, the effect of distributed in
not normally distributed [18]. The difference in means is         models that accounted for the number of developers was
statistically significant, but small. While the average num-       only statistically significant when dividing binaries at the
ber of failures per binary is higher when the binary was dis-     continents level.
tributed, the actual magnitude of the increase is only about         We also used linear regression to examine the effect of
8%. In a prior study by Herbsleb and Mockus [12], time            the level of distribution on the number of failures of a bi-
to resolution of modification requests was positively corre-       nary. Since the level of distribution is a nominal variable
lated with the level of distribution of the participants. After   that can take on six different values, we encode it into five
further analysis, they discovered that the level of distribu-     binary variables. The variable diff buildings is 1 if the bi-
tion was not significant when controlling for the number of        nary was distributed among different buildings that all were
people participating. We performed a similar analysis on          served by the same cafeteria and 0 otherwise. A value of 1
ourWe used linear regression to examine the effect of dis-
     data.                                                        for diff cafeterias means that the binary was developed by
tributed development on number of failures. Our initial           engineers who were served by multiple cafeterias, but were
model contained only the binary variable indicating whether       located on the same campus, etc. The percentage increase
or not the binary was distributed. The number of develop-         for each diff represents the increase in failures relative to bi-
ers working on a binary was then added to the model and           naries that are developed by engineers in the same building.
we examined the coefficients in the model and amount of
variance in failures explained by the predictor variables. In     Model 3 F Statistic = 25.48, p < .0005
these models, distributed is a binary variable indicating if      Variable            % increase       Std Err.     Significance
the binary is distributed and numdevs is the number of de-        (Constant)                               0.09       p < .0005
velopers that worked on the binary. We show here the re-          diff buildings            15.1%          0.50       p < .0005
sults of analysis when splitting the binaries at the regions      diff cafeterias           16.3%          0.21       p < .0005
level. The F-statistic and p value show how likely the null       diff campuses             12.6%          0.35       p < .0005
hypothesis (the hypothesis that the predictor variable has no     diff localities            2.6%          1.47        p = .824
effect on the response variable) is. We give the percentage       diff continents           -5.1%          0.31        p = .045
increase in failures when the binaries are distributed based
on the parameter values. As numdevs is only included in              The parameter estimates of the model indicate that bi-
the models to examine effect of distribution when control-        naries developed by engineers on the same campus served
ling for number of developers we do not include estimates         by different cafeterias have, on average, 16% more post-
or percentage increase.                                           release failures than binaries developed in the same build-
    In models 1 - 4 we give the percentage increase in fail-      ing. Interestingly, the change in number of failures is quite
ures when the binaries are distributed based on the parame-       low for those developed in multiple regions and continents.
ter values also.                                                  However, when controlling for development team size, only
                                                                  binaries categorized at the levels of different cafeterias and
Model 1 F Statistic = 12.43, p < .0005
                                                                  different campuses show a statistically significant increase
Variable         % increase        Std Err.      Significance      in failures over binaries developed in the same building.
(Constant)                             0.30        p < .0005      Even so, the actual effects are relatively minor (4% and 6%
distributed             9.2%           0.31        p < .0005
                                                                  respectively).
                                                                     Two important observations can be made from these
   This indicates that on average, a distributed binary has       models. The first is that the variance explained by the pre-
9.2% more failures than a collocated binary. However, the         dictor variables (as measured in the adjusted R2 value) for
result changes then controlling for the number of developers      the built models rises from 2% and 4% (models 1 and 3)
working on a binary.                                              to 33% (models 2 and 4) when adding the number of de-
   We performed this analysis on all five splits of the bina-      velopers. The second is that when controlling for the num-
ries as shown in figure 4. The estimates for distributed co-       ber of developers, not all levels of distribution show a sig-
Post-Release Failures

                                     Distributed                                         Collocated
        Number of Binaries




                                          Failures                                        Failures
   Figure 5. Histograms of the number of failures per binary for distributed (left) and collocated (right) binaries. Although numbers
   are not shown on the axes, the scales are the same in both histograms.



Model 4 F Statistic = 242.73, p < .0005                                 about which binaries should be developed in a distributed
Variable                     % increase      Std Err.   Significance     manner. For instance, prior work has shown that the num-
(Constant)                                       0.09     p < .0005     ber of failures is highly correlated with code complexity and
diff buildings                    2.6%           0.42      p = .493     number of dependencies [19, 26]. Therefore, it is possible
diff cafeterias                   3.9%           0.18      p = .016     that in an effort to mitigate the perceived dangers of dis-
diff campuses                     6.3%           0.29      p = .019     tributed development, only less complex binaries or those
diff localities                   8.3%           1.23      p = .457     with less dependencies were chosen.
diff continents                  -3.9%           0.26      p = .101
                                                                            We also gathered metrics for each of the binaries in an
numdevs                                          0.00     p < .0005
                                                                        attempt to determine if there is a difference in the nature
                                                                        of binaries that are distributed. These measures fall into 5
                                                                        broad categories.
nificant effect, but the increase in post-release failures for
those that do is minimal with values at or below 6%. To put             Size & Complexity: Our code size and complexity mea-
this into perspective, a binary with 4 failures if collocated           sures include number of independent paths through the
would have 4.24 failures if distributed. Although our re-               code, number of functions, classes, parameters, blocks,
sponse variable is different from Herbsleb and Mockus, our              lines, local and global variables, and cyclomatic complex-
findings are consistent with their result that when control-             ity. From the call graph we extract the fan in and fan out
ling for the number of people working on a development                  of each function. For object oriented code we include mea-
task, distribution does not have a large effect. Based on               sures of class coupling, inheritance depth, the number of
these results, we are unable to reject the null hypothesis and          base classes, subclasses and class methods, and the number
H1 is not confirmed.                                                     of public, protected, and private data members and methods.
   This leads to the surprising conclusion that in the context          All of these are measured as totals for the whole binary and
in which Windows Vista was developed, teams that were                   as maximums on a function or class basis as applicable.
distributed wrote code that had virtually the same number               Code Churn: As measures of code churn we examine the
of post-release failures as those that were collocated.                 change in size of the binary, the frequency of edits and the
                                                                        churn size in terms of lines removed, added, and modified
4.3. Differences in Binaries                                             from the beginning of Vista development to RTM.
                                                                        Test Coverage: The number of blocks and arcs as well as
    One possible explanation for this lack of difference in             the block coverage and arc coverage are recorded during the
failures could be that distributed binaries are smaller, less           testing cycle for each binary.
complex, have fewer dependencies, etc. Although the num-                Dependencies: Many binaries have dependencies on one
ber of failures changes only minimally when the binaries are            another (in the form of method calls, data types, registry
distributed, we are interested in the differences in character-         values that are read or written, etc.). We calculate the num-
istics between distributed and non-distributed binaries. This           ber of direct incoming and outgoing dependencies as well
was done to determine if informed decisions were made                   as the transitive closer of these dependencies. The depth in
the dependency graph is also recorded.                            in culture and business context. These observations come
People: We include a number of statistics on the people and       from discussions from management as well as senior and
organizations that worked on the binaries. These include all      experienced employees at many of the development sites.
of the metrics in our prior organizational metrics paper [20]
                                                                  Relationship Between Sites: Much of the work on
such as the number of engineers that worked on the binary.
                                                                  distributed development examines outsourcing relation-
   We began with a manual inspection of the 20 binaries
                                                                  ships [7, 2]. Other work has looked at strategic partner-
with the least and 20 binaries with the most number of post-
                                                                  ships between companies or scenarios in which a foreign
release failures in both the distributed and non-distributed
                                                                  remote site was recently acquired [11]. All of these create
categories and examined the values of the metrics described
                                                                  situations where there the relationships are asymmetric and
above. The only discernible differences were metrics rela-
                                                                  engineers at different sites may feel competitive or may for
tive to the number of people working on the code, such as
                                                                  other reasons be less likely to help each other. In our situ-
team size.
                                                                  ation, each of the sites has existed for a long time and has
 Metric              Avg Value     Correlation    Significance
                                                                  worked on software together for many years. There is no
 Functions              895.86           0.114      p < .0005
 Complexity            4603.20           0.069      p < .0005
                                                                  threat that if one site performs better, the other will be shut
 Churn Size              53430           0.057       p = .033     down. The pay scale and benefits to employees are equiva-
 Edits                    63.82          0.134      p < .0005     lent at all sites in the company.
 Indegree                 13.04         -0.024       p = .363     Cultural Barriers: In a study of distributed development
 Outdegree                 9.67          0.100      p < .0005     within Lucent at sites in Great Britain and Germany, Herb-
 Number of Devs           21.55          0.183      p < .0005
                                                                  sleb and Grinter [11] found that significant national cultural
    We evaluated the effect of these metrics on level of dis-     barriers existed. These led to a lack of trust between sites
tribution in the entire population by examining the spear-        and misinterpreted actions due to lack of cultural context.
man rank correlation of distribution level of binaries (not       This problem was alleviated when a number of engineers
limited to the ”top 20” lists) with the code metrics. Most        (liaisons) from one site visited another for an extended pe-
metrics had correlation levels below 0.1 and the few that         riod of time. Battin et al. [1] found that when people from
were above that level, such as number of engineers never          different sites spent time working together in close prox-
exceeded 0.25. Logistic regression was used to examine            imity, many issues such as trust, perceived ability, delayed
the relationship of the development metrics with distribu-        response to communication requests, etc. were assuaged.
tion level. The increase in classification accuracy between a          A similar strategy was used during the development of
naive model including no independent variables and a step-        Vista. Development occurred mostly in the US (predom-
wise refined model with 15 variables was only 4%. When             inantly in Redmond) and Hyderabad, India. In the initial
removing data related to people that worked on the source,        development phases, a number of engineers and executives
the refined model’s accuracy only improved 2.7% from the           left Redmond to work at the Indian site. These people had a
naive model. We include the average values for a repre-           long history, many with 10+ years within Microsoft. They
sentative sample of the metrics along with a spearman rank        understood the company’s development process and had
correlation with the level of distribution for the binaries and   domain expertise. In addition, the majority of these em-
the significance of the correlation. Although the p-values         ployees were originally from India, removing one key chal-
are quite low, the magnitude of the correlation is small.         lenge from globally distributed work. These people could
This is attributable to the very large sample of binaries (over   therefore act as facilitators, information brokers, recom-
3,000).                                                           menders, and cultural liaisons [5] and had already garnered
    All of these results lead to the conclusion that there is     a high level of trust and confidence from the engineers in
no discernible difference in the measured metrics between         the US. Despite constituting only a small percent of the In-
binaries that are distributed and those that aren’t.              dian workforce, they helped to reduce both organizational
                                                                  and national cultural distances [5].
5. Discussion                                                     Communication: Communication is the single most refer-
                                                                  enced problem in globally distributed development. Face
    We have presented an unexpected, but very encourag-           to face meetings are difficult and rare and people are less
ing result: it is possible to conduct in-house globalized dis-    likely to communicate with others that they don’t know per-
tributed development without adversely impacting quality.         sonally. In addition, distributed sites are more likely to
It is certainly important to understand why this occurred,        use asynchronous communication channels such as email
and how this experience can be repeated in other projects         which introduce a task resolution delay [24]. A prior study
and contexts. To prime this future endeavor, in this section,     of global software servicing within Microsoft has examined
we make some observations concerning pertinent practices          the ways in which distributed developers overcome com-
that could have improved communication, co-ordination,            munication difficulties and information needs through the
team cohesion, etc., and reduced the impact of differences        use of knowledge bases and requesting help from local and
remote personnel [3].                                             dispersed developers to be more integrated into the com-
   The Vista developers made heavy use of synchronous             pany and the project. The manager can act as a facilitator
communication daily. Employees took on the responsibility         between engineers who may be familiar with one another
of staying at work late or arriving early for a status confer-    and can also spot problems due to poor coordination ear-
ence call on a rotating basis, changing the site that needed to   lier than in an organizational structure based on geography
keep odd hours every week. Keeping in close and frequent          where there is little coupling between sites. Prior work has
contact increases the level of awareness and the feeling of       shown that organizationally distributed development dra-
”teamness” [1, 5]. This also helps to convey status and re-       matically affects the number of post-release defects [20].
solve issues quickly before they escalate. In addition, En-       This organizational integration across geographic bound-
gineers also regularly traveled between remote sites during       aries reconciles the results of that work with the conclusions
development for important meetings.                               reached in this study. In addition, because the same process
Consistent Use of Tools: Both Battin [1] and Herbsleb [12]        has been used in all locations of the company for some time,
cite the importance of the configuration management tools          organizational culture is fairly consistent across geography.
used. In the case of Motorola’s project, a single, distributed
configuration management tool was used with great suc-             6. Threats to Validity
cess. At Lucent, each site used their own management tools,
which led to an initial phase of rapid development at the         Construct Validity
cost of very cumbersome and inefficient integration work               The data collection on a system the size of Windows
towards the end. Microsoft employs the use of one config-          Vista is automated. Metrics and other data were collected
uration management and build system throughout all of its         using production level quality tools and we have no reason
sites. Thus every engineer is familiar with the same source       to believe that there were large errors in measurement.
code management tools, development environment, docu-             Internal Validity
mentation method, defect tracking system, and integration             In section 5 we listed observations about the distributed
process. In addition, the integration process for code is in-     development process used at Microsoft. While we have rea-
cremental, allowing problems to surface before making it          son to believe that these alleviate the problems associated
into a complete build of the entire system.                       with distributed development, a causal relationship has not
End to End Ownership: One problem with distributed de-            been empirically shown. Further study is required to deter-
velopment is distributed ownership. When an entity fails,         mine to what extent each of these practices actually helps.
needs testing, or requires a modification, it may not be clear     In addition, although we attempted an exhaustive search of
who is responsible for performing the task or assigning the       differences in characteristics between distributed an collo-
work. Battin mentions ownership of a component for the            cated binaries, it’s possible that they differ in some way not
entire lifecycle as one of three critical strategies when dis-    measured by our analysis in section 4.3.
tributing development tasks. While there were a number            External Validity
of binaries that were committed to from different sites dur-          It is unclear how well our results generalize to other sit-
ing the implementation phase of Vista, Microsoft practices        uations. We examine one large project and there is a dearth
strong code ownership. Thus, one developer is clearly ”in         of literature that examines the effect of distributed devel-
control” of a particular piece of code from design, through       opment on post-release failures. We have identified simi-
implementation, and into testing and maintenance. Effort          larities in Microsoft’s development process with other suc-
is made to minimize the number of ownership changes that          cessful distributed projects, which may indicate important
occur during development.                                         principles and strategies to use. There are many ways in
Common Schedules: All of the development that we ex-              which distributed software projects may vary and the par-
amined was part of one large software project. The project        ticular characteristics must be taken into account. For in-
was not made up of distributed modules that shipped sepa-         stance, we have no reason to expect that a study of an out-
rately. Rather Vista had a fixed release date for all parties      sourced project would yield the same results as ours. It is
and milestones were shared across all sites. Thus all engi-       also not clear how these results relate to defect fixing in the
neers had a strong interest in working together to accom-         distributed world as mentioned in prior work[3].
plish their tasks within common time frames.
Organizational Integration: Distributed sites in Microsoft        7. Conclusion
do not operate in organizational isolation. There is no top
level executive at India or China that all the engineers in          In our study we divide binaries based on the level of ge-
those locations report to. Rather, the organizational struc-      ographic dispersion of their commits. We studied the post-
ture spans geographical locations at low levels. It is not un-    release failures for the Windows Vista code base and con-
common for engineers at multiple sites may to have a com-         cluded that distributed development has little to no effect.
mon direct manager. This, in turn, causes geographically          We posit that this negative result is a significant finding as
it refutes, at least in the context of Vista development, con-     [13] J. D. Herbsleb and A. Mockus. Formulation and prelimi-
ventional wisdom and widely held beliefs about distributed              nary test of an empirical theory of coordination in software
development. When coupled with prior work, [1, 12] our                  engineering. In ESEC/FSE-11: Proceedings of 11th ACM
results support the conclusion that there are scenarios in              SIGSOFT International Symposium on Foundations of Soft-
which distributed development can work for large software               ware Engineering, pages 138–137, Helsinki, Finland, 2003.
                                                                   [14] J. D. Herbsleb, D. J. Paulish, and M. Bass. Global software
projects. Based on earlier work [20] our study shows that
                                                                        development at siemens: experience from nine projects. In
Organizational differences are much stronger indicators of              Proceedings of the 27th International Conference on Soft-
quality than geography. An Organizational compact but ge-               ware Engineering, pages 524–533. ACM, 2005.
ographically distributed project might be better than an ge-       [15] H. Holmstrom, E. Conchuir, P. Agerfalk, and B. Fitzgerald.
ographically close organizationally distributed project. We             Global software development challenges: A case study on
have presented a number of observations about the devel-                temporal, geographical and socio-cultural distance. Pro-
opment practices at Microsoft which may mitigate some of                ceedings of the IEEE International Conference on Global
the hurdles associated with distributed development, but no             Software Engineering (ICGSE’06)-Volume 00, pages 3–11,
causal link has been established. There is a strong simi-               2006.
larity between these practices and those that have worked          [16] IEEE. IEEE Standard 982.2-1988, IEEE Guide for the Use
                                                                        of IEEE Standard Dictionary of Measures to Produce Reli-
for other teams in the past [1] as well as solutions proposed
                                                                        able Software. 1988.
in other work [11]. Directly examining the effects of these        [17] R. Kommeren and P. Parviainen. Philips experiences in
practices is an important direction for continued research in           global distributed software development. Empirical Soft-
globally distributed software development.                              ware Engineering, 12(6):647–660, 2007.
                                                                   [18] H. B. Mann and W. D. R. On a test of whether one of two
References                                                              random variables is stochastically larger than the other. An-
                                                                        nals of Mathematical Statistics, 18(1):50–60, 1947.
                                                                   [19] N. Nagappan, T. Ball, and A. Zeller. Mining metrics to pre-
 [1] R. D. Battin, R. Crocker, J. Kreidler, and K. Subramanian.         dict component failures. In Proceedings of the International
     Leveraging resources in global software development. IEEE          Conference on Software Engineering, pages 452–461, 2006.
     Software, 18(2):70–77, March/April 2001.                      [20] N. Nagappan, B. Murphy, and V. Basili. The influence of
 [2] J. M. Bhat, M. Gupta, and S. N. Murthy. Overcoming                 organizational structure on software quality: An empirical
     requirements engineering challenges: Lessons from off-             case study. In ICSE ’08: Proceedings of the 30th interna-
     shore outsourcing. IEEE Software, 23(6):38–44, Septem-             tional conference on Software engineering, pages 521–530,
     ber/October 2006.                                                  Leipzig, Germany, 2008.
 [3] S. Bugde, N. Nagappan, S. Rajamani, and G. Ramalingam.        [21] T. Nguyen, T. Wolf, and D. Damian. Global software devel-
     Global software servicing: Observational experiences at mi-        opment and delay: Does distance still matter? In Proceed-
     crosoft. In IEEE International Conference on Global Soft-          ings of the International Conference on Global Software En-
     ware Engineering, 2008.                                            gineering, 2008.
 [4] E. Carmel. Global Software Teams: Collaborating Across
                                                                   [22] G. M. Olson and J. S. Olson. Distance matters. Human-
     Borders and Time Zones. Prentice Hall, 1999.
                                                                        Computer Interaction, 15(2/3):139–178, 2000.
 [5] E. Carmel and R. Agarwal. Tactical approaches for allevi-
                                                                   [23] L. D. Panjer, D. Damian, and M.-A. Storey. Cooperation
     ating distance in global software development. IEEE Soft-
                                                                        and coordination concerns in a distributed software devel-
     ware, 2(18):22–29, March/April 2001.
                                                                        opment project. In Proceedings of the 2008 International
 [6] J. Cusick and A. Prasad. A practical management and engi-
                                                                        Workshop on Cooperative and Human Aspects of Software
     neering approach to offshore collaboration. IEEE Software,
                                                                        Engineering, pages 77–80. ACM, 2008.
     23(5):20–29, September/October 2006.
 [7] K. C. Desouza, Y. Awaza, and P. Baloh. Managing knowl-        [24] M. Sosa, S. Eppinger, M. Pich, D. McKendrick, S. Stout,
     edge in global software development efforts: Issues and            T. Manage, and F. Insead. Factors that influence techni-
     practices. IEEE Software, 23(5):30–37, Sept/Oct 2006.              cal communication in distributed product development: an
 [8] C. Ebert and P. D. Neve. Surviving global software develop-        empirical study in the telecommunications industry. Engi-
     ment. IEEE Software, 18(2):62–69, 2001.                            neering Management, IEEE Transactions on, 49(1):45–58,
 [9] D. C. Gumm. Distribution dimensions in software develop-           2002.
     ment projects: a taxonomy. IEEE Software, 23(5):45–51,        [25] D. Spinellis. Global software development in the freebsd
     2006.                                                              project. In GSD ’06: Proceedings of the 2006 international
[10] J. Herbsleb. Global Software Engineering: The Future of            workshop on Global software development for the practi-
     Socio-technical Coordination. International Conference on          tioner, pages 73–79, Shanghai, China, 2006.
     Software Engineering, pages 188–198, 2007.                    [26] T. Zimmermann and N. Nagappan. Predicting defects using
[11] J. Herbsleb and R. Grinter. Architectures, coordination,           network analysis on dependency graphs. In Proceedings of
     and distance: Conway’s law and beyond. IEEE Software,              the 30th International Conference on Software Engineering,
     16(5):63–70, 1999.                                                 pages 531–540. ACM, 2008.
[12] J. Herbsleb and A. Mockus. An empirical study of speed
     and communication in globally distributed software de-
     velopment. IEEE Transactions on Software Engineering,
     29(6):481–94, 2003.

More Related Content

What's hot

Agile Knowledge Sharing
Agile Knowledge SharingAgile Knowledge Sharing
Agile Knowledge SharingCSCJournals
 
Recommender systems in the scope of opinion formation: a model
Recommender systems in the scope of opinion formation: a modelRecommender systems in the scope of opinion formation: a model
Recommender systems in the scope of opinion formation: a modelMarcel Blattner, PhD
 
Apeendices of systematic mapping study about teams in software engineering
Apeendices of systematic mapping study about teams in software engineeringApeendices of systematic mapping study about teams in software engineering
Apeendices of systematic mapping study about teams in software engineeringMarcos Cardoso
 
Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Wolfgang Reinhardt
 
Socially Aware Device To Device Communications
Socially Aware Device To Device CommunicationsSocially Aware Device To Device Communications
Socially Aware Device To Device Communicationsijtsrd
 
A Study on Managing Conflict among Women Employees in IT Sector Bangalore
A Study on Managing Conflict among Women Employees in IT Sector BangaloreA Study on Managing Conflict among Women Employees in IT Sector Bangalore
A Study on Managing Conflict among Women Employees in IT Sector BangaloreDr. Amarjeet Singh
 
Liberty university coms 560 final paper
Liberty university coms 560 final paperLiberty university coms 560 final paper
Liberty university coms 560 final paperleesa marteen
 
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...Connie White
 

What's hot (11)

Agile Knowledge Sharing
Agile Knowledge SharingAgile Knowledge Sharing
Agile Knowledge Sharing
 
Recommender systems in the scope of opinion formation: a model
Recommender systems in the scope of opinion formation: a modelRecommender systems in the scope of opinion formation: a model
Recommender systems in the scope of opinion formation: a model
 
50120130406031
5012013040603150120130406031
50120130406031
 
Hy2514281431
Hy2514281431Hy2514281431
Hy2514281431
 
Apeendices of systematic mapping study about teams in software engineering
Apeendices of systematic mapping study about teams in software engineeringApeendices of systematic mapping study about teams in software engineering
Apeendices of systematic mapping study about teams in software engineering
 
Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...Personal dashboards for individual learning and project awareness in social s...
Personal dashboards for individual learning and project awareness in social s...
 
Socially Aware Device To Device Communications
Socially Aware Device To Device CommunicationsSocially Aware Device To Device Communications
Socially Aware Device To Device Communications
 
DMDI
DMDIDMDI
DMDI
 
A Study on Managing Conflict among Women Employees in IT Sector Bangalore
A Study on Managing Conflict among Women Employees in IT Sector BangaloreA Study on Managing Conflict among Women Employees in IT Sector Bangalore
A Study on Managing Conflict among Women Employees in IT Sector Bangalore
 
Liberty university coms 560 final paper
Liberty university coms 560 final paperLiberty university coms 560 final paper
Liberty university coms 560 final paper
 
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...
A Real Time Online Delphi Decision System, V 2.0: Crisis Management Support d...
 

Similar to Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista

AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...
AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...
AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...ijseajournal
 
Distributed Agile Development: Practices for building trust in team through E...
Distributed Agile Development: Practices for building trust in team through E...Distributed Agile Development: Practices for building trust in team through E...
Distributed Agile Development: Practices for building trust in team through E...Waqas Tariq
 
Advances in Technology Project Management: Review of Open Source Software Int...
Advances in Technology Project Management: Review of Open Source Software Int...Advances in Technology Project Management: Review of Open Source Software Int...
Advances in Technology Project Management: Review of Open Source Software Int...Maurice Dawson
 
Ko tse06-developers behaviour
Ko tse06-developers behaviourKo tse06-developers behaviour
Ko tse06-developers behaviourPtidejPoly
 
Distributed agile and offshoring - antagonism or symbiosis?
Distributed agile and offshoring - antagonism or symbiosis?Distributed agile and offshoring - antagonism or symbiosis?
Distributed agile and offshoring - antagonism or symbiosis?Mindtree Ltd.
 
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...ijcsit
 
Understanding Continuous Design in F/OSS Projects
Understanding Continuous Design in F/OSS ProjectsUnderstanding Continuous Design in F/OSS Projects
Understanding Continuous Design in F/OSS ProjectsBetsey Merkel
 
JuliaThere is an art to projecting management. Knowing how to m.docx
JuliaThere is an art to projecting management. Knowing how to m.docxJuliaThere is an art to projecting management. Knowing how to m.docx
JuliaThere is an art to projecting management. Knowing how to m.docxtawnyataylor528
 
Technology Integration Pattern For Distributed Scrum of Scrum
Technology Integration Pattern For Distributed Scrum of ScrumTechnology Integration Pattern For Distributed Scrum of Scrum
Technology Integration Pattern For Distributed Scrum of ScrumIOSR Journals
 
Mining developer communication data streams
Mining developer communication data streamsMining developer communication data streams
Mining developer communication data streamscsandit
 
Exploiting semantics-in-collaborative-software-development-tasks
Exploiting semantics-in-collaborative-software-development-tasksExploiting semantics-in-collaborative-software-development-tasks
Exploiting semantics-in-collaborative-software-development-tasksDimitris Panagiotou
 
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEWHYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEWijseajournal
 
Evolution of social developer network in oss survey
Evolution of social developer network in oss surveyEvolution of social developer network in oss survey
Evolution of social developer network in oss surveyeSAT Publishing House
 
Conway corollary
Conway corollaryConway corollary
Conway corollaryInam Soomro
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...IJITE
 

Similar to Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista (20)

AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...
AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...
AGILE SOFTWARE ARCHITECTURE INGLOBAL SOFTWARE DEVELOPMENT ENVIRONMENT:SYSTEMA...
 
Distributed Agile Development: Practices for building trust in team through E...
Distributed Agile Development: Practices for building trust in team through E...Distributed Agile Development: Practices for building trust in team through E...
Distributed Agile Development: Practices for building trust in team through E...
 
Prezentation
PrezentationPrezentation
Prezentation
 
Advances in Technology Project Management: Review of Open Source Software Int...
Advances in Technology Project Management: Review of Open Source Software Int...Advances in Technology Project Management: Review of Open Source Software Int...
Advances in Technology Project Management: Review of Open Source Software Int...
 
Ko tse06-developers behaviour
Ko tse06-developers behaviourKo tse06-developers behaviour
Ko tse06-developers behaviour
 
Collaborative technologies
Collaborative technologiesCollaborative technologies
Collaborative technologies
 
Distributed agile and offshoring - antagonism or symbiosis?
Distributed agile and offshoring - antagonism or symbiosis?Distributed agile and offshoring - antagonism or symbiosis?
Distributed agile and offshoring - antagonism or symbiosis?
 
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...
An Sna-Bi Based System for Evaluating Virtual Teams: A Software Development P...
 
Understanding Continuous Design in F/OSS Projects
Understanding Continuous Design in F/OSS ProjectsUnderstanding Continuous Design in F/OSS Projects
Understanding Continuous Design in F/OSS Projects
 
JuliaThere is an art to projecting management. Knowing how to m.docx
JuliaThere is an art to projecting management. Knowing how to m.docxJuliaThere is an art to projecting management. Knowing how to m.docx
JuliaThere is an art to projecting management. Knowing how to m.docx
 
Groupware/CSCW
Groupware/CSCWGroupware/CSCW
Groupware/CSCW
 
Technology Integration Pattern For Distributed Scrum of Scrum
Technology Integration Pattern For Distributed Scrum of ScrumTechnology Integration Pattern For Distributed Scrum of Scrum
Technology Integration Pattern For Distributed Scrum of Scrum
 
Mining developer communication data streams
Mining developer communication data streamsMining developer communication data streams
Mining developer communication data streams
 
Exploiting semantics-in-collaborative-software-development-tasks
Exploiting semantics-in-collaborative-software-development-tasksExploiting semantics-in-collaborative-software-development-tasks
Exploiting semantics-in-collaborative-software-development-tasks
 
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEWHYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW
HYBRID PRACTICES IN GLOBAL SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW
 
Evolution of social developer network in oss survey
Evolution of social developer network in oss surveyEvolution of social developer network in oss survey
Evolution of social developer network in oss survey
 
DOC
DOCDOC
DOC
 
Conway corollary
Conway corollaryConway corollary
Conway corollary
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
 
Jharna Software
Jharna SoftwareJharna Software
Jharna Software
 

Recently uploaded

Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxGDSC PJATK
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IES VE
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 

Recently uploaded (20)

Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Cybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptxCybersecurity Workshop #1.pptx
Cybersecurity Workshop #1.pptx
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 

Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista

  • 1. Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista Christian Bird1 , Nachiappan Nagappan2 , Premkumar Devanbu1 , Harald Gall3 , Brendan Murphy2 1 University of California, Davis, USA 2 Microsoft Research 3 University of Zurich, Switzerland {cabird,ptdevanbu}@ucdavis.edu {nachin,bmurphy}@microsoft.com gall@ifi.uzh.ch Abstract ent in collocated teams such as delayed feedback, restricted communication, less shared project awareness, difficulty of It is widely believed that distributed software develop- synchronous communication, inconsistent development and ment is riskier and more challenging than collocated de- build environments, lack of trust and confidence between velopment. Prior literature on distributed development in sites, etc. [22]. While there are studies that have examined software engineering and other fields discuss various chal- the delay associated with distributed development and the lenges, including cultural barriers, expertise transfer dif- direct causes for them [12], there is a dearth of empirical ficulties, and communication and coordination overhead. studies that focus on the effect of distributed development We evaluate this conventional belief by examining the over- on software quality in terms of post-release failures. all development of Windows Vista and comparing the post- In this paper, we use historical development data from release failures of components that were developed in a dis- the implementation of Windows Vista along with post- tributed fashion with those that were developed by collo- release failure information to empirically evaluate the hy- cated teams. We found a negligible difference in failures. pothesis that globally distributed software development This difference becomes even less significant when control- leads to more failures. We focus on post-release failures ling for the number of developers working on a binary. at the level of individual executables and libraries (which We also examine component characteristics such as code we refer to as binaries) shipped as part of the operating sys- churn, complexity, dependency information, and test code tem and use the IEEE definition of a failure as “the inability coverage and find very little difference between distributed of a system of component to perform its required functions and collocated components. Further, we examine the soft- within specified performance requirements” [16]. ware process and phenomena that occurred during the Vista Using geographical and commit data for the developers development cycle and present ways in which the develop- that worked on Vista, we divide the binaries produced into ment process utilized may be insensitive to geography by those developed by distributed and collocated teams and ex- mitigating the difficulties introduced in prior work in this amine the distribution of post-release failures in both popu- area. lations. Binaries are classified as developed in a distributed manner if at least 25% of the commits came from locations other than where binary’s owner resides. We find that there 1. Introduction is a small increase in the number of failures of binaries writ- ten by distributed teams (hereafter referred to as distributed Globally distributed software development is an increas- binaries) over those written by collocated teams (collocated ingly common strategic response to issues such as skill binaries). However, when controlling for team size, the dif- set availability, acquisitions, government restrictions, in- ference becomes negligible. In order to see if only smaller, creased code size, cost and complexity, and other resource less complex, or less critical binaries are chosen for dis- constraints [5, 10]. In this paper, we examine develop- tributed development (which could explain why distributed ment that is globally distributed, but completely within Mi- binaries have approximately the same number of failures), crosoft. This style of global development within a single we examined many properties, but found no difference be- company is to be contrasted with outsourcing which in- tween distributed and collocated binaries. We present our volves multiple companies. It is widely believed that dis- methods and findings in this paper. tributed collaboration imposes many challenges not inher- In section 2 we discuss the motivation and background 1
  • 2. of this work and present a theory of distributed development 3,000 developers. in the context of the Windows Vista development process. 3. We examine many complexity and maintenance char- Section 3 summarizes related work, including prior empiri- acteristics of the distributed and collocated binaries cal quantitative, qualitative, and case studies as well as the- to check for inherent differences that might influence oretical papers. An explanation of the data and analysis as post-release quality. well as the quantitative results of this analysis is presented 4. Our study examines a project in which all sites in- in section 4. We discuss these results, compare with prior volved are part of the same company and have been work, and give possible explanations for them in section 5. using the same process and tools for years. Finally, we present the threats to the validity of our study in section 6 and conclude our paper with further avenues of There is a large body of theory describing the difficulties study in section 7. inherent in distributed development. We summarize them here. 2. Motivation and Contributions Communication suffers due to a lack of unintended and in- formal meetings [11]. Engineers do not get to know each Distributed software development is a general concept other on a personal basis. Synchronous communication be- that can be operationalized in various ways. Development comes less common due to time zone and language barriers. may be distributed along many types of dimensions and Even when communication is synchronous, the communi- have various distinctive characteristics [9]. There are key cation channels, such as conference calls or instant mes- questions that should be clarified when discussing a dis- saging are less rich than face to face and collocated group tributed software project. Who or what is distributed and meetings. Developers may take longer to solve problems at what level? Are people or the artifacts distributed? Are because they lack the ability to step into a neighboring of- people dispersed individually or dispersed in groups? fice to ask for help. They may not even know the correct In addition, it is important to consider the way that de- person to contact at a remote site. velopers and other entities are distributed. The distribu- Coordination breakdowns occur due to this lack of commu- tion can be across geographical, organizational, temporal, nication and lower levels of group awareness [4, 1]. When and stakeholder boundaries [15]. A scenario involving one managers must manage across large distances, it becomes company outsourcing work to another will certainly dif- more difficult to stay aware of each person’s task and how fer from multiple teams working within the same company. they are interrelated. Different sites often use different tools A recent special issue of IEEE Software focused on glob- and processes which can also make coordinating between ally distributed development, but the majority of the papers sites difficult. dealt with offshoring relationships between separate com- panies and outsourcing, which is likely very different from Diversity in operating environments may cause manage- distributed sites within the same company [2, 6, 7]. Even ment problems [1]. Often there are relationships between within a company, the development may or may not span the organization doing development and external entities organizational structure at different levels. Do geographi- such as governments and third party vendors. In a geo- cal locations span the globe, including multiple time zones, graphically dispersed project, these entities will be different languages, and cultures or are they simply in different cities based on location (e.g. national policies on labor practices of the same state or nation? may differ between the U.S. and India). We are interested in studying the effect of globally dis- Distance can reduce team cohesion in groups collaborating tributed software development within the same company remotely [22]. Eating, sharing an office, or working late because there are many issues involved in outsourcing that together to meet a deadline all contribute to a feeling of be- are independent of geographical distribution (e.g. expertise ing part of a team. These opportunities are diminished by finding, different process and an asymmetric relationship). distance. Our main motivation is in confirming or refuting the no- Organizational and national cultural barriers may com- tion that global software development leads to more failures plicate globally distributed work [5]. Coworkers must be within the context of our setting. aware of cultural differences in communication behaviors. To our knowledge, this is the first large scale distributed One example of a cultural difference within Microsoft be- development study of its kind. This study augments the cur- came apparent when a large company meeting was origi- rent body of knowledge and differs from prior studies by nally (and unknowingly) planned on a major national holi- making the following contributions: day for one of the sites involved. 1. We examine distributed development at multiple levels Based on these prior observations and an examination of of separation (building, campus, continent, etc.). the hurdles involved in globally distributed development we 2. We examine a very large scale software development expect that difficulties in communication and coordination effort, composed of over 4,000 binaries and nearly will lead to an increase in the number of failures in code pro-
  • 3. duce by distributed teams over code from collocated teams. nication via instant messaging and group chat rather than We formulate our testable hypothesis formally. telephones, better ways of identifying experts to ask for in- H1: Binaries that are developed by teams of engineers formation, and shared calendars and other presence aware- that are distributed will have more post-release failures ness tools to facilitate contact with remote colleagues. than those developed by collocated engineers. Panjer et al [23] observed a portion of a geographically We are also interested to see if the binaries that are dis- distributed team of developers for two months in a com- tributed differ from the collocated counterparts in any sig- mercial setting and interviewed seven developers from the nificant ways. It is possible that managers, aware of the team. Developers reported difficulty coordinating with dis- difficulties mentioned above, may choose to develop sim- tributed coworkers and sometimes assigned MRs based on pler, less frequently changing, or less critical software in location. They also created explicit relationships between a distributed fashion. We therefore present our second hy- MRs to capture and manage technical information. pothesis. Thanh et al [21] examined the effect of distributed devel- H2: Binaries that are distributed will be less complex, opment on delay between communications and time to res- experience less code churn, and have fewer dependencies olution of work items in the IBM’s Jazz project, which was than collocated binaries. developed by developers at five globally distributed sites. They categorized work items based on the number of dis- 3. Related Work tributed sites that contributed to their resolution and exam- ined the mean and median time to resolution and time be- tween comments on each work item. While Kruskal-Wallis There is a wealth of literature in the area of globally dis- tests showed a statistically significant difference in the times tributed software development. It has been the focus of mul- for items that were more distributed, the Kendall Tau corre- tiple special issues of IEEE Software, workshops at ICSE lations of time to resolution and time between comments and the International Conference on Global Software Engi- with number of sites was extremely low (below 0.1 in both neering. Here we survey important work in the area, includ- cases). This indicates that the effect of distributed collabo- ing both studies and theory of globally distributed work in ration does not have a strong effect. software development. There have been a number of experience reports for In [13], Herbsleb and Mockus formulate an empirical globally distributed software development projects at var- theory of coordination in software engineering and test hy- ious companies including Siemens [14], Alcatel [8], Mo- potheses based on this theory. They precisely define soft- torola [1], Lucent [11], and Philips [17]. ware engineering as requiring a sequence of choices for all of the decisions associated with a project. Each decision Effects on bug resolution constrains the project and future decisions in some way un- In an empirical study of globally distributed software de- til all choices have been made and the final product does velopment [12], Herbsleb and Mockus examined the time to or does not satisfy the requirements. It is therefore impor- resolution of Modification Requests (MRs) in two depart- tant that only feasible decisions (those which will lead to ments of Lucent working on distinct network elements for a project that does satisfy the requirements) be made. The a telecommunication system. They found that when an en- presented theory is used to develop testable hypotheses re- gineer was blocked on a task due to information needs, the garding productivity, measured as number of MRs resolved average delay was .9 days for same site information transfer per unit time. They find that people who are assigned work and 2.4 days if the need crossed site-boundaries. An MR is from many sources have lower productivity and that MRs classified as ”single-site” if all of contributors were resided that require work in multiple modules have a longer cycle at one site and ”distributed” otherwise. The average time time than those which require changes to just one. needed to complete an ”single-site” MR was 5 days versus Unlike the above papers, our study focuses on the ef- 12.7 for ”distributed”. When controlling for other factors fect of distributed development on defect occurrence, rather such as number of people working on an MR, how diffused than on defect resolution time. the changes are across the code base, size of the change, and Effects on quality and productivity severity, the effect of being distributed was no longer sig- nificant. They hypothesize that large and/or multi-module Diomidis Spinellis examined the effect of distributed de- changes are both more time consuming and more likely to velopment on productivity, code style, and defect density in involve multiple sites. In addition, these changes require the FreeBSD code base [25]. He measured the geographi- more people, which introduce delay. They conclude that cal distance between developers, the number of defects per distributed development indirectly introduces delay due to source file, as well as productivity in terms of number of correlated factors such as team size and breadth of changes lines committed per month. A correlation analysis showed required. that there is little, if any, relationship between geographic They also introduce practices that may mitigate the risks distance of developers and productivity and defect density. of distributed development. These include better commu- It should be noted that this is a study of open source soft-
  • 4. ware which is, by its very nature, distributed and has a very Development Locations Code Size Defect Density different process model from commercial software. (KLOC C/C++) ( defects/KLOC) Cusick and Prasad [6] examined the practices used by Beijing, China 57 0.7 Arlington Heights, US 54 0.8 Wolters Kluwer Corporate Legal Services when outsourc- Arlington Heights, US 74 1.3 ing software development tasks and present their model for Tokyo, Japan 56 0.2 deciding if a project is offshorable and how to manage it ef- Bangalore, India 165 0.5 fectively. Their strategies include keeping communication Singapore 45 0.1 channels open, using consistent development environments Adelaide, Australia 60 0.5 and machine configurations, bringing offshore project leads onsite for meetings, developing and using necessary infras- Table 1. Locations, code size, and defect density from tructure and tools, and managing where the control and do- Motorola’s 3G trial project for each site main expertise lies. They also point out that there are some drawbacks that are difficult to overcome and should be ex- • Reduce national and organizational cultural distance pected such as the need for more documentation, more plan- • Reduce temporal distance ning for meetings, higher levels of management overhead, and cultural differences. This work was based on an off- Nagappan et al investigated the influence of organiza- shoring relationship with a separate vendor and not collab- tional structure on software quality in Windows Vista [20]. oration between two entities within the same company. We They found a strong relationship between how development expect that the challenges faced in distributed development is distributed across the organizational structure and number may differ based on the type of relationship between dis- of post-release failures in binaries shipped with the operat- tributed sites. ing system. Along with other organizational measures, they Our study examines distributed development in the con- measured the level of code ownership by the organization text of one commercial entity, which differs greatly from that the binary owner belonged to, the number of organiza- both open source projects and outsourcing relationships. tions that contributed at least 10% to the binary, and the or- ganizational level of the person whose reporting engineers Issues and solutions perform more than 75% of the edits. Our paper comple- In his book on global software teams [4], Carmel cate- ments this study by examining geographically, rather than gorizes project risk factors into four categories that act as organizationally distributed development. centrifugal forces that pull global projects apart. These are: 4. Methods and Analysis • Loss of communication richness • Coordination breakdowns In this section we describe our methods of gathering data • Geographic dispersion for our study and the analysis methods used to evaluate our • Cultural Differences hypotheses regarding distributed development in Windows Vista. In 2001, Battin et al [1] discuss the challenges and their solutions relative to each of Carmel’s categories in a large 4.1. Data Collection scale project implementing the 3G Trial (Third Generation Cellular System) at Motorola. By addressing these chal- lenges in this project, they found that globally distributed Windows Vista is a large commercial software project in- software development did not increase defect density, and volving nearly 3,000 developers. It comprises over 3,300+ in fact, had lower defect density than the industrial aver- unique binaries (defined as individual files containing ma- age. Table 1 lists the various locations, the size of the code chine code such as executables or a libraries) with a source developed at those locations, and their defect density. They code base of over 60 MLOC. Developers were distributed summarize the key actions necessary for success with global across 59 buildings and 21 campuses in Asia, Europe and development in order of importance: North America. Finally, it was developed completely in- house without any outsourced elements. • Use Liaisons Our data is focused on 3 properties: code quality, ge- • Distribute entire things for entire lifecycle ographical location, and code ownership. Our measure of • Plan to accommodate time and distance code quality is post-release failures, since these matter most to end-users. These failures are recorded for the six months Carmel and Agarwal [5] present three tactics for allevi- following the release of Vista at the binary level as shown ating distance in global software development, each with in figure 1. examples, possible solutions, and caveats: Geographical location information for each software de- veloper at Microsoft is obtained from the people manage- • Reduce intensive collaboration ment software at the time of release to manufacturing of
  • 5. Windows Vista Collect product metrics 6 months to Software (commits, code churn, complexity, etc.) collect failures Windows Server 2003 – Vista Release SP1 Time Figure 1. Data collection timeline Vista. This data includes the building, campus, region, tinction is that it is easy to travel between buildings on the country, and continent information. While some develop- same campus by foot while travel between campuses even ers occasionally move, it is standard practice at Microsoft in the same city requires a vehicle. to keep a software engineer at one location during an entire Locality: We use localities to represent groups of campuses product cycle. Most of the 2,757 developers of Vista didn’t that are geographically close to each other. For instance, move during the observation period. the Seattle locality contains all of the campuses in western Finally we gathered the number of commits made by Washington. It’s possible to travel within a locality by car each engineer to each binary. We remove build engineers on day trips, but travel between localities often requires air from the analysis because their changes are broad, but not travel and multi-day trips. Also, all sites in a particular lo- substantive. Many files have fields that need to be updated cality operate in the same time zone, making coordination prior to a build, but the actual source code is not modified. and communication within a locality easier than between By combining this data with developer geographical data, localities. we determine the level of distribution of each binary and Continent: All of the locations on a given continent fall categorize these levels into a hierarchy. Microsoft practices into this category. We choose to group at the continent level a strong code ownership development process. We found rather than the country level because Microsoft has offices that on average, 49% of the commits for a particular binary in Vancouver Canada and we wanted those to be grouped can be attributed to one engineer. Although we are basing together with other west coast sites (Seattle to Vancouver is our analysis on data from the development phase, in most less than 3 hours by road). If developers are located in the cases, this is indicative of the distribution that was present same continent, but not the same region, then it is likely that during the design phase as well. cultural similarities exists, but they operate in different time We categorized the distribution of binaries into the fol- zones and rarely speak face to face. lowing geographic levels. Our reasoning behind this classi- World: Binaries developed by engineers on different conti- fication is explained below. nents are placed in this category. This level of geographical distribution means that face to face meetings are rare and Building: Developers who work in the same building (and synchronous communication such as phone calls or online often the same floor) will enjoy more face to face and infor- chats are hindered by time differences. Also, cultural and mal contact. A binary classified at the building level may language differences are more likely. have been worked on by developers on different floors of the same building. For every level of geographical dispersion there are more Cafeteria: Several buildings share a cafeteria. One cafete- than two entities from the lower level within that level. That ria services between one and five nearby buildings. Devel- is, Vista was developed in more that three continents, local- opers in different, but nearby buildings, may ”share meals” ities, etc. Each binary is assigned the lowest level in the together or meet by chance during meal times. In addi- hierarchy from which at least 75% of the commits were tion, the typically shorter geographical distance facilitates made. Thus, if engineers residing in one region make at impromptu meetings. least 75% of the commits for a binary, but there is no cam- Campus: A campus represents a group of buildings in one pus that accounts for 75%, then the binary is categorized at location. For instance, in the US, there are multiple cam- the region level. This threshold was chosen based on results puses. Some campuses reside in the same city, but the dis- of prior work on development distributed across organiza-
  • 6. North America Redmond Charlotte Asia cmroute.dll Bldg 40 Hyderabad John 67 Bldg AP2 Sal 2 Bldg 1 Tom 10 cmroute.cpp Joe 16 Zach 3 Bob 10 cmroute.h Sarah 9 Bldg 26 Cindy 8 Jack 2 Fred 7 Silicon Valley etc. Lucy 2 Erin 6 Bldg 5 Jeff 2 Bldg 30 Lynn 2 Matt 2 Figure 2. Commits to the library cmroute.dll. For clarity, location of anonymized developers is shown only in terms of continents, regions, and buildings. tional boundaries that is standardized across Windows [20]. Levels of distribution Figure 2 illustrates the geographic distribution of commits A B C D E to an actual binary (with names anonymized). To assess the sensitivity of our results to this selection and address any Building Cafeteria Campus Locality Continent World threats to validity we performed the analysis using thresh- olds of 60%, 75%, 90%, and 100% with consistently similar Figure 4. distribution levels in Windows Vista results. Note that whether a binary is distributed or not is orthog- Figure 3 illustrates the hierarchy and shows the propor- onal to the actual location where it was developed. For in- tion of binaries that fall into each category. Note that the stance, some binaries that are classified at the building level majority of binaries have over 75% of their commits com- were developed entirely in a building in Hyderabad, India ing from just one building. The reason that so few binaries while others were owned in Redmond, Washington. fall into the continent level is that the Unites States is the only country which contains multiple regions. Although the proportion of binaries categorized above the campus level is barely 10%, this still represents a sample of over 380 bina- ries; enough for a strong level of statistical power. World 5.9% We initially examined the number of binaries and distri- bution of failures for each level of our hierarchy. In addi- Continent 0.2% tion, we divided the binaries into ”distributed” and ”col- located” categories in five different ways using splits A Locality 5.6% through E as shown in figure 4 (e.g. split B categorizes building and cafeteria level binaries as collocated and the rest as distributed). This was performed to see if there is Campus 17% a level of distribution above which there is a noticeable in- Cafeteria 2.3% crease in failures. These categorizations are used to determine if there is a level of distributedness above which there is a significant Building increase in the number of failures. The results from analysis 68% of these dichotomized data sets were consistent in nearly all respects. We therefore present the results of the first data set and point out deviations between the data sets where they occurred. 4.2. Experimental Analysis In order to test our hypothesis about the difference in Figure 3. Hierarchy of distribution levels in Windows code quality between distributed and collocated develop- Vista
  • 7. ment, we examined the distribution of the number of post- Model 2 F Statistic = 720.74, p < .0005 release failures per binary in both populations. Figure 5 Variable % increase Std Err. Significance shows histograms of the number of bugs for distributed (Constant) 0.25 p < .0005 and non-distributed binaries. Absolute numbers are omit- distributed 4.6% 0.25 p = .056 ted from the histograms for confidentiality, but the horizon- numdevs 0.00 p < .0005 tal and vertical scales are the same for both histograms. A visual inspection indicates that although the mass is differ- ent, with more binaries categorized as collocated than dis- efficient for all models were below 17%, and dropped even tributed, the distribution of failures are very similar. further to below 9% when controlling for number of devel- A Mann-Whitney test was used to quantitatively measure opers (many were below this value, but the numbers cited the difference in means because the number of failures was are upper bounds). In addition, the effect of distributed in not normally distributed [18]. The difference in means is models that accounted for the number of developers was statistically significant, but small. While the average num- only statistically significant when dividing binaries at the ber of failures per binary is higher when the binary was dis- continents level. tributed, the actual magnitude of the increase is only about We also used linear regression to examine the effect of 8%. In a prior study by Herbsleb and Mockus [12], time the level of distribution on the number of failures of a bi- to resolution of modification requests was positively corre- nary. Since the level of distribution is a nominal variable lated with the level of distribution of the participants. After that can take on six different values, we encode it into five further analysis, they discovered that the level of distribu- binary variables. The variable diff buildings is 1 if the bi- tion was not significant when controlling for the number of nary was distributed among different buildings that all were people participating. We performed a similar analysis on served by the same cafeteria and 0 otherwise. A value of 1 ourWe used linear regression to examine the effect of dis- data. for diff cafeterias means that the binary was developed by tributed development on number of failures. Our initial engineers who were served by multiple cafeterias, but were model contained only the binary variable indicating whether located on the same campus, etc. The percentage increase or not the binary was distributed. The number of develop- for each diff represents the increase in failures relative to bi- ers working on a binary was then added to the model and naries that are developed by engineers in the same building. we examined the coefficients in the model and amount of variance in failures explained by the predictor variables. In Model 3 F Statistic = 25.48, p < .0005 these models, distributed is a binary variable indicating if Variable % increase Std Err. Significance the binary is distributed and numdevs is the number of de- (Constant) 0.09 p < .0005 velopers that worked on the binary. We show here the re- diff buildings 15.1% 0.50 p < .0005 sults of analysis when splitting the binaries at the regions diff cafeterias 16.3% 0.21 p < .0005 level. The F-statistic and p value show how likely the null diff campuses 12.6% 0.35 p < .0005 hypothesis (the hypothesis that the predictor variable has no diff localities 2.6% 1.47 p = .824 effect on the response variable) is. We give the percentage diff continents -5.1% 0.31 p = .045 increase in failures when the binaries are distributed based on the parameter values. As numdevs is only included in The parameter estimates of the model indicate that bi- the models to examine effect of distribution when control- naries developed by engineers on the same campus served ling for number of developers we do not include estimates by different cafeterias have, on average, 16% more post- or percentage increase. release failures than binaries developed in the same build- In models 1 - 4 we give the percentage increase in fail- ing. Interestingly, the change in number of failures is quite ures when the binaries are distributed based on the parame- low for those developed in multiple regions and continents. ter values also. However, when controlling for development team size, only binaries categorized at the levels of different cafeterias and Model 1 F Statistic = 12.43, p < .0005 different campuses show a statistically significant increase Variable % increase Std Err. Significance in failures over binaries developed in the same building. (Constant) 0.30 p < .0005 Even so, the actual effects are relatively minor (4% and 6% distributed 9.2% 0.31 p < .0005 respectively). Two important observations can be made from these This indicates that on average, a distributed binary has models. The first is that the variance explained by the pre- 9.2% more failures than a collocated binary. However, the dictor variables (as measured in the adjusted R2 value) for result changes then controlling for the number of developers the built models rises from 2% and 4% (models 1 and 3) working on a binary. to 33% (models 2 and 4) when adding the number of de- We performed this analysis on all five splits of the bina- velopers. The second is that when controlling for the num- ries as shown in figure 4. The estimates for distributed co- ber of developers, not all levels of distribution show a sig-
  • 8. Post-Release Failures Distributed Collocated Number of Binaries Failures Failures Figure 5. Histograms of the number of failures per binary for distributed (left) and collocated (right) binaries. Although numbers are not shown on the axes, the scales are the same in both histograms. Model 4 F Statistic = 242.73, p < .0005 about which binaries should be developed in a distributed Variable % increase Std Err. Significance manner. For instance, prior work has shown that the num- (Constant) 0.09 p < .0005 ber of failures is highly correlated with code complexity and diff buildings 2.6% 0.42 p = .493 number of dependencies [19, 26]. Therefore, it is possible diff cafeterias 3.9% 0.18 p = .016 that in an effort to mitigate the perceived dangers of dis- diff campuses 6.3% 0.29 p = .019 tributed development, only less complex binaries or those diff localities 8.3% 1.23 p = .457 with less dependencies were chosen. diff continents -3.9% 0.26 p = .101 We also gathered metrics for each of the binaries in an numdevs 0.00 p < .0005 attempt to determine if there is a difference in the nature of binaries that are distributed. These measures fall into 5 broad categories. nificant effect, but the increase in post-release failures for those that do is minimal with values at or below 6%. To put Size & Complexity: Our code size and complexity mea- this into perspective, a binary with 4 failures if collocated sures include number of independent paths through the would have 4.24 failures if distributed. Although our re- code, number of functions, classes, parameters, blocks, sponse variable is different from Herbsleb and Mockus, our lines, local and global variables, and cyclomatic complex- findings are consistent with their result that when control- ity. From the call graph we extract the fan in and fan out ling for the number of people working on a development of each function. For object oriented code we include mea- task, distribution does not have a large effect. Based on sures of class coupling, inheritance depth, the number of these results, we are unable to reject the null hypothesis and base classes, subclasses and class methods, and the number H1 is not confirmed. of public, protected, and private data members and methods. This leads to the surprising conclusion that in the context All of these are measured as totals for the whole binary and in which Windows Vista was developed, teams that were as maximums on a function or class basis as applicable. distributed wrote code that had virtually the same number Code Churn: As measures of code churn we examine the of post-release failures as those that were collocated. change in size of the binary, the frequency of edits and the churn size in terms of lines removed, added, and modified 4.3. Differences in Binaries from the beginning of Vista development to RTM. Test Coverage: The number of blocks and arcs as well as One possible explanation for this lack of difference in the block coverage and arc coverage are recorded during the failures could be that distributed binaries are smaller, less testing cycle for each binary. complex, have fewer dependencies, etc. Although the num- Dependencies: Many binaries have dependencies on one ber of failures changes only minimally when the binaries are another (in the form of method calls, data types, registry distributed, we are interested in the differences in character- values that are read or written, etc.). We calculate the num- istics between distributed and non-distributed binaries. This ber of direct incoming and outgoing dependencies as well was done to determine if informed decisions were made as the transitive closer of these dependencies. The depth in
  • 9. the dependency graph is also recorded. in culture and business context. These observations come People: We include a number of statistics on the people and from discussions from management as well as senior and organizations that worked on the binaries. These include all experienced employees at many of the development sites. of the metrics in our prior organizational metrics paper [20] Relationship Between Sites: Much of the work on such as the number of engineers that worked on the binary. distributed development examines outsourcing relation- We began with a manual inspection of the 20 binaries ships [7, 2]. Other work has looked at strategic partner- with the least and 20 binaries with the most number of post- ships between companies or scenarios in which a foreign release failures in both the distributed and non-distributed remote site was recently acquired [11]. All of these create categories and examined the values of the metrics described situations where there the relationships are asymmetric and above. The only discernible differences were metrics rela- engineers at different sites may feel competitive or may for tive to the number of people working on the code, such as other reasons be less likely to help each other. In our situ- team size. ation, each of the sites has existed for a long time and has Metric Avg Value Correlation Significance worked on software together for many years. There is no Functions 895.86 0.114 p < .0005 Complexity 4603.20 0.069 p < .0005 threat that if one site performs better, the other will be shut Churn Size 53430 0.057 p = .033 down. The pay scale and benefits to employees are equiva- Edits 63.82 0.134 p < .0005 lent at all sites in the company. Indegree 13.04 -0.024 p = .363 Cultural Barriers: In a study of distributed development Outdegree 9.67 0.100 p < .0005 within Lucent at sites in Great Britain and Germany, Herb- Number of Devs 21.55 0.183 p < .0005 sleb and Grinter [11] found that significant national cultural We evaluated the effect of these metrics on level of dis- barriers existed. These led to a lack of trust between sites tribution in the entire population by examining the spear- and misinterpreted actions due to lack of cultural context. man rank correlation of distribution level of binaries (not This problem was alleviated when a number of engineers limited to the ”top 20” lists) with the code metrics. Most (liaisons) from one site visited another for an extended pe- metrics had correlation levels below 0.1 and the few that riod of time. Battin et al. [1] found that when people from were above that level, such as number of engineers never different sites spent time working together in close prox- exceeded 0.25. Logistic regression was used to examine imity, many issues such as trust, perceived ability, delayed the relationship of the development metrics with distribu- response to communication requests, etc. were assuaged. tion level. The increase in classification accuracy between a A similar strategy was used during the development of naive model including no independent variables and a step- Vista. Development occurred mostly in the US (predom- wise refined model with 15 variables was only 4%. When inantly in Redmond) and Hyderabad, India. In the initial removing data related to people that worked on the source, development phases, a number of engineers and executives the refined model’s accuracy only improved 2.7% from the left Redmond to work at the Indian site. These people had a naive model. We include the average values for a repre- long history, many with 10+ years within Microsoft. They sentative sample of the metrics along with a spearman rank understood the company’s development process and had correlation with the level of distribution for the binaries and domain expertise. In addition, the majority of these em- the significance of the correlation. Although the p-values ployees were originally from India, removing one key chal- are quite low, the magnitude of the correlation is small. lenge from globally distributed work. These people could This is attributable to the very large sample of binaries (over therefore act as facilitators, information brokers, recom- 3,000). menders, and cultural liaisons [5] and had already garnered All of these results lead to the conclusion that there is a high level of trust and confidence from the engineers in no discernible difference in the measured metrics between the US. Despite constituting only a small percent of the In- binaries that are distributed and those that aren’t. dian workforce, they helped to reduce both organizational and national cultural distances [5]. 5. Discussion Communication: Communication is the single most refer- enced problem in globally distributed development. Face We have presented an unexpected, but very encourag- to face meetings are difficult and rare and people are less ing result: it is possible to conduct in-house globalized dis- likely to communicate with others that they don’t know per- tributed development without adversely impacting quality. sonally. In addition, distributed sites are more likely to It is certainly important to understand why this occurred, use asynchronous communication channels such as email and how this experience can be repeated in other projects which introduce a task resolution delay [24]. A prior study and contexts. To prime this future endeavor, in this section, of global software servicing within Microsoft has examined we make some observations concerning pertinent practices the ways in which distributed developers overcome com- that could have improved communication, co-ordination, munication difficulties and information needs through the team cohesion, etc., and reduced the impact of differences use of knowledge bases and requesting help from local and
  • 10. remote personnel [3]. dispersed developers to be more integrated into the com- The Vista developers made heavy use of synchronous pany and the project. The manager can act as a facilitator communication daily. Employees took on the responsibility between engineers who may be familiar with one another of staying at work late or arriving early for a status confer- and can also spot problems due to poor coordination ear- ence call on a rotating basis, changing the site that needed to lier than in an organizational structure based on geography keep odd hours every week. Keeping in close and frequent where there is little coupling between sites. Prior work has contact increases the level of awareness and the feeling of shown that organizationally distributed development dra- ”teamness” [1, 5]. This also helps to convey status and re- matically affects the number of post-release defects [20]. solve issues quickly before they escalate. In addition, En- This organizational integration across geographic bound- gineers also regularly traveled between remote sites during aries reconciles the results of that work with the conclusions development for important meetings. reached in this study. In addition, because the same process Consistent Use of Tools: Both Battin [1] and Herbsleb [12] has been used in all locations of the company for some time, cite the importance of the configuration management tools organizational culture is fairly consistent across geography. used. In the case of Motorola’s project, a single, distributed configuration management tool was used with great suc- 6. Threats to Validity cess. At Lucent, each site used their own management tools, which led to an initial phase of rapid development at the Construct Validity cost of very cumbersome and inefficient integration work The data collection on a system the size of Windows towards the end. Microsoft employs the use of one config- Vista is automated. Metrics and other data were collected uration management and build system throughout all of its using production level quality tools and we have no reason sites. Thus every engineer is familiar with the same source to believe that there were large errors in measurement. code management tools, development environment, docu- Internal Validity mentation method, defect tracking system, and integration In section 5 we listed observations about the distributed process. In addition, the integration process for code is in- development process used at Microsoft. While we have rea- cremental, allowing problems to surface before making it son to believe that these alleviate the problems associated into a complete build of the entire system. with distributed development, a causal relationship has not End to End Ownership: One problem with distributed de- been empirically shown. Further study is required to deter- velopment is distributed ownership. When an entity fails, mine to what extent each of these practices actually helps. needs testing, or requires a modification, it may not be clear In addition, although we attempted an exhaustive search of who is responsible for performing the task or assigning the differences in characteristics between distributed an collo- work. Battin mentions ownership of a component for the cated binaries, it’s possible that they differ in some way not entire lifecycle as one of three critical strategies when dis- measured by our analysis in section 4.3. tributing development tasks. While there were a number External Validity of binaries that were committed to from different sites dur- It is unclear how well our results generalize to other sit- ing the implementation phase of Vista, Microsoft practices uations. We examine one large project and there is a dearth strong code ownership. Thus, one developer is clearly ”in of literature that examines the effect of distributed devel- control” of a particular piece of code from design, through opment on post-release failures. We have identified simi- implementation, and into testing and maintenance. Effort larities in Microsoft’s development process with other suc- is made to minimize the number of ownership changes that cessful distributed projects, which may indicate important occur during development. principles and strategies to use. There are many ways in Common Schedules: All of the development that we ex- which distributed software projects may vary and the par- amined was part of one large software project. The project ticular characteristics must be taken into account. For in- was not made up of distributed modules that shipped sepa- stance, we have no reason to expect that a study of an out- rately. Rather Vista had a fixed release date for all parties sourced project would yield the same results as ours. It is and milestones were shared across all sites. Thus all engi- also not clear how these results relate to defect fixing in the neers had a strong interest in working together to accom- distributed world as mentioned in prior work[3]. plish their tasks within common time frames. Organizational Integration: Distributed sites in Microsoft 7. Conclusion do not operate in organizational isolation. There is no top level executive at India or China that all the engineers in In our study we divide binaries based on the level of ge- those locations report to. Rather, the organizational struc- ographic dispersion of their commits. We studied the post- ture spans geographical locations at low levels. It is not un- release failures for the Windows Vista code base and con- common for engineers at multiple sites may to have a com- cluded that distributed development has little to no effect. mon direct manager. This, in turn, causes geographically We posit that this negative result is a significant finding as
  • 11. it refutes, at least in the context of Vista development, con- [13] J. D. Herbsleb and A. Mockus. Formulation and prelimi- ventional wisdom and widely held beliefs about distributed nary test of an empirical theory of coordination in software development. When coupled with prior work, [1, 12] our engineering. In ESEC/FSE-11: Proceedings of 11th ACM results support the conclusion that there are scenarios in SIGSOFT International Symposium on Foundations of Soft- which distributed development can work for large software ware Engineering, pages 138–137, Helsinki, Finland, 2003. [14] J. D. Herbsleb, D. J. Paulish, and M. Bass. Global software projects. Based on earlier work [20] our study shows that development at siemens: experience from nine projects. In Organizational differences are much stronger indicators of Proceedings of the 27th International Conference on Soft- quality than geography. An Organizational compact but ge- ware Engineering, pages 524–533. ACM, 2005. ographically distributed project might be better than an ge- [15] H. Holmstrom, E. Conchuir, P. Agerfalk, and B. Fitzgerald. ographically close organizationally distributed project. We Global software development challenges: A case study on have presented a number of observations about the devel- temporal, geographical and socio-cultural distance. Pro- opment practices at Microsoft which may mitigate some of ceedings of the IEEE International Conference on Global the hurdles associated with distributed development, but no Software Engineering (ICGSE’06)-Volume 00, pages 3–11, causal link has been established. There is a strong simi- 2006. larity between these practices and those that have worked [16] IEEE. IEEE Standard 982.2-1988, IEEE Guide for the Use of IEEE Standard Dictionary of Measures to Produce Reli- for other teams in the past [1] as well as solutions proposed able Software. 1988. in other work [11]. Directly examining the effects of these [17] R. Kommeren and P. Parviainen. Philips experiences in practices is an important direction for continued research in global distributed software development. Empirical Soft- globally distributed software development. ware Engineering, 12(6):647–660, 2007. [18] H. B. Mann and W. D. R. On a test of whether one of two References random variables is stochastically larger than the other. An- nals of Mathematical Statistics, 18(1):50–60, 1947. [19] N. Nagappan, T. Ball, and A. Zeller. Mining metrics to pre- [1] R. D. Battin, R. Crocker, J. Kreidler, and K. Subramanian. dict component failures. In Proceedings of the International Leveraging resources in global software development. IEEE Conference on Software Engineering, pages 452–461, 2006. Software, 18(2):70–77, March/April 2001. [20] N. Nagappan, B. Murphy, and V. Basili. The influence of [2] J. M. Bhat, M. Gupta, and S. N. Murthy. Overcoming organizational structure on software quality: An empirical requirements engineering challenges: Lessons from off- case study. In ICSE ’08: Proceedings of the 30th interna- shore outsourcing. IEEE Software, 23(6):38–44, Septem- tional conference on Software engineering, pages 521–530, ber/October 2006. Leipzig, Germany, 2008. [3] S. Bugde, N. Nagappan, S. Rajamani, and G. Ramalingam. [21] T. Nguyen, T. Wolf, and D. Damian. Global software devel- Global software servicing: Observational experiences at mi- opment and delay: Does distance still matter? In Proceed- crosoft. In IEEE International Conference on Global Soft- ings of the International Conference on Global Software En- ware Engineering, 2008. gineering, 2008. [4] E. Carmel. Global Software Teams: Collaborating Across [22] G. M. Olson and J. S. Olson. Distance matters. Human- Borders and Time Zones. Prentice Hall, 1999. Computer Interaction, 15(2/3):139–178, 2000. [5] E. Carmel and R. Agarwal. Tactical approaches for allevi- [23] L. D. Panjer, D. Damian, and M.-A. Storey. Cooperation ating distance in global software development. IEEE Soft- and coordination concerns in a distributed software devel- ware, 2(18):22–29, March/April 2001. opment project. In Proceedings of the 2008 International [6] J. Cusick and A. Prasad. A practical management and engi- Workshop on Cooperative and Human Aspects of Software neering approach to offshore collaboration. IEEE Software, Engineering, pages 77–80. ACM, 2008. 23(5):20–29, September/October 2006. [7] K. C. Desouza, Y. Awaza, and P. Baloh. Managing knowl- [24] M. Sosa, S. Eppinger, M. Pich, D. McKendrick, S. Stout, edge in global software development efforts: Issues and T. Manage, and F. Insead. Factors that influence techni- practices. IEEE Software, 23(5):30–37, Sept/Oct 2006. cal communication in distributed product development: an [8] C. Ebert and P. D. Neve. Surviving global software develop- empirical study in the telecommunications industry. Engi- ment. IEEE Software, 18(2):62–69, 2001. neering Management, IEEE Transactions on, 49(1):45–58, [9] D. C. Gumm. Distribution dimensions in software develop- 2002. ment projects: a taxonomy. IEEE Software, 23(5):45–51, [25] D. Spinellis. Global software development in the freebsd 2006. project. In GSD ’06: Proceedings of the 2006 international [10] J. Herbsleb. Global Software Engineering: The Future of workshop on Global software development for the practi- Socio-technical Coordination. International Conference on tioner, pages 73–79, Shanghai, China, 2006. Software Engineering, pages 188–198, 2007. [26] T. Zimmermann and N. Nagappan. Predicting defects using [11] J. Herbsleb and R. Grinter. Architectures, coordination, network analysis on dependency graphs. In Proceedings of and distance: Conway’s law and beyond. IEEE Software, the 30th International Conference on Software Engineering, 16(5):63–70, 1999. pages 531–540. ACM, 2008. [12] J. Herbsleb and A. Mockus. An empirical study of speed and communication in globally distributed software de- velopment. IEEE Transactions on Software Engineering, 29(6):481–94, 2003.