Continuous Integration (CI) and Continuous Delivery (CD) are widespread in both industrial and open-source software (OSS) projects. Recent research characterized build failures in CI and identified factors potentially correlated to them. However, most observations and findings of previous work are exclusively based on OSS projects or data from a single industrial organization. This paper provides a first attempt to compare the CI processes and occurrences of build failures in 349 Java OSS projects and 418 projects from a financial organization, ING Nederland. Through the analysis of 34,182 failing builds (26% of the total number of observed builds), we derived a taxonomy of failures that affect the observed CI processes. Using cluster analysis, we observed that in some cases OSS and ING projects share similar build failure patterns (e.g., few compilation failures as compared to frequent testing failures), while in other cases completely different patterns emerge. In short, we explain how OSS and ING CI processes exhibit commonalities, yet are substantially different in their design and in the failures they report.
A Tale of CI Build Failures: an Open Source and a Financial Organization Perspective
1. A Tale of CI Build Failures: an Open Source and a Financial Organization Perspective
Carmine Vassallo, Gerald Schermann, Fiorella Zampetti, Daniele Romano, Philipp Leitner, Andy Zaidman, Massimiliano Di Penta, Sebastiano Panichella
@ccvassallo vassallo@ifi.uzh.ch
2. Continuous Delivery is a software development discipline where you build software in such a way that the software can be released to production at any time. (Martin Fowler)
6. Relevance of Build Breakage
• On average, a build failure takes 57 min to fix.
• The overall cost of build failures ranged from 904.64 to 2034.92 man-hours (over a period of 6 months).
• The study monitored roughly 7200 man-hours.
N. Kerzazi, F. Khomh, and B. Adams, Why do automated builds break? An empirical study, in 30th IEEE International Conference on Software Maintenance and Evolution (ICSME), pp. 41–50, IEEE, 2014.
7. Build break types
OSS
• Failing tests are the dominant reason for unsuccessful builds (Rausch et al., MSR 2017) (Beller et al., MSR 2017)
Industry
• 40% of the failures occur during static analysis (Miller et al., AGILE 2008)
• Dependencies between components are the most relevant cause of compilation related failures (Seo et al., ICSE 2014)
8. What are the differences and commonalities in the distribution of build failure types occurring in OSS and industry?
10. RQ1 What types of failures affect builds of OSS and industrial projects?
RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
11. Data Selection
Industry (ING)
• 418 Maven (mostly Java) projects
• 12,871 builds, of which 3,390 (≈ 26%) failed
OSS
• 349 Maven (Java) projects
• 116,741 builds, of which 30,792 (≈ 26%) failed
12. Data Selection
Build failure logs were the only resources we could access.
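The ≈ 26% failure rates quoted for both data sets follow directly from the build counts. A short check, using only the numbers on the slide (the variable and function names are ours, chosen for illustration):

```python
# Build counts as reported on the Data Selection slide.
# The dictionary layout and names are illustrative assumptions.
datasets = {
    "ING": {"builds": 12_871, "failed": 3_390},
    "OSS": {"builds": 116_741, "failed": 30_792},
}


def failure_rate(failed: int, builds: int) -> float:
    """Share of failed builds, expressed as a percentage."""
    return 100.0 * failed / builds


rates = {name: failure_rate(d["failed"], d["builds"]) for name, d in datasets.items()}
total_failed = sum(d["failed"] for d in datasets.values())
total_builds = sum(d["builds"] for d in datasets.values())
overall = failure_rate(total_failed, total_builds)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}% of builds failed")
print(f"Overall: {total_failed} failing builds out of {total_builds} ({overall:.1f}%)")
```

The combined count of failing builds matches the 34,182 failing builds mentioned in the abstract.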
14. Data Preprocessing
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on project es-common:
Compilation failure
[ERROR] /home/travis/build/zhangkaitao/es/common/src/main/java/com/sishuok/es/common/utils/html/HTMLUtils.java:[14,8] class
HtmlUtils is public, should be declared in a file named HtmlUtils.java
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
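A log like the one above names the plugin goal that broke the build, which is a natural starting point for classifying the failure. The following is a minimal sketch of how that goal could be extracted, assuming the standard Maven "Failed to execute goal" line format; the regular expression and the `failed_goal` helper are our own illustration, not the authors' tooling.

```python
import re
from typing import Optional

# Matches Maven's failure line:
#   [ERROR] Failed to execute goal <group>:<plugin>:<version>:<goal> (<execution>) on project <name>
FAILED_GOAL = re.compile(
    r"\[ERROR\] Failed to execute goal "
    r"(?P<group>[^:\s]+):(?P<plugin>[^:\s]+):(?P<version>[^:\s]+):(?P<goal>[^:\s]+)"
    r"(?: \((?P<execution>[^)]*)\))? on project (?P<project>[^:\s]+)"
)


def failed_goal(log: str) -> Optional[dict]:
    """Return the parts of the first failed goal in a build log, or None if there is none."""
    match = FAILED_GOAL.search(log)
    return match.groupdict() if match else None


# First lines of the log shown on the slide.
sample = (
    "[ERROR] Failed to execute goal org.apache.maven.plugins:"
    "maven-compiler-plugin:3.0:compile (default-compile) on project es-common:\n"
    "Compilation failure\n"
)

info = failed_goal(sample)
print(info["plugin"], info["goal"], info["project"])
# prints: maven-compiler-plugin compile es-common
```

Here the failed goal is `compile` of the `maven-compiler-plugin`, which points to a compilation failure in production code.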
17. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
% of build failures per type (ING / OSS):
Clean: 0.0% / 0.0%
Validation: 0.0% / 0.5%
Preprocessing: 0.0% / 1.3%
Compilation (production): 4.2% / 7.1%
Compilation (test): 2.3% / 1.8%
Compilation (support): 0.0% / 0.2%
Testing (unit): 5.2% / 28.0%
Testing (integration): 13.3% / 5.0%
Testing (non functional): 2.7% / 0.0%
Testing (crosscutting): 18.3% / 8.3%
Packaging: 2.1% / 0.8%
Static Analysis: 16.4% / 4.2%
Dynamic Analysis: 0.0% / 0.2%
Deployment (Local): 0.4% / 0.0%
Deployment (Remote): 10.0% / 0.5%
Documentation: 0.3% / 0.9%
Release Preparation: 21.1% / 0.0%
Support: 0.0% / 0.9%
External Tasks: 8.8% / 1.4%
Dependencies: 6.3% / 7.1%
18. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
% of build failures (ING / OSS):
Compilation (production): 4.2% / 7.1%
Compilation (test): 2.3% / 1.8%
Compilation (support): 0.0% / 0.2%
Compilation errors are fairly limited.
19. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
% of build failures (ING / OSS):
Testing (unit): 5.2% / 28.0%
Testing (integration): 13.3% / 5.0%
Testing (non functional): 2.7% / 0.0%
OSS projects exhibit more unit than integration testing related failures. In industry it's the opposite.
Early discovery of non-functional failures in industry.
20. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
% of build failures (ING / OSS):
Static Analysis: 16.4% / 4.2%
Static analysis tools: on CI server in OSS, remotely in industry.
21. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
% of build failures (ING / OSS):
Deployment (Local): 0.4% / 0.0%
Deployment (Remote): 10.0% / 0.5%
Release Preparation: 21.1% / 0.0%
Release preparation and deployment failures are very common in industry, less so in OSS.
22. RQ2 How frequent are the different types of build failures in the observed OSS and industrial projects?
• Projects clustering (using the K-means algorithm)
• Optimal number of clusters according to the silhouette statistic: 6
• Each cluster is dominated either by industrial or OSS projects
• except Dependencies
[Figure: number of projects per cluster (CodeAnalysis, ReleasePreparation, Dependencies, CrosscuttingTesting, UnitTesting, Compilation), with the ING and OSS share of each cluster]
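The slide names K-means and the silhouette statistic but gives no implementation details. The sketch below shows the general technique in plain Python on made-up data: each hypothetical project is a vector of failure-type percentages (unit testing, release preparation, static analysis), and the number of clusters is chosen as the one with the highest mean silhouette. The data, the three features, and the deterministic farthest-point seeding are all assumptions made for illustration; the toy data has three groups, not the six clusters reported in the study.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def init_centers(points, k):
    """Deterministic farthest-point seeding (our assumption, not from the slides)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = init_centers(points, k)
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


# Hypothetical projects: (% unit testing, % release preparation, % static analysis).
rng = random.Random(42)


def blob(center, n=10, spread=2.0):
    return [tuple(c + rng.uniform(-spread, spread) for c in center) for _ in range(n)]


projects = blob((30.0, 3.0, 5.0)) + blob((5.0, 25.0, 5.0)) + blob((5.0, 3.0, 20.0))

scores = {k: silhouette(projects, kmeans(projects, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print("best k:", best_k)
```

On this toy data the silhouette peaks at the number of groups that were generated. On real project data the peak is usually less pronounced, which is one reason to inspect the resulting clusters as the slide does.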
23. Key findings
• Dependencies related failures occur with the same frequency in OSS and industry.
• In OSS projects a lot of failures are due to unit testing: try to catch those issues earlier!
• Need for a better release/deployment strategy in OSS.
• Static analysis on a separate server: well collected data and less overloading of the CI server.
• Towards early discovery of non functional testing failures.
24. @ccvassallo vassallo@ifi.uzh.ch
RQ1 What types of failures affect builds of OSS and industrial projects?
Taxonomy of build failure types (with subtypes):
• CLEAN
• VALIDATION
• PRE-PROCESSING (RESOURCES)
• COMPILATION: PRODUCTION, TEST, SUPPORT
• TESTING: UNIT TESTING, INTEGRATION TESTING, NON FUNCTIONAL TESTING, CROSSCUTTING
• PACKAGING
• CODE ANALYSIS: STATIC, DYNAMIC
• DEPLOYMENT: LOCAL, REMOTE
• DOCUMENTATION
• RELEASE PREPARATION
• SUPPORT
• EXTERNAL TASKS
• DEPENDENCIES
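The taxonomy can be written down as a small data structure, which makes it easy to check that it yields the 20 failure types used in the frequency charts. The sketch below also shows how a failed Maven goal might be mapped to a category. The taxonomy entries come from the slide; the goal-to-category table is a hypothetical example of ours and not the mapping used in the study.

```python
# Taxonomy from the slide: top-level category -> subtypes (empty list if none).
TAXONOMY = {
    "Clean": [],
    "Validation": [],
    "Pre-processing (resources)": [],
    "Compilation": ["Production", "Test", "Support"],
    "Testing": ["Unit", "Integration", "Non functional", "Crosscutting"],
    "Packaging": [],
    "Code analysis": ["Static", "Dynamic"],
    "Deployment": ["Local", "Remote"],
    "Documentation": [],
    "Release preparation": [],
    "Support": [],
    "External tasks": [],
    "Dependencies": [],
}

# Hypothetical mapping from (plugin, goal) to (category, subtype), for illustration only.
GOAL_TO_CATEGORY = {
    ("maven-clean-plugin", "clean"): ("Clean", None),
    ("maven-compiler-plugin", "compile"): ("Compilation", "Production"),
    ("maven-compiler-plugin", "testCompile"): ("Compilation", "Test"),
    ("maven-surefire-plugin", "test"): ("Testing", "Unit"),
    ("maven-failsafe-plugin", "integration-test"): ("Testing", "Integration"),
    ("maven-checkstyle-plugin", "check"): ("Code analysis", "Static"),
    ("maven-release-plugin", "prepare"): ("Release preparation", None),
}


def classify(plugin, goal):
    """Return a label such as 'Testing (Unit)', or None for an unmapped goal."""
    entry = GOAL_TO_CATEGORY.get((plugin, goal))
    if entry is None:
        return None
    category, subtype = entry
    # Guard against mapping entries that fall outside the taxonomy.
    if category not in TAXONOMY or (subtype and subtype not in TAXONOMY[category]):
        raise ValueError(f"{entry} is not part of the taxonomy")
    return f"{category} ({subtype})" if subtype else category


# A category without subtypes counts as one failure type.
failure_types = sum(max(1, len(subtypes)) for subtypes in TAXONOMY.values())

print(failure_types)
print(classify("maven-compiler-plugin", "compile"))
```

Goals that are not in the table return `None`, so unknown plugins are left unclassified instead of being forced into a category.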