Noise and Heterogeneity in
Historical Build Data:
An Empirical Study of Travis CI
Keheliya Gallaba (@keheliya, keheliya.github.io)
Christian Macho (@Mitschiiii, mitschi.github.io)
Martin Pinzger (@pinzger, pinzger.github.io)
Shane McIntosh (@shane_mcintosh, shanemcintosh.org)
Automated builds check the impact of changes on the software product
Source Code -> Build System -> Deliverables
2
Build outcome data is used to solve software engineering research problems
For understanding and predicting build breakage
For measuring the build breakage rate
For communicating the current build status
3
Build outcome
data is
nuanced!
allow_failure enables experimentation with support for a new platform.
4
Can the off-the-shelf
historical CI build data be
trusted?
The zdavatz/spreadsheet
project has had the
allow_failure feature enabled
for the entire lifetime of the project!
5
Are build
outcomes
free of
noise?
Are build
outcomes
homogeneous?
6
We study 680,209 Travis CI builds spanning 1,276
open source projects
We follow Mockus' four-step procedure
7
Are build
outcomes
free of
noise?
8
We look for passing builds with actively ignored failures
9
680,209 builds
-> Select passing builds: 496,204 builds
-> Select builds with failing jobs: 59,904 builds
-> Check if the allow_failure property is enabled for the failing jobs in .travis.yml
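The selection steps above can be sketched in a few lines. This is a minimal illustration, not the study's tooling: the `Job` and `Build` records and their field names are hypothetical stand-ins for build metadata, and `allow_failure` is assumed to have already been resolved per job from the project's .travis.yml.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    state: str           # assumed values: "passed" or "failed"
    allow_failure: bool  # assumed: resolved per job from .travis.yml


@dataclass
class Build:
    state: str           # overall build outcome
    jobs: List[Job] = field(default_factory=list)


def has_actively_ignored_failure(build: Build) -> bool:
    """True for a passing build with at least one failing, allowed-to-fail job."""
    if build.state != "passed":
        return False
    return any(j.state == "failed" and j.allow_failure for j in build.jobs)


def actively_ignored_share(builds: List[Build]) -> float:
    """Share of passing builds that contain an actively ignored failure."""
    passing = [b for b in builds if b.state == "passed"]
    if not passing:
        return 0.0
    return sum(has_actively_ignored_failure(b) for b in passing) / len(passing)


# Hypothetical toy data: one clean pass, one pass hiding a failure, one break.
builds = [
    Build("passed", [Job("passed", False)]),
    Build("passed", [Job("passed", False), Job("failed", True)]),
    Build("failed", [Job("failed", False)]),
]
share = actively_ignored_share(builds)
```

Only passing builds enter the denominator, which mirrors the "12% of passing builds" phrasing used on the next slide.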
Passing build outcomes do not always indicate that
the build was entirely clean
12% of passing
builds have an
actively ignored
failure.
Up to 87% of the
jobs are actively
ignored.
10
Passively ignored breakages may introduce noise
when all breakages are assumed to be distracting
11
680,209 builds
-> Build filtering: 610,550 builds
-> Graph construction using version control data
-> Graph analysis
Long breakage sequences may
mean developers passively ignored
failures by not immediately fixing
them.
In some cases, builds can remain broken for 423 days
The overall median length of a failure sequence is five commits.
12
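As a rough illustration of how breakage sequence lengths could be measured, the sketch below assumes a simplified, strictly linear history in which each commit has exactly one build outcome. The study itself builds a graph from version control data, so this is a simplification, and the outcome strings are assumptions.

```python
from itertools import groupby
from statistics import median
from typing import List


def breakage_sequences(outcomes: List[str]) -> List[int]:
    """Lengths of each maximal run of consecutive broken builds."""
    return [
        sum(1 for _ in run)
        for broken, run in groupby(outcomes, key=lambda o: o == "failed")
        if broken
    ]


# Hypothetical outcome history, oldest build first.
history = ["passed", "failed", "failed", "passed",
           "failed", "failed", "failed", "failed", "failed"]
lengths = breakage_sequences(history)
typical = median(lengths)
```

A long run suggests that the breakage was not fixed immediately, which is the sense in which a failure is passively ignored.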
One of the reasons for ignoring a build breakage:
Staleness
13
Developers may become
desensitized to stale* breakages.
*A breakage is stale if the project has encountered
the same breakage in the past.
14
We measure staleness in Maven build breakages
Maven Build Log -> Maven Log Analyzer
Build fails due to the same reason as a prior failure?
  NO -> Not Stale Breakage
  YES -> Failure details are equal to a prior failure?
    NO -> Not Stale Breakage
    YES -> Stale Breakage
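The two-question check above could be sketched as follows. It assumes each breakage has already been reduced to a `(reason, details)` pair, for example by a log parser, and it reads the flowchart as requiring both to match the same prior failure. Both points are assumptions, not something the slides specify.

```python
from typing import List, Tuple


def label_stale(breakages: List[Tuple[str, str]]) -> List[bool]:
    """For each breakage in chronological order, True if an earlier
    breakage in the same project had the same reason and details."""
    seen = set()
    labels = []
    for reason, details in breakages:
        key = (reason, details)
        labels.append(key in seen)  # stale only if seen before
        seen.add(key)
    return labels


# Hypothetical breakages of one project, oldest first.
breakages = [
    ("compilation", "Foo.java:10"),
    ("test", "BarTest"),
    ("compilation", "Foo.java:10"),  # repeats the first breakage
    ("compilation", "Baz.java:3"),   # same reason, different details
]
labels = label_stale(breakages)
```

The last entry shows why the second question matters: a matching reason alone does not make a breakage stale.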
Two of every three build breakages (67%) that we
analyze are stale
15
We propose
Signal-To-Noise Ratio to
quantify the proportion
of noise
16
               | Has Ignored Breakages | No Ignored Breakages
Broken Builds  | False Build Breakages | True Build Breakages
Passing Builds | False Build Successes | True Build Successes
Noise: builds with ignored breakages. Signal: builds with no ignored breakages.
One in every 7 to 11 builds (9%-14%) is incorrectly labelled
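A small sketch of how such a ratio could be computed from the four cells. It assumes the ratio is the count of correctly labelled builds (true breakages plus true successes) divided by the count of incorrectly labelled builds (false breakages plus false successes). The slides do not spell out the formula, and the counts below are invented for illustration only.

```python
def signal_to_noise(true_breakages: int, true_successes: int,
                    false_breakages: int, false_successes: int) -> float:
    """Assumed definition: correctly labelled builds per incorrectly labelled build."""
    signal = true_breakages + true_successes
    noise = false_breakages + false_successes
    if noise == 0:
        raise ValueError("no noisy builds: ratio is undefined")
    return signal / noise


def mislabelled_share(true_breakages: int, true_successes: int,
                      false_breakages: int, false_successes: int) -> float:
    """Fraction of all builds that carry an incorrect label."""
    signal = true_breakages + true_successes
    noise = false_breakages + false_successes
    return noise / (signal + noise)


# Hypothetical counts, not taken from the study.
snr = signal_to_noise(20, 70, 4, 6)
share = mislabelled_share(20, 70, 4, 6)
```

Under this reading, "one in every 7 to 11 builds" corresponds to a mislabelled share between 1/11 (about 9%) and 1/7 (about 14%).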
17
Noise may influence analyses
based on build outcome data
18
Passing build outcomes do not
always indicate that the build was
entirely clean
Build breakages can persist for up
to 485 commits (423 days)
67% of build breakages we analyze
are stale
9%-14% of builds are incorrectly
labelled
Are build
outcomes
homogeneous?
19
Computing the Matrix Breakage Purity (MBP)
MBP < 1: Environment-specific breakages
MBP = 1: Environment-agnostic breakages
20
Environment-specific breakage is commonplace
21
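The slides give only the two MBP cases, not the formula. The sketch below assumes MBP is the fraction of jobs in a broken build's matrix that failed, so that MBP = 1 means every environment broke. The function names and state strings are hypothetical.

```python
from typing import List


def matrix_breakage_purity(job_states: List[str]) -> float:
    """Assumed definition: failed jobs divided by all jobs in a broken build."""
    if not job_states:
        raise ValueError("a build needs at least one job")
    failed = sum(1 for s in job_states if s == "failed")
    return failed / len(job_states)


def breakage_kind(job_states: List[str]) -> str:
    """Classify a broken build; intended only for builds with a failing job."""
    if matrix_breakage_purity(job_states) == 1:
        return "environment-agnostic"
    return "environment-specific"


# Hypothetical job matrices for two broken builds.
all_fail = ["failed", "failed"]
one_fails = ["failed", "passed", "passed", "passed"]
kinds = [breakage_kind(all_fail), breakage_kind(one_fails)]
```

When only some jobs in the matrix fail, the breakage depends on the environment (for example a runtime version), which is the MBP < 1 case.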
Builds can break for various reasons
22
We extend Maven Log Analyzer to parse and classify broken
Maven build logs by type:
Compilation Failure
Test Failure
Dependency Resolution Failure
Deployment Failure
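To illustrate what classifying a build log by type might look like, here is a deliberately naive sketch. It is not the Maven Log Analyzer and does not reproduce its rules. The substrings are assumptions about messages that commonly appear in Maven output, and a real classifier would need far more robust parsing.

```python
import re
from typing import List, Pattern, Tuple

# Illustrative patterns only; each is an assumed marker for one category.
PATTERNS: List[Tuple[str, Pattern[str]]] = [
    ("Compilation Failure", re.compile(r"COMPILATION ERROR")),
    ("Test Failure", re.compile(r"There are test failures")),
    ("Dependency Resolution Failure", re.compile(r"Could not resolve dependencies")),
    ("Deployment Failure", re.compile(r"Failed to deploy artifacts")),
]


def classify_breakage(log_text: str) -> str:
    """Return the first category whose pattern occurs in the log."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "Unclassified"


# Hypothetical log excerpt.
sample = "[ERROR] COMPILATION ERROR :\n[ERROR] Foo.java:[10,5] cannot find symbol"
category = classify_breakage(sample)
```

Logs that match none of the patterns fall through to "Unclassified", which is where breakages outside the build tool would end up in this sketch.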
Maven Log Analyzer supports new
build breakage categories
23
Goal Failed:
Ant Inside Maven
Run System/Java Program
Run Jetty Server
Manage Ruby Gems
Polyglot for Maven
Broken Outside Maven:
No Log Available
Failed Before Maven
Travis Aborted
Failed After Maven
Travis Cancelled
Tool-specific breakage is rare.
24
41% of the broken builds failed due to problems
outside of Maven.
25
Build outcomes are heterogeneous
Environment-specific breakage is
commonplace
Tool-specific breakage is rare
Future automatic breakage
recovery techniques should tackle
issues in the CI scripts
Our observations have broader implications for
researchers and tool builders
26
For Research Community
Build outcome noise should be filtered out before analyses
Heterogeneity should be considered when training build outcome prediction models
For Tool Builders
Automatic breakage recovery should look beyond tool-specific insight
Richer information should be included in build outcome reports and dashboards
github.com/software-rebels/bbchch
@keheliya
