The (r)evolution of CI/CD on GitHub
Promises and Perils of the GitHub Actions ecosystem
Tom Mens
Software Engineering Lab
March 2023
SECO-ASSIST
secoassist.github.io
Collaborative software development
Commits
Issues
Pull Requests
Comments
Code Reviews
Discussions
Project Management
...
Continuous Integration
Quality analysis · Build · Test · Deploy
GitHub Actions
Examples of CI/CD tools
Specifying GitHub Actions workflows
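Structure of GitHub Actions workflows (diagram):
• A repository contains workflows (workflow 1, workflow 2, workflow 3), stored in .github/workflows/. Workflows run in parallel.
• A workflow contains jobs (job 1, job 2, job 3). Jobs run in parallel by default, or sequentially. A job may declare a strategy.
• A job contains steps (step 1, step 2, step 3). Steps run sequentially.
• A step is either uses: (action) or run: (shell cmd).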
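The repository / workflow / job / step hierarchy can be sketched as a small data model. This is an illustrative sketch only: the class names, the example workflow path and the job names are hypothetical, and the `needs` field is assumed to mirror the `needs` key that GitHub Actions uses to make one job wait for another.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Step:
    """One step of a job: it either reuses an Action or runs a shell command."""
    uses: Optional[str] = None  # e.g. "actions/checkout@v3"
    run: Optional[str] = None   # e.g. "npm test"

    def kind(self) -> str:
        # A step must define exactly one of the two keys.
        if (self.uses is None) == (self.run is None):
            raise ValueError("a step needs exactly one of 'uses' or 'run'")
        return "uses" if self.uses is not None else "run"


@dataclass
class Job:
    """A job: its steps run sequentially, in list order."""
    name: str
    steps: list[Step] = field(default_factory=list)
    needs: list[str] = field(default_factory=list)  # jobs to wait for


@dataclass
class Workflow:
    """A workflow file stored under .github/workflows/."""
    path: str
    jobs: list[Job] = field(default_factory=list)

    def execution_waves(self) -> list[list[str]]:
        """Group jobs into waves: jobs in one wave can run in parallel,
        and a job only starts once all jobs it 'needs' are done."""
        done: set[str] = set()
        remaining = {job.name: set(job.needs) for job in self.jobs}
        waves: list[list[str]] = []
        while remaining:
            ready = sorted(n for n, deps in remaining.items() if deps <= done)
            if not ready:
                raise ValueError("cyclic or unknown 'needs' dependency")
            waves.append(ready)
            done.update(ready)
            for name in ready:
                del remaining[name]
        return waves


# Hypothetical workflow: build and lint run in parallel, test waits for build.
ci = Workflow(
    path=".github/workflows/ci.yml",
    jobs=[
        Job("build", steps=[Step(uses="actions/checkout@v3"), Step(run="npm ci")]),
        Job("lint", steps=[Step(uses="actions/checkout@v3"), Step(run="npm run lint")]),
        Job("test", needs=["build"], steps=[Step(run="npm test")]),
    ],
)
```

Here `ci.execution_waves()` groups `build` and `lint` in a first wave and `test` in a second one, which reflects the "parallel by default / sequential" behaviour of jobs.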
Running workflows
Reusing Actions from the GitHub Marketplace
On the rise and fall of CI services in GitHub
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Abstract—Continuous integration (CI) services are used in
collaborative open source projects to automate parts of the
development workflow. Such services have been in widespread use
for over a decade, with new CIs being introduced over the years,
sometimes overtaking other CIs in popularity. We conducted a
longitudinal empirical study over a period of nine years, aiming
to better understand this rapidly evolving CI landscape. By
analysing the development history of 91,810 GitHub repositories
of active npm packages having used at least one CI service,
we quantitatively studied the evolution of seven popular CIs,
specifically focusing on their co-usage and migration in the
considered repositories. We provide statistical evidence of the rise
of GitHub Actions, which has become the dominant CI service in
less than 18 months' time. This coincides with the fall of Travis,
which has seen an important decrease in usage, likely due to a
combination of policy changes and migrations to GitHub Actions.
Index Terms—Continuous integration, distributed software
development, software repositories, GitHub
I. INTRODUCTION
Continuous integration (CI), deployment and delivery have
become the cornerstone of collaborative software development
and DevOps practices. CI automates the integration of code
changes from multiple contributors into a central repository
where automated builds, tests and code quality checks run.
Well-known examples of CI services are Jenkins, Travis,
CircleCI and AppVeyor. CI services can also be built into
social coding platforms such as GitHub and GitLab [1]. GitLab
has featured CI capabilities since November 2012. Based
on popular demand, and in response to CI support integrated
in GitLab, GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this article) in October 2018. In August 2019, GitHub officially
began supporting continuous integration through GHA, and
the product was released publicly in November 2019.
GHA [2] allows the automation of a wide range of tasks based
on a variety of triggers such as commits, issues, pull requests,
comments and many more. GHA can be used to facilitate code
reviews, code quality analysis, communication, dependency
and security monitoring and management, testing, etc. GHA
facilitates the integration with external services, and can even
obviate the need for such external services altogether.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 56 million users in
September 2020 [3]. Given its popularity and the ease with
which GHA allows the CI workflow to be automated, we hypothesise
that GHA has had a significant impact on today's CI
landscape. More particularly, we believe that it has increased
the awareness of the need for CI, it has reduced the entry
barrier for projects to start using CI, and it may have led
projects to migrate from other CI services towards GHA.
This article aims to quantitatively and objectively verify
these hypotheses, and discusses their consequences, through a
longitudinal analysis of how different CIs have been used over
a nine-year period in 91,810 GitHub repositories corresponding
to the software development history of reusable Node.js
packages distributed through the npm package registry. This
empirical study focuses on four research questions:
RQ1 How did the CI landscape evolve? We identified 20
different CIs being used in the considered set of repositories,
some of which were considerably more prevalent than others.
Together with Travis, GHA covers more than 80% of all
usages. Moreover, in only 18 months GHA has overtaken all
other CIs in popularity.
RQ2 What are the most frequent combinations of CIs? We
observed that many repositories have used multiple CIs during
their lifetime. AppVeyor is nearly always used in combination
with some other CI. If a repository uses a CI simultaneously
with another one, it is mostly in combination with Travis,
GHA or CircleCI.
RQ3 How frequently are CIs being replaced by an alternative?
We observed a non-negligible amount of CI migrations. GHA
attracted most of these migrations. The majority of migrations
were moving away from Travis and towards GHA.
RQ4 How has the CI landscape changed since GHA was
introduced? Based on a regression discontinuity design, we
found that the usage of Travis, Azure and CircleCI has been
negatively affected by the introduction of GHA.
This article is structured as follows. Section II motivates the
selected dataset and discusses the data extraction and cleaning
steps that were carried out. Sections III to VI provide answers
to each research question. Section VII discusses the ramifications
of these answers. Section VIII presents the threats to
validity of the conducted research. Section IX presents the
related work. Finally, Section X concludes.
II. DATA EXTRACTION
In order to analyse the use of CIs in software development
repositories on GitHub, we need a large dataset containing
On the Use of GitHub Actions in Software
Development Repositories
Alexandre Decan
Software Engineering Lab
University of Mons
Mons, Belgium
alexandre.decan@umons.ac.be
Tom Mens
Software Engineering Lab
University of Mons
Mons, Belgium
tom.mens@umons.ac.be
Pooya Rostami Mazrae
Software Engineering Lab
University of Mons
Mons, Belgium
pooya.rostamimazrae@umons.ac.be
Mehdi Golzadeh
Software Engineering Lab
University of Mons
Mons, Belgium
mehdi.golzadeh@umons.ac.be
Abstract—GitHub Actions was introduced in 2019 and constitutes
an integrated alternative to CI/CD services for GitHub
repositories. The deep integration with GitHub allows repositories
to easily automate software development workflows. This
paper empirically studies the use of GitHub Actions on a dataset
comprising 68K repositories on GitHub, of which 43.9% are using
GitHub Actions workflows. We analyse which workflows are
automated and identify the most frequent automation practices.
We show that reuse of actions is a common practice, even if
this reuse is concentrated in a limited number of actions. We
study which actions are most frequently used and how workflows
refer to them. Furthermore, we discuss the related security
and versioning aspects. As such, we provide an overview of
the use of GitHub Actions, constituting a necessary first step
towards a better understanding of this emerging ecosystem and
its implications on collaborative software development in the
GitHub social coding platform.
Index Terms—GitHub Actions, continuous integration, collaborative
software development, workflow automation
I. INTRODUCTION
Open source software (OSS) development is a continuous,
highly distributed and collaborative endeavour [1]. Development
of OSS projects faces many socio-technical challenges
[2]–[4]. The multitude of tools (e.g., version control systems,
software distribution managers, bug and issue trackers) and
development-related activities makes it very challenging for
contributor communities to keep up with the rapid pace of
producing and maintaining high-quality software releases.
Automated workflows were introduced to automate numerous
repetitive social or technical activities that are inherently
part of the collaborative software development process. Continuous
integration, deployment and delivery (CI/CD) have
become the cornerstone of collaborative software development
and DevOps practices. Well-known examples of CI/CD
services are Travis, Jenkins, CircleCI and TeamCity. They
automate the integration of code changes from multiple contributors
into a central repository where automated builds, tests
and code quality checks run.
GitHub is by far the largest social coding platform, hosting
the development history of millions of collaborative software
repositories, and accommodating over 73 million users in
2021 [5]. GitHub publicly announced the beta version of
GitHub Actions (abbreviated to GHA in the remainder of
this paper) in October 2018 based on popular demand, and in
response to GitLab’s integrated CI/CD support [6]. In August
2019, GitHub officially began supporting CI through GHA,
and the product was released publicly in November 2019.
GHA [7] allows the automation of a wide range of tasks
based on a variety of triggers such as commits, issues, pull
requests, comments, schedules, and many more. Its deep
integration into GitHub implies that GHA can be used not
only for executing test suites or deploying new releases
as in traditional CI/CD services, but also to facilitate code
reviews, communication, dependency and security monitoring
and management, etc. GHA also promotes the use and sharing
of reusable components, called actions, in workflows. These
actions are distributed in public repositories and on the GitHub
Marketplace. They allow workflow developers to easily integrate
specific tasks (e.g., set up a specific programming
language environment, publish a release on a package registry,
run tests and check code quality) without having to write the
corresponding code.
Since its public release in November 2019, GHA has
become the dominant CI/CD service, only 18 months
after its introduction [8]. Its Marketplace of reusable actions
has been growing exponentially ever since, reaching 12K
reusable actions in February 2022. It is therefore fair to
say that GHA has become a software ecosystem of its own,
comparable to ecosystems of reusable software libraries (such
as npm, RubyGems, CRAN, Maven, and PyPI) that have been
empirically studied by many researchers in recent years (e.g.,
[9]–[14]).
The emerging GHA ecosystem is worthy of being empirically
studied in its own right since it is likely to suffer from
the same issues related to dependency management, security
vulnerabilities, outdated or obsolete components, backward
compatibility, and so on. This article therefore quantitatively
studies the use of GHA in 68K repositories on GitHub. We
analyse which workflows are automated and identify the most
frequent automation practices. We show that reuse of actions
is a common practice and identify which actions are reused
and how. As such, we provide an overview of the use of
GHA, a necessary first step towards a better understanding
of the emerging GHA ecosystem and its implications on
software development in GitHub repositories. More concretely,
we answer the following research questions:
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Pooya Rostami Mazrae¹ · Tom Mens¹ · Mehdi Golzadeh¹ · Alexandre Decan¹
Accepted: 28 December 2022
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023
Abstract
Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative
software development process. CI/CD tools automate a wide range of activities in the
development workflow such as testing, linting, updating dependencies, creating and deploying
releases, and so on. Previous quantitative studies have revealed important changes in the
landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many
software projects migrating to other CI/CD tools. In order to understand the reasons behind
these changes in CI/CD usage, this paper presents a qualitative study based on in-depth
interviews with 22 experienced software practitioners reporting on their usage, co-usage and
migration of 31 different CI/CD tools. Following an inductive and deductive coding process,
we analysed the interviews and found a high degree of competition between CI/CD tools. We
observe multiple reasons for co-using different CI/CD tools within the same project, and we
identify the main reasons and detractors for migrating to different alternatives. Among all
reported migrations, we observe a clear trend of migrations away from Travis and migrations
towards GitHub Actions and we identify the main reasons behind them.
Keywords CI/CD · Collaborative software development · Workflow automation ·
Qualitative analysis · Empirical software engineering
Communicated by: Alexander Serebrenik
Alexandre Decan (F.R.S.-FNRS Research Associate)
Corresponding author: Pooya Rostami Mazrae
pooya.rostami.m@gmail.com; pooya.rostamimazrae@umons.ac.be
Tom Mens
tom.mens@umons.ac.be
Mehdi Golzadeh
golzadeh.mehdi@gmail.com
Alexandre Decan
alexandre.decan@umons.ac.be
¹ Software Engineering Lab, Université de Mons, Mons, Belgium
https://doi.org/10.1109/ICSME55016.2022.00029
https://doi.org/10.1109/SANER53432.2022.00084
On the rise and fall of CI services in GitHub
https://doi.org/10.1109/SANER53432.2022.00084
Dataset
Selection funnel (May 2021):
• 1.6M+ (scoped packages)
• 803K packages on GitHub
• Cloned 676K
• Excluded 11,557 forks
• Excluded inactive repositories
• 201,403 repositories
• Presence of CI configuration files
• 119,033 CI usages in 91,810 repositories
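Detecting CI usage through the presence of CI configuration files can be sketched as follows. This is a minimal sketch under stated assumptions: the mapping below lists commonly used default configuration locations and is illustrative, not the study's actual list, and `detect_cis` is a hypothetical helper name.

```python
# Commonly used default configuration locations for some CI services.
# Assumption: illustrative mapping, not the list used in the study.
CI_CONFIG_PATTERNS = {
    "GitHub Actions": (".github/workflows/",),
    "Travis": (".travis.yml",),
    "CircleCI": (".circleci/config.yml",),
    "AppVeyor": ("appveyor.yml", ".appveyor.yml"),
    "Azure": ("azure-pipelines.yml",),
    "GitLab CI": (".gitlab-ci.yml",),
    "Jenkins": ("Jenkinsfile",),
}


def detect_cis(paths):
    """Return the set of CI services whose configuration files appear
    among the given repository-relative file paths."""
    found = set()
    for path in paths:
        for ci, patterns in CI_CONFIG_PATTERNS.items():
            for pattern in patterns:
                if pattern.endswith("/"):
                    # Directory pattern: any YAML file inside it counts.
                    if path.startswith(pattern) and path.endswith((".yml", ".yaml")):
                        found.add(ci)
                elif path == pattern:
                    found.add(ci)
    return found


# Hypothetical repository snapshot that co-uses two CI services.
snapshot = [".travis.yml", ".github/workflows/ci.yml", "src/index.js"]
cis = detect_cis(snapshot)
```

A repository like this one would count as one repository with two CI usages, which is why the number of CI usages in the funnel exceeds the number of repositories.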
How prevalent is CI usage
in GitHub repositories?
CI services are used in
more than half of all
considered repositories.
Evolution of GitHub CI/CD landscape
Since 2021, GitHub Actions has become
the dominant CI/CD tool in GitHub
Most frequent co-usage of CIs
Analysing CI churn in the last 3 years
Migrations between CIs
Migrations toward GitHub Actions
Migrations away from Travis
What happened to Travis?
Travis changed its free plan
GHA was introduced
Empirical Software Engineering (2023) 28:52
https://doi.org/10.1007/s10664-022-10285-5
On the usage, co-usage and migration of CI/CD tools:
A qualitative analysis
Methodology
Interview questionnaire
• Around 30 questions related to CI usage, co-usage and migration
Selection of respondents
• Selected candidates through Twitter, LinkedIn, email, direct messages
• Colleagues' referrals (snowballing)
Geographic diversity
• Using an online video conferencing tool
Inclusion criteria
• Having actively contributed to, or been responsible for, a software project relying on CI
• Sufficient knowledge about which CI tool is used in that software project and how
• Having been involved in setting up or maintaining the CI process of the project
Demographics of respondents
• 22 respondents
• 16 from 7 European countries
• 4 from North America
• 2 from Asia
• Software development experience: average of 12 years and 4 months
• Good mix of industrial and open source contributors
CI/CD tools being used
• 14 additional tools reported only once
• 3 custom-built in-house CI/CD solutions
The good ...
The bad ...
The ugly
CI/CD migrations
Reasons for CI migration
Why is GitHub Actions so popular?
• deep integration with GitHub
• ease of setup and use
• trendy
• speed
• reliability
• free tier for open source projects
• large marketplace of reusable Actions
• support for major operating systems
• company support (Microsoft)
• automation beyond CI/CD
Difficulties in CI migration
• Learning curve
• Fundamental differences between the source and target of the migration
• Trial-and-error nature of configuring a new CI tool
• Lack of familiarity with the new CI tool
• Important missing features
On the Use of GitHub Actions in Software
Development Repositories
https://doi.org/10.1109/ICSME55016.2022.00029
Research Questions
What are the characteristics of repositories using workflows?
Which kinds of workflows are automated?
What are the most frequent jobs in workflows?
What are the automation practices?
Which types of Actions are reused?
Dataset
• 67,870 repositories
• 4 out of 10 repositories use GitHub Actions workflows
• 70,278 workflow files
• 108,500 jobs
• 567,352 steps
Quantification of jobs and workflows
Workflows in repositories
single workflow (49.3%)
more than one workflow (50.7%)
Jobs in workflows
single job (77.8%)
more than one job (22.2%)
Characteristics of GitHub repositories using GitHub Actions
Characteristic   Median (with workflows)   Median (without workflows)   Effect size
Pull Requests    124                       41                           medium
Contributors     20                        11                           small
Commits          598                       344                          small
Issues           105                       59                           small
Repos with GHA workflows tend to have more
contributors, pull requests, commits, and issues
Most frequent event types triggering workflows (bar chart, values in %)
push: 63.4
PR: 56.3
schedule: 16.1
workflow_dispatch: 15.4
release: 6.2
others: 8.6
Different ways of executing code
Step type   Action target             % of steps   % of repositories
run:        --                        49.9%        93.5%
uses:       Local path                0.8%         2.0%
uses:       Docker image              0.1%         1.8%
uses:       Same repository           0.2%         0.4%
uses:       Same owner                0.7%         4.3%
uses:       Other public repository   48.3%        99.3%
Reusing Actions in steps is a common practice
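The step categories on this slide can be sketched as a small classifier. This is a sketch only, assuming steps are already parsed into dictionaries and relying on the reference forms GitHub Actions accepts for `uses:` (`./path`, `docker://image`, and `owner/repo[/path]@ref`). The function name and the `octo-org` repositories are hypothetical, and the study's own classification rules may differ.

```python
def classify_step(step, repository):
    """Return the category of a workflow step.

    step: dict holding either a "run" or a "uses" key.
    repository: "owner/name" of the repository containing the workflow.
    """
    if "run" in step:
        return "run"
    target = step["uses"]
    if target.startswith("./"):
        return "local path"
    if target.startswith("docker://"):
        return "docker image"
    # Remaining form: owner/repo[/subpath]@ref
    action = target.split("@", 1)[0]
    owner, name = (part.lower() for part in action.split("/")[:2])
    repo_owner, repo_name = (part.lower() for part in repository.split("/"))
    if (owner, name) == (repo_owner, repo_name):
        return "same repository"
    if owner == repo_owner:
        return "same owner"
    return "other public repository"


# Hypothetical steps from a workflow in the repository "octo-org/app".
steps = [
    {"uses": "actions/checkout@v3"},
    {"run": "npm test"},
    {"uses": "./.github/actions/setup"},
    {"uses": "docker://alpine:3.18"},
    {"uses": "octo-org/other-action@v1"},
    {"uses": "octo-org/app/tools/lint@main"},
]
categories = [classify_step(s, "octo-org/app") for s in steps]
```

Counting these categories over all steps, and over repositories containing at least one such step, would yield the two percentage columns of the table.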
Which Actions are reused?
Top 5 most frequent Actions in steps and repositories
Action                    % of steps   % of repositories
actions/checkout          35.5%        98%
actions/cache             7.2%         22%
actions/setup-node        6.6%         26%
actions/upload-artifact   5.9%         19%
actions/setup-python      5.8%         21%
• A few Actions concentrate most of the reuse
• Most of them are developed by GitHub
On the Outdatedness of Workflows
in the GitHub Actions Ecosystem
Alexandre Decan¹, Hassan Onsori Delicheh, Tom Mens
Software Engineering Lab, University of Mons, Mons, Belgium
Abstract
GitHub Actions was introduced as a way to automate CI/CD workflows in
GitHub, the largest social coding platform. Thanks to its deep integration into
GitHub, GitHub Actions can be used to automate a wide range of social and
technical activities. Among its main features, it allows automation workflows
to rely on reusable components – the so-called Actions – to enable developers to
focus on the tasks that should be automated rather than on how to automate
them. Like any other kind of reusable software component, Actions are continuously
updated, causing many automation workflows to use outdated versions
of these Actions. Based on a dataset of nearly one million workflows obtained
from 22K+ repositories between November 2019 and September 2022, we provide
quantitative empirical evidence that reusing Actions in GitHub workflows
is common practice, even if this reuse tends to concentrate on a limited number
of Actions. We show that Actions are frequently updated, and we quantify to
what extent automation workflows are outdated with respect to these Actions.
Using two complementary metrics, technical lag and opportunity lag, we found
that most of the workflows are using an outdated Action release, are lagging
behind the latest available release for at least 7 months, and had the opportunity
to be updated during at least 9 months. This calls for a more rigorous
management of Action outdatedness in automation workflows, as well as for
better policies and tooling to keep workflows up-to-date.
Keywords: software ecosystem, dependency management, continuous
integration, collaborative software development, workflow automation,
technical lag
Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan),
hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be
(Tom Mens)
1F.R.S.-FNRS Research Associate
Preprint submitted to Journal of Systems & Software March 21, 2023
Outdatedness in the GitHub Actions ecosystem
• Four out of five workflows and nearly two thirds of the steps are using an outdated release of an Action.
• Steps using Actions provided by GitHub are responsible for most of the outdatedness.
• More than one third of the other steps and nearly half of the other workflows are using an outdated release of an Action.
[Chart annotations: release of actions/checkout@v2, actions/checkout@v3, actions/setup-*@v2, actions/setup-*@v3]
[Diagram: GitHub workflow and Action lifeline with releases v1 to v4; labels: selected, latest, technical lag, observation date]
Outdatedness in the GitHub Actions ecosystem
Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action.
• Technical lag of outdated steps tends to increase over time.
• Half of the outdated steps using other Actions are using a version that lags behind the latest one by at least 7.3 months.
• Main cause of technical lag = Actions provided by GitHub
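Technical lag can be illustrated with a small computation. This sketch follows one reading of the definition: the lag is measured between the release date of the version a step uses and the release date of the latest version available at the observation date. The release dates and the function name `technical_lag_days` are hypothetical and do not come from the study.

```python
from datetime import date

# Hypothetical release dates for one Action (not real data).
releases = {
    "v1": date(2020, 1, 10),
    "v2": date(2021, 3, 1),
    "v3": date(2022, 6, 15),
}


def technical_lag_days(releases, used, observed_on):
    """Days between the used release and the latest release available at observed_on."""
    available = [d for d in releases.values() if d <= observed_on]
    latest = max(available)
    return (latest - releases[used]).days


# A step pinned to v2, observed on 1 September 2022, lags behind v3.
lag = technical_lag_days(releases, "v2", date(2022, 9, 1))
print(lag)
```

A step that already uses the latest available release has a lag of zero, so only outdated steps contribute to the figures reported above.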
Outdatedness in the GitHub Actions ecosystem
Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t.
[Diagram: GitHub workflow and Action lifeline with releases v1 to v4; labels: selected, first update opportunity, opportunity lag, observation time]
• The opportunity lag of outdated steps tends to increase over time.
• On average, maintainers of outdated steps have had the opportunity to update them for 9 months, but have not done so.
• Main cause of opportunity lag = Actions provided by GitHub
[Chart annotation: new releases for docker/*]
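Opportunity lag can be sketched in the same spirit. The simplification made here is an assumption: the step is taken to have been using its release before the next one appeared, so the first update opportunity is the date of the first newer release. The release dates and the function name `opportunity_lag_days` are hypothetical.

```python
from datetime import date

# Hypothetical release dates for one Action (not real data).
releases = {
    "v1": date(2020, 1, 10),
    "v2": date(2021, 3, 1),
    "v3": date(2022, 6, 15),
}


def opportunity_lag_days(releases, used, observed_on):
    """Days since the first release newer than `used` became available, up to observed_on."""
    newer = [d for d in releases.values() if releases[used] < d <= observed_on]
    if not newer:
        return 0  # no update opportunity yet
    first_opportunity = min(newer)
    return (observed_on - first_opportunity).days


# A step still on v1 on 1 September 2022 could have moved to v2 since 1 March 2021.
lag = opportunity_lag_days(releases, "v1", date(2022, 9, 1))
print(lag)
```

Unlike technical lag, this metric keeps growing with the observation date even when no further release is published, which matches the observation that it tends to increase over time.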
Thank you for your attention.
Any questions?

The (r)evolution of CI/CD on GitHub

  • 1. The (r)evolution of CI/CD on GitHub: Promises and Perils of the GitHub Actions ecosystem. Tom Mens, Software Engineering Lab, March 2023. SECO-ASSIST, secoassist.github.io
  • 4. Collaborative software development: Commits, Issues, Pull Requests, Comments, Code Reviews, Discussions, Project Management, ... Continuous Integration: Quality analysis, Build, Test, Deploy. GitHub Actions.
  • 6. Specifying GitHub Actions workflows: a repository stores its workflows under .github/workflows/. The workflows of a repository run in parallel; the jobs of a workflow run in parallel by default or sequentially (strategy); the steps of a job run sequentially. Each step either reuses an Action (uses:) or runs a shell command (run:).
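The hierarchy on this slide (workflows, jobs, steps) can be pictured as nested data. The sketch below is only an illustration: the workflow content, the job names and the helper `count_step_kinds` are hypothetical and not taken from the talk. It models one workflow file as a Python dictionary and counts how many steps reuse an Action (`uses`) versus run a shell command (`run`).

```python
# Hypothetical in-memory model of a single workflow file that would live
# under .github/workflows/ (the content is invented for illustration).
workflow = {
    "name": "ci",
    "on": ["push", "pull_request"],
    "jobs": {
        # Jobs run in parallel by default.
        "build": {
            "runs-on": "ubuntu-latest",
            # Steps of a job run sequentially.
            "steps": [
                {"uses": "actions/checkout@v3"},
                {"uses": "actions/setup-node@v3"},
                {"run": "npm test"},
            ],
        },
        "lint": {
            "runs-on": "ubuntu-latest",
            "steps": [
                {"uses": "actions/checkout@v3"},
                {"run": "npm run lint"},
            ],
        },
    },
}


def count_step_kinds(wf):
    """Count steps that reuse an Action ('uses') and steps that run a command ('run')."""
    counts = {"uses": 0, "run": 0}
    for job in wf["jobs"].values():
        for step in job["steps"]:
            if "uses" in step:
                counts["uses"] += 1
            elif "run" in step:
                counts["run"] += 1
    return counts


print(count_step_kinds(workflow))
```

A real analysis would first parse the YAML workflow files; the dictionary above stands in for the parsed result.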
  • 9. On the rise and fall of CI services in GitHub. Mehdi Golzadeh (mehdi.golzadeh@umons.ac.be), Alexandre Decan (alexandre.decan@umons.ac.be), Tom Mens (tom.mens@umons.ac.be), Software Engineering Lab, University of Mons, Mons, Belgium. https://doi.org/10.1109/SANER53432.2022.00084

    Abstract: Continuous integration (CI) services are used in collaborative open source projects to automate parts of the development workflow. Such services have been in widespread use for over a decade, with new CIs being introduced over the years, sometimes overtaking other CIs in popularity. We conducted a longitudinal empirical study over a period of nine years, aiming to better understand this rapidly evolving CI landscape. By analysing the development history of 91,810 GitHub repositories of active npm packages having used at least one CI service, we quantitatively studied the evolution of seven popular CIs, specifically focusing on their co-usage and migration in the considered repositories. We provide statistical evidence of the rise of GitHub Actions, which has become the dominant CI service in less than 18 months. This coincides with the fall of Travis, which has seen an important decrease in usage, likely due to a combination of policy changes and migrations to GitHub Actions.

    Index Terms: Continuous integration, distributed software development, software repositories, GitHub

    I. INTRODUCTION. Continuous integration (CI), deployment and delivery have become the cornerstone of collaborative software development and DevOps practices. CI automates the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run. Well-known examples of CI services are Jenkins, Travis, CircleCI and AppVeyor. CI services can also be built into social coding platforms such as GitHub and GitLab [1]. GitLab has featured CI capabilities since November 2012.

    Based on popular demand, and in response to CI support integrated in GitLab, GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this article) in October 2018. In August 2019, they officially began supporting Continuous Integration through GHA, and the product was released publicly in November 2019. GHA [2] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments and many more. GHA can be used to facilitate code reviews, code quality analysis, communication, dependency and security monitoring and management, testing, etc. GHA facilitates the integration with external services, and can even obviate the need for such external services altogether.

    GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 56 million users in September 2020 [3]. Given its popularity and the ease with which GHA allows the CI workflow to be automated, we hypothesise that GHA has had a significant impact on today's CI landscape. More particularly, we believe that it has increased the awareness of the need for CI, it has reduced the entry barrier for projects to start using CI, and it may have led projects to migrate from other CI services towards GHA.

    This article aims to quantitatively and objectively verify these hypotheses, and discusses their consequences, through a longitudinal analysis of how different CIs have been used over a nine-year period in 91,810 GitHub repositories corresponding to the software development history of reusable Node.JS packages distributed through the npm package registry. This empirical study focuses on four research questions:

    RQ1 How did the CI landscape evolve? We identified 20 different CIs being used in the considered set of repositories, some of which were considerably more prevalent than others. Together with Travis, GHA covers more than 80% of all usages. Moreover, in only 18 months GHA has overtaken all other CIs in popularity.

    RQ2 What are the most frequent combinations of CIs? We observed that many repositories have used multiple CIs during their lifetime. AppVeyor is nearly always used in combination with some other CI. If a repository uses a CI simultaneously with another one, it is mostly in combination with Travis, GHA or CircleCI.

    RQ3 How frequently are CIs being replaced by an alternative? We observed a non-negligible number of CI migrations. GHA attracted most of these migrations. The majority of migrations were moving away from Travis and towards GHA.

    RQ4 How has the CI landscape changed since GHA was introduced? Based on a regression discontinuity design, we found that the usage of Travis, Azure and CircleCI has been negatively affected by the introduction of GHA.

    This article is structured as follows. Section II motivates the selected dataset and discusses the data extraction and cleaning steps that were carried out. Sections III to VI provide answers to each research question. Section VII discusses the ramifications of these answers. Section VIII presents the threats to validity of the conducted research. Section IX presents the related work. Finally, Section X concludes.

    II. DATA EXTRACTION. In order to analyse the use of CIs in software development repositories on GitHub, we need a large dataset containing

    On the Use of GitHub Actions in Software Development Repositories. Alexandre Decan (alexandre.decan@umons.ac.be), Tom Mens (tom.mens@umons.ac.be), Pooya Rostami Mazrae (pooya.rostamimazrae@umons.ac.be), Mehdi Golzadeh (mehdi.golzadeh@umons.ac.be), Software Engineering Lab, University of Mons, Mons, Belgium. https://doi.org/10.1109/ICSME55016.2022.00029

    Abstract: GitHub Actions was introduced in 2019 and constitutes an integrated alternative to CI/CD services for GitHub repositories. The deep integration with GitHub allows repositories to easily automate software development workflows. This paper empirically studies the use of GitHub Actions on a dataset comprising 68K repositories on GitHub, of which 43.9% are using GitHub Actions workflows. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice, even if this reuse is concentrated in a limited number of actions. We study which actions are most frequently used and how workflows refer to them. Furthermore, we discuss the related security and versioning aspects. As such, we provide an overview of the use of GitHub Actions, constituting a necessary first step towards a better understanding of this emerging ecosystem and its implications on collaborative software development in the GitHub social coding platform.

    Index Terms: GitHub Actions, continuous integration, collaborative software development, workflow automation

    I. INTRODUCTION. Open source software (OSS) development is a continuous, highly distributed and collaborative endeavour [1]. Development of OSS projects faces many socio-technical challenges [2]–[4]. The multitude of tools (e.g., version control systems, software distribution managers, bug and issue trackers) and development-related activities makes it very challenging for contributor communities to keep up with the rapid pace of producing and maintaining high-quality software releases.

    Automated workflows were introduced to automate numerous repetitive social or technical activities that are inherently part of the collaborative software development process. Continuous integration, deployment and delivery (CI/CD) have become the cornerstone of collaborative software development and DevOps practices. Well-known examples of CI/CD services are Travis, Jenkins, CircleCI and TeamCity. They automate the integration of code changes from multiple contributors into a central repository where automated builds, tests and code quality checks run.

    GitHub is by far the largest social coding platform, hosting the development history of millions of collaborative software repositories, and accommodating over 73 million users in 2021 [5]. GitHub publicly announced the beta version of GitHub Actions (abbreviated to GHA in the remainder of this paper) in October 2018 based on popular demand, and in response to GitLab's integrated CI/CD support [6]. In August 2019, GitHub officially began supporting CI through GHA, and the product was released publicly in November 2019.

    GHA [7] allows the automation of a wide range of tasks based on a variety of triggers such as commits, issues, pull requests, comments, schedules, and many more. Its deep integration into GitHub implies that GHA can be used not only for executing test suites or deploying new releases as in traditional CI/CD services, but also to facilitate code reviews, communication, dependency and security monitoring and management, etc. GHA also promotes the use and sharing of reusable components, called actions, in workflows. These actions are distributed in public repositories and on the GitHub Marketplace. They allow workflow developers to easily integrate specific tasks (e.g., set up a specific programming language environment, publish a release on a package registry, run tests and check code quality) without having to write the corresponding code.

    Since its public release in November 2019, GHA has become the dominant CI/CD service, only 18 months after its introduction [8]. Its Marketplace of reusable actions has been growing exponentially ever since, reaching 12K reusable actions in February 2022. It is therefore fair to say that GHA has become a software ecosystem of its own, comparable to ecosystems of reusable software libraries (such as npm, RubyGems, CRAN, Maven, and PyPI) that have been empirically studied by many researchers in recent years (e.g., [9]–[14]).

    The emerging GHA ecosystem is worthy of being empirically studied in its own right since it is likely to suffer from the same issues related to dependency management, security vulnerabilities, outdated or obsolete components, backward compatibility, and so on. This article therefore quantitatively studies the use of GHA in 68K repositories on GitHub. We analyse which workflows are automated and identify the most frequent automation practices. We show that reuse of actions is a common practice and identify which actions are reused and how. As such, we provide an overview of the use of GHA, a necessary first step towards a better understanding of the emerging GHA ecosystem and its implications on software development in GitHub repositories. More concretely, we answer the following research questions:

    On the usage, co-usage and migration of CI/CD tools: A qualitative analysis. Pooya Rostami Mazrae, Tom Mens, Mehdi Golzadeh, Alexandre Decan. Empirical Software Engineering (2023) 28:52, https://doi.org/10.1007/s10664-022-10285-5. Accepted: 28 December 2022. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.

    Abstract: Continuous integration, delivery and deployment (CI/CD) is used to support the collaborative software development process. CI/CD tools automate a wide range of activities in the development workflow such as testing, linting, updating dependencies, creating and deploying releases, and so on. Previous quantitative studies have revealed important changes in the landscape of CI/CD usage, with the increasing popularity of cloud-based services, and many software projects migrating to other CI/CD tools. In order to understand the reasons behind these changes in CI/CD usage, this paper presents a qualitative study based on in-depth interviews with 22 experienced software practitioners reporting on their usage, co-usage and migration of 31 different CI/CD tools. Following an inductive and deductive coding process, we analyse the interviews and find a high amount of competition between CI/CD tools. We observe multiple reasons for co-using different CI/CD tools within the same project, and we identify the main reasons and detractors for migrating to different alternatives. Among all reported migrations, we observe a clear trend of migrations away from Travis and migrations towards GitHub Actions and we identify the main reasons behind them.

    Keywords: CI/CD · Collaborative software development · Workflow automation · Qualitative analysis · Empirical software engineering

    Communicated by: Alexander Serebrenik. Alexandre Decan (F.R.S.-FNRS Research Associate). Contact: Pooya Rostami Mazrae (pooya.rostami.m@gmail.com; pooya.rostamimazrae@umons.ac.be), Tom Mens (tom.mens@umons.ac.be), Mehdi Golzadeh (golzadeh.mehdi@gmail.com), Alexandre Decan (alexandre.decan@umons.ac.be), Software Engineering Lab, Université de Mons, Mons, Belgium.
  • 10. On the rise and fall of CI services in GitHub. Mehdi Golzadeh, Alexandre Decan, Tom Mens (Software Engineering Lab, University of Mons, Mons, Belgium). https://doi.org/10.1109/SANER53432.2022.00084
  • 11. Dataset (May 2021): 1.6M+, Scoped packages → 803K packages on GitHub → cloned 676K → excluded 11,557 forks → excluded inactive repositories → 201,403 repositories → presence of CI configuration files → 119,033 CI usages in 91,810 repositories.
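The last filtering step relies on the presence of CI configuration files. A minimal sketch of such a check is given below. The file patterns are the conventional configuration locations of a few CI services and are an assumption here; the slide does not list the exact rules used in the study, and `CI_CONFIG_PATTERNS` and `detect_ci_services` are hypothetical names.

```python
from fnmatch import fnmatchcase

# Assumed mapping from CI service to its conventional configuration files.
# The detection rules actually used in the study are not given on the slide.
CI_CONFIG_PATTERNS = {
    "GitHub Actions": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
    "Travis": [".travis.yml"],
    "CircleCI": [".circleci/config.yml"],
    "AppVeyor": ["appveyor.yml", ".appveyor.yml"],
    "Azure": ["azure-pipelines.yml"],
}


def detect_ci_services(paths):
    """Return the sorted names of CI services whose config files appear in `paths`."""
    found = set()
    for path in paths:
        for service, patterns in CI_CONFIG_PATTERNS.items():
            if any(fnmatchcase(path, pattern) for pattern in patterns):
                found.add(service)
    return sorted(found)


# Hypothetical file listing of one repository snapshot.
files = [".travis.yml", ".github/workflows/ci.yml", "README.md", "package.json"]
print(detect_ci_services(files))
```

Applying such a check to each commit of a repository, rather than to a single snapshot, would give the history of CI usage, co-usage and migration that the study analyses.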
  • 12. How prevalent is CI usage in GitHub repositories? CI services are used in more than half of all considered repositories.
  • 13. Evolution of GitHub CI/CD landscape: since 2021, GitHub Actions has become the dominant CI/CD tool in GitHub.
  • 19. What happened to Travis? Travis changed its free plan, and GHA was introduced.
  • 20. On the usage, co-usage and migration of CI/CD tools: A qualitative analysis. Pooya Rostami Mazrae, Tom Mens, Mehdi Golzadeh, Alexandre Decan (Software Engineering Lab, Université de Mons, Mons, Belgium). Empirical Software Engineering (2023) 28:52, accepted 28 December 2022. https://doi.org/10.1007/s10664-022-10285-5
  • 21. Methodology. Interview questionnaire: • Around 30 questions related to CI usage, co-usage and migration. Selection of respondents: • Selected candidates through Twitter, LinkedIn, email, direct messages • Colleagues' referrals (snowballing). Geographic diversity: • Using online video conferencing tool. Inclusion criteria: • Actively contributed to, or having been responsible for, a software project relying on CI • Sufficient knowledge about which CI tool is used in that software project and how • Having been involved in setting up or maintaining the CI process of the project.
  • 22. Demographics of respondents: • 22 respondents (16 from 7 European countries, 4 from North America, 2 from Asia) • Software development experience: average of 12 years and 4 months • Good mix of industrial and open source contributors
  • 23. CI/CD tools being used: • 14 additional tools reported only once • 3 custom-built in-house CI/CD solutions
  • 29. Why is GitHub Actions so popular? • deep integration with GitHub • ease of setup and use • trendy • speed • reliability • free tier for open source projects • large marketplace of reusable Actions • support for major operating systems • company support (Microsoft) • automation beyond CI/CD
  • 30. Difficulties in CI migration: • Learning curve • Fundamental differences between the source and target of the migration • Trial-and-error nature of configuring a new CI tool • Lack of familiarity with the new CI tool • Important missing features
  • 31. On the Use of GitHub Actions in Software Development Repositories. Alexandre Decan, Tom Mens, Pooya Rostami Mazrae, Mehdi Golzadeh (Software Engineering Lab, University of Mons, Mons, Belgium). https://doi.org/10.1109/ICSME55016.2022.00029
  • 32. Research Questions: What are the characteristics of repositories using workflows? Which kinds of workflows are automated? What are the most frequent jobs in workflows? What are the automation practices? Which types of Actions are reused?
  • 33. Dataset: • 67,870 repositories • 4 out of 10 repositories use GitHub Actions workflows • 70,278 workflow files • 108,500 jobs • 567,352 steps
  • 34. Quantification of jobs and workflows. Workflows in repositories: single workflow (49.3%), more than one workflow (50.7%). Jobs in workflows: single job (77.8%), more than one job (22.2%).
  • 35. Characteristics of GitHub repositories using GitHub Actions (median with workflows vs. median without workflows; effect size interpretation): Pull Requests 124 vs. 41 (medium); Contributors 20 vs. 11 (small); Commits 598 vs. 344 (small); Issues 105 vs. 59 (small). Repos with GHA workflows tend to have more contributors, pull requests, commits, and issues.
  • 36. Most frequent event types triggering workflows (%): push 63.4; PR 56.3; schedule 16.1; workflow_dispatch 15.4; release 6.2; others 8.6.
  • 37. Different ways of executing code (step type, Action target: % of steps, % of repositories). run: 49.9% of steps, 93.5% of repositories. uses: Local path 0.8%, 2.0%; Docker image 0.1%, 1.8%; Same repository 0.2%, 0.4%; Same owner 0.7%, 4.3%; Other public repository 48.3%, 99.3%. Reusing Actions in steps is a common practice.
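The categories in this table can be derived from the text of a step alone. The following sketch shows one way to do so, assuming the usual reference forms for `uses:` (a path starting with `./`, a `docker://` image, or `owner/repo[/path]@ref`). The function `classify_step` and the repository name `acme/app` are hypothetical, and the sketch cannot check whether a referenced repository is actually public.

```python
def classify_step(step, repo_full_name):
    """Classify a workflow step using the categories of the table above.

    `step` is a dict with either a 'run' or a 'uses' key;
    `repo_full_name` is the 'owner/repo' of the repository holding the workflow.
    """
    if "run" in step:
        return "run"
    target = step["uses"]
    if target.startswith("./"):
        return "local path"
    if target.startswith("docker://"):
        return "docker image"
    # Remaining form: owner/repo[/path]@ref
    parts = target.split("@", 1)[0].split("/")
    if len(parts) < 2:
        return "unknown"
    owner, repo = parts[0].lower(), parts[1].lower()
    this_owner, this_repo = repo_full_name.lower().split("/", 1)
    if owner == this_owner:
        return "same repository" if repo == this_repo else "same owner"
    # Assumption: any other reference points to a public repository.
    return "other public repository"


steps = [
    {"run": "npm test"},
    {"uses": "./.github/actions/build"},
    {"uses": "docker://alpine:3.8"},
    {"uses": "acme/app/.github/actions/lint@main"},
    {"uses": "acme/tools@v1"},
    {"uses": "actions/checkout@v3"},
]
for step in steps:
    print(classify_step(step, "acme/app"))
```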
  • 38. Which Actions are reused? Top 5 most frequent Actions (% of steps, % of repositories): actions/checkout 35.5%, 98%; actions/cache 7.2%, 22%; actions/setup-node 6.6%, 26%; actions/upload-artifact 5.9%, 19%; actions/setup-python 5.8%, 21%. • A few Actions concentrate most of the reuse • Most of them are developed by GitHub
  • 39. On the Outdatedness of Workflows in the GitHub Actions Ecosystem. Alexandre Decan (F.R.S.-FNRS Research Associate), Hassan Onsori Delicheh, Tom Mens. Software Engineering Lab, University of Mons, Mons, Belgium. Preprint submitted to Journal of Systems & Software, March 21, 2023.

    Abstract: GitHub Actions was introduced as a way to automate CI/CD workflows in GitHub, the largest social coding platform. Thanks to its deep integration into GitHub, GitHub Actions can be used to automate a wide range of social and technical activities. Among its main features, it allows automation workflows to rely on reusable components (the so-called Actions) to enable developers to focus on the tasks that should be automated rather than on how to automate them. As any other kind of reusable software components, Actions are continuously updated, causing many automation workflows to use outdated versions of these Actions. Based on a dataset of nearly one million workflows obtained from 22K+ repositories between November 2019 and September 2022, we provide quantitative empirical evidence that reusing Actions in GitHub workflows is common practice, even if this reuse tends to concentrate on a limited number of Actions. We show that Actions are frequently updated, and we quantify to which extent automation workflows are outdated with respect to these Actions. Using two complementary metrics, technical lag and opportunity lag, we found that most of the workflows are using an outdated Action release, are lagging behind the latest available release for at least 7 months, and had the opportunity to be updated during at least 9 months. This calls for a more rigorous management of Action outdatedness in automation workflows, as well as for better policies and tooling to keep workflows up-to-date.

    Keywords: software ecosystem, dependency management, continuous integration, collaborative software development, workflow automation, technical lag

    Email addresses: alexandre.decan@umons.ac.be (Alexandre Decan), hassan.onsoridelicheh@umons.ac.be (Hassan Onsori Delicheh), tom.mens@umons.ac.be (Tom Mens)
  • 40. Outdatedness in the GitHub Actions ecosystem • Four out of five workflows and nearly two thirds of the steps are using an outdated release of an Action. • Steps using Actions provided by GitHub are responsible for most of the outdatedness. • More than one third of the other steps and nearly half of the other workflows are using an outdated release of an Action. (Chart annotations mark the releases of actions/checkout@v2, actions/checkout@v3, actions/setup-*@v2 and actions/setup-*@v3.)
  • 41. Outdatedness in the GitHub Actions ecosystem. Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action. (Figure: Action lifeline with releases v1 to v4, showing the release selected by a GitHub workflow, the latest release at the observation date, and the technical lag between them.)
  • 42. Outdatedness in the GitHub Actions ecosystem Technical lag of workflows / steps: the time period between the start of reusing a selected Action and the latest release of that Action. • Technical lag of outdated steps tends to increase over time. • Half of the outdated steps using other Actions are using a version that is lagging behind the latest one for at least 7.3 months. • Main cause of technical lag = Actions provided by GitHub
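A minimal sketch of how technical lag might be computed for one step. It assumes one reading of the definition above: the lag is the time between the release date of the version a step uses and the release date of the latest version available at the observation date. The release dates, the `RELEASES` table, and the `technical_lag_days` helper are all hypothetical, not taken from the study.

```python
from datetime import date

# Hypothetical release history of one Action (illustrative dates, not real data).
RELEASES = {
    "v1": date(2020, 1, 15),
    "v2": date(2020, 9, 1),
    "v3": date(2021, 6, 10),
    "v4": date(2022, 3, 5),
}


def technical_lag_days(releases, used_version, observation_date):
    """Days between the used release and the latest release available at observation_date."""
    # Only releases published on or before the observation date count.
    available = {v: d for v, d in releases.items() if d <= observation_date}
    if used_version not in available:
        raise ValueError(f"{used_version} was not released by {observation_date}")
    latest_date = max(available.values())
    return (latest_date - available[used_version]).days


# A step pinned to v2, observed on 1 September 2022, lags behind v4.
lag = technical_lag_days(RELEASES, "v2", date(2022, 9, 1))
print(f"technical lag: {lag} days")
```

A step already on the latest available release has a lag of zero under this reading, so only outdated steps contribute to the figures on the slide.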
  • 43. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t. (Figure: Action lifeline with releases v1 to v4, showing the release selected by a GitHub workflow, the first update opportunity, and the opportunity lag running from that point to the observation time.)
  • 44. Outdatedness in the GitHub Actions ecosystem. Opportunity lag of workflows / steps: the time period during which a workflow could have updated an outdated step to a more recent version of an Action, but didn’t. • The opportunity lag of outdated steps tends to increase over time. • On average, maintainers of outdated steps have had the opportunity to update them for 9 months, but have not done so. • Main cause of opportunity lag = Actions provided by GitHub (Chart annotation: new releases for docker/*.)
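A minimal sketch of opportunity lag for one step. It assumes the lag runs from the first release newer than the version in use (the "first update opportunity" in the figure) to the observation time; the study may additionally account for when the workflow started using the Action. The release dates, the `RELEASES` table, and the `opportunity_lag_days` helper are hypothetical.

```python
from datetime import date

# Hypothetical release history of one Action (illustrative dates, not real data).
RELEASES = {
    "v1": date(2020, 1, 15),
    "v2": date(2020, 9, 1),
    "v3": date(2021, 6, 10),
    "v4": date(2022, 3, 5),
}


def opportunity_lag_days(releases, used_version, observation_date):
    """Days during which a newer release was available but the step was not updated."""
    used_date = releases[used_version]
    # Releases published after the used one, up to the observation date.
    newer = [d for d in releases.values() if used_date < d <= observation_date]
    if not newer:
        return 0  # The step is up to date: there was never an opportunity to update.
    first_opportunity = min(newer)
    return (observation_date - first_opportunity).days


# A step pinned to v2 could have moved to v3 from 10 June 2021 onwards.
lag = opportunity_lag_days(RELEASES, "v2", date(2022, 9, 1))
print(f"opportunity lag: {lag} days")
```

Unlike technical lag, this measure keeps growing with the observation time even when no further release appears, since it counts how long the maintainers have left an available update unapplied.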
  • 45. Thank you for your attention. Any questions?