The adoption of FOSS workflows in
commercial software development: the
case of git and github
Daniel M German
University of Victoria
Canada
Open Source is everywhere
On SSL and Heartbleed
“[Heartbleed] is a software flaw that
has left up to two-thirds of the
world’s websites vulnerable to
attack by hackers.”
– The Economist
“There is no such thing as bad publicity except
your own obituary.”
– Brendan Behan
● “Most open-source software – and Open SSL is
no exception – is produced voluntarily by
people who are not paid for creating it. They
do it for love, professional pride or as a way of
demonstrating technical virtuosity. And mostly
they do it in their spare time.”
– John Naughton The Observer/The Guardian
'Heartbleed' bug can't be simply blamed on coders,
April 13, 2014
“Responsible corporate use of open-source
software should therefore involve some
measure of reciprocity: a corporation that
benefits hugely from such software ought to
put something back, either in the form of
financial support for a particular open-source
project, or – better still – by
encouraging its own software people to
contribute to the project.”
“Much of the invisible backbone of
websites from Google to Amazon to the
Federal Bureau of Investigation was built
by volunteer programmers in what is
known as the open-source community.”
“... volunteers, connected over the
Internet, work together to build free
software, to maintain and improve it and
to look for bugs. Ideally, they check one
another’s work in a peer review system
similar to that found in science.”
Linus's Law:
“Given Enough Eyeballs, all Bugs are
Shallow”
Eric Raymond, The Cathedral and the Bazaar
In the case of Heartbleed
“There weren't enough eyeballs”
- Eric Raymond
● Code was created by a grad student
● Reviewed by S. Henson, core developer
of OpenSSL
● Included in OpenSSL in the Spring 2011
● Not discovered for 3 years!
Budget of OpenSSL:
– US$2,000 for 2013
the OpenSSL problem
● important infrastructure projects that are
run by small teams of volunteers
● on April 24, the Linux Foundation
announces the “Core Infrastructure
Initiative” to address it
Core Infrastructure Initiative
● Funded by:
– Amazon, Cisco, Dell, Facebook, Fujitsu, Google,
IBM, Intel, Microsoft, NetApp,
Rackspace, Qualcomm, VMware and The Linux
Foundation
● Funding to core projects:
– Fellowships to core developers
– as well as other resources to assist the project in
improving its security, enabling outside reviews,
and improving responsiveness to patch requests.
What is FOSS development?
● Most important feature of FOSS
– its free or open source license
● License
– Guarantees code is available to others to
reuse
– Becomes a social contract
among participants
What is OSS development?
● Most frequently defined as:
– Self-organized teams developing software
without a central authority
● Code is open for review
– and reuse!!!
● Anybody can participate
What makes OSS development
possible?
● Teams of self-organized developers
and contributors
● The Internet
● A common toolkit
● Version control systems
Teams
● Come from all sectors:
– Professionals and hobbyists
– Paid and volunteers
– Novices and experienced
– High-school students to PhDs
– All over the world!!!
● Highly motivated!
Common Toolkit
● To be able to collaborate you need a common set of tools
– Programming languages
● gcc, perl, python, java, ruby, lua, php...
– Editors and IDEs
● Emacs, vim, Eclipse, Netbeans...
– Libraries
● boost, maven, cpan, PyPI...
– Infrastructure
● Make, ant, cmake, bugzilla, etc.
– Hosting infrastructure
● Sourceforge, Google Code, github, bitbucket
● They must be available at zero cost to anybody
FOSS Toolkit
● I posit that one of the biggest influences
of FOSS on the practice of Software
Development is the wide use of FOSS
tools for the development of software
– Most implementations of popular
programming languages today are open
source
– FOSS Editors and IDEs are
widely used too
Free Software Foundation
● The FSF had to bootstrap the development
of the OSS toolkit
– To build an Operating System you need a
compiler
– To build a compiler you need an
editor, but you need a compiler to build an
editor
– gcc, emacs, coreutils (ls, echo, cat, etc.), etc.
Richard Stallman
Created the legal and technical
infrastructure for Free and Open Source
software
on Code Reviews
Need for Code Reviews
● Many FOSS teams discovered that to
ship good quality software they needed
to review the source code
Fagan Code Inspections
● Code reviews performed at specific stages of
development
Effective, but not widely used
Open Source style Code Reviews
● Fagan inspections were unfeasible
– Required participants to be in the same room
● Instead, code reviews started to be
incremental
– Rather than reviewing the whole, review the
delta (the patch)
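To make "review the delta" concrete, here is a minimal sketch that produces a patch from two versions of a file using Python's standard `difflib`. The file name `example.c` and the two snippets are made up for illustration; a FOSS reviewer would see only the few changed lines, not the whole file.

```python
import difflib


def make_patch(old_text: str, new_text: str, path: str = "example.c") -> str:
    """Return a unified diff (the delta) between two versions of one file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=f"a/{path}", tofile=f"b/{path}"
    )
    return "".join(diff)


# Hypothetical before/after versions of a tiny function.
OLD = "int add(int a, int b) {\n    return a - b;\n}\n"
NEW = "int add(int a, int b) {\n    return a + b;\n}\n"

PATCH = make_patch(OLD, NEW)

if __name__ == "__main__":
    print(PATCH)
```

The output contains one removed line and one added line plus a little context, which is the unit of review that the incremental style relies on.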
Code Reviews in FOSS
the spectrum of Code Reviews
code reviews in FOSS
(1) early, frequent reviews
(2) of small, independent, complete
contributions
(3) that are broadcast to a large group of
stakeholders, but only reviewed by a small
set of self-selected experts
(4) resulting in an efficient and effective peer
review technique.
- Peter Rigby
Lessons from FOSS
on Version Control systems
Version Control Systems
● At the beginning, FOSS used tar files in
USENET
– the FSF would ship physical tapes!
● Today, version control systems are the norm
– Centralized or Distributed
● FOSS has a continuous and proven track record of
innovation in version control systems
– FOSS democratized VC
On Version Control
● The VC is the
circulatory system of
software development
● It brings the code to all
stakeholders
● A contribution is a patch
– one or more commits
the patch
● the patch should be reviewed
● most VCs don't support reviewing of
patches
the patch and its review
● Two models:
– Commit Then Review (CTR)
● Review the code after it has been integrated
or
– Review Then Commit (RTC)
● Review the patch before it is integrated
Linux
● Linux incorporated RTC early in its
process
● Linus needed integration of the review
process with VC
● No FOSS VC did it
– he turned to Bitkeeper
Bitkeeper and Linux
● Symbiotic relationship
– Free (as in beer) licenses to Linux developers with one
big condition
● Users should not develop competing tools
– Bitkeeper rapidly improved the Linux integration process
● simplified integration of reviewed code
– Bitkeeper was probably influenced by Linus's workflow
– in 2005 Bitkeeper revoked its license to Linux
developers
Git
● Many other distributed version control
systems before it
● What makes it special?
– Many features, but especially:
● Pull-requests
● git incorporates the code review process into a
distributed version control system
– Even via email patches
How is distributed version control
software being used?
Git
● Software engineers are moving towards
git
– And other DVCs
● Github a major reason
The Promise of Git
From: http://thkoch2001.github.io/whygitisbetter/
Challenge 1
● Personal repos are beyond reach
● Local commits might never be observable
“History is written by the
victors”
Challenge 2: History
Rebasing changes history
Save history before it is lost!
Super-repository
● Collection of repositories cloned
(recursively) from the same repo
– At least one per developer
● On their personal computer
– At least one public repository
● The blessed
– In git, no way to trace them
Moving commits across the superRepo
Method   Description
Push     Done at source; needs write access to destination
Pull     Done at destination; needs read access to source
Email    Source creates a patch and mails it; recipient applies it
Ecosystem of Repos
Can we learn from Linux?
Life of a Patch in Linux
Continuous Mining of Linux
● Linux has no centralized logging
– Nobody really knows what the superRepo is
– Commits flow without any event
broadcasting mechanism
● How do we find the activity?
– Repos
– Commits
Semiautomatic Process
● Every 3 hrs, ask every repo
– What new commits do you have?
– What commits did you delete?
– Automatically resolve propagations
● Commits might propagate before we scan
● Daily:
– Are commits in repo by unknown committers?
● Answer:
– is there a new repo? or is committer new to repo?
Implementation
● Running since Nov. 2011
– Currently scans 650 repos every 3 hrs
– Retrieved
● 2.3 million commits (compared to 400k in Linus's
repo)
● 109 million records in propagation table
<commit-id, added|deleted, repo, when>
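The periodic scan can be sketched as a set difference between what a repo contained at the previous scan and what it contains now, emitting one `<commit-id, added|deleted, repo, when>` record per change. This is an illustration only, not the tool used in the study: the names `Propagation`, `diff_scan`, and `scan_all` are hypothetical, and the commit sets are assumed to have been collected already (for example with something like `git rev-list --all` on each repo).

```python
from datetime import datetime, timezone
from typing import Dict, List, NamedTuple, Set


class Propagation(NamedTuple):
    commit_id: str
    event: str      # "added" or "deleted"
    repo: str
    when: datetime


def diff_scan(repo: str, previous: Set[str], current: Set[str],
              when: datetime) -> List[Propagation]:
    """Compare two scans of one repo and emit propagation records."""
    added = [Propagation(c, "added", repo, when)
             for c in sorted(current - previous)]
    deleted = [Propagation(c, "deleted", repo, when)
               for c in sorted(previous - current)]
    return added + deleted


def scan_all(known: Dict[str, Set[str]], observed: Dict[str, Set[str]],
             when: datetime) -> List[Propagation]:
    """Run one scan over every observed repo and update the known state."""
    records: List[Propagation] = []
    for repo in sorted(observed):
        records.extend(diff_scan(repo, known.get(repo, set()),
                                 observed[repo], when))
        known[repo] = set(observed[repo])
    return records


# Hypothetical state: commit "c3" moves from net-next to linus between scans.
KNOWN = {"linus": {"a1", "b2"}, "net-next": {"a1", "c3"}}
OBSERVED = {"linus": {"a1", "b2", "c3"}, "net-next": {"a1", "d4"}}
RECORDS = scan_all(KNOWN, OBSERVED, datetime(2014, 1, 1, tzinfo=timezone.utc))

if __name__ == "__main__":
    for record in RECORDS:
        print(record)
```

Because a commit can appear in one repo and vanish from another between two scans, the records alone do not say which repo it travelled through; that is the "commits might propagate before we scan" caveat.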
                                  Snapshot (Linus)   Continuous
Number of repos                   1                  479
Commits                           64k                533k
Non-merge commits                 59k                485k
Unique non-merges                 58k                135k
% unique non-merges               98.9%              27.9%
Non-merges that reached blessed   -                  43.1%
Different author emails           3434               5646
Different authors                 2883               4575
Different committer emails        283                1185
Different committers              245                1058
Commit vs Patches
● Commit ids are insufficient to track patches
● Large amount of work not reaching blessed
Arrival of Commits at Blessed
Arrival of Commits at Blessed...
● We can classify patches as a new feature
or bug-fix
The Latency
Time of Authorship Time of Commit
The Repos
Path to Linus
● Large ecosystem of
repositories
– Producers
– Consumers
Contributors vs Consumers
Linux Dashboard
● We asked two Linux maintainers:
– Can this info be useful?
● Answer:
– “Yes”
… but not for what we expected...
Tracking commits in Linux
● Need to track patches, not commits
– Particularly important in consumer
repositories
– Need to cross-reference commits
● What commits contain the same patch?
– Some repos track commits from blessed via
cherry-picking
● Commit ids are useless
● So they annotate the log with the origin commit id
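One way to cross-reference commits that carry the same patch is to fingerprint the change itself rather than the commit. The sketch below hashes only the added and removed lines of a unified diff, ignoring whitespace, context lines, and hunk offsets. It is similar in spirit to `git patch-id` but is not the same algorithm, and it is not necessarily what the study used; `patch_fingerprint` and `group_by_patch` are hypothetical names.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def patch_fingerprint(diff_text: str) -> str:
    """Hash the added/removed lines of a unified diff, ignoring whitespace."""
    digest = hashlib.sha1()
    for line in diff_text.splitlines():
        # Skip file headers. Simplification: a removed line whose content
        # starts with "-- " would also be skipped by this check.
        if line.startswith(("--- ", "+++ ")):
            continue
        if line.startswith(("+", "-")):
            normalized = line[0] + "".join(line[1:].split())
            digest.update(normalized.encode("utf-8"))
            digest.update(b"\n")
    return digest.hexdigest()


def group_by_patch(diffs: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each fingerprint to the commit ids whose diffs share it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for commit_id, diff_text in diffs.items():
        groups[patch_fingerprint(diff_text)].append(commit_id)
    return dict(groups)


# Hypothetical commits: c1 and c2 are the same change at different offsets
# (as after a rebase or cherry-pick); c3 is a different change.
DIFFS = {
    "c1": "--- a/f.c\n+++ b/f.c\n@@ -10,3 +10,3 @@\n ctx\n-return a - b;\n+return a + b;\n",
    "c2": "--- a/f.c\n+++ b/f.c\n@@ -42,3 +42,3 @@\n other\n-return a - b;\n+return a  +  b;\n",
    "c3": "--- a/f.c\n+++ b/f.c\n@@ -10,3 +10,3 @@\n ctx\n-return a - b;\n+return a * b;\n",
}
GROUPS = group_by_patch(DIFFS)

if __name__ == "__main__":
    for fingerprint, commits in GROUPS.items():
        print(fingerprint[:12], commits)
```

Rebasing and cherry-picking create new commit ids for the same change, so grouping by a content fingerprint recovers links that commit ids cannot.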
Linux Commits Dashboard
● Where is my commit?
– My original commit, has it reached Linus?
● What was merged?
– What commits were merged at once by Linus?
● What commits are related to this one?
– Same patch
● Rebasing
● Cherry picking
– Mentioned in a commit
● This commit fixes bug introduced in X
● This commit reverts commit X
● http://o.cs.uvic.ca:20810/perl/cid.pl?cid=70cb8bb0d365f0bc8b20fa67347caf9598a4674e
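The "mentioned in a commit" relations can be approximated by scanning commit messages for well-known phrasings: `git revert` writes "This reverts commit <id>", `git cherry-pick -x` appends "(cherry picked from commit <id>)", and Linux commits often carry a "Fixes: <id>" tag. The sketch below is an assumption about how such links could be extracted, not the dashboard's actual code; the commit ids in the example message are invented.

```python
import re
from typing import List, Tuple

SHA = r"[0-9a-f]{7,40}"

# Phrasings produced by git itself or by Linux kernel convention.
PATTERNS = {
    "cherry-pick": re.compile(rf"\(cherry picked from commit ({SHA})\)"),
    "revert": re.compile(rf"This reverts commit ({SHA})"),
    "fixes": re.compile(rf"Fixes: ({SHA})"),
}


def find_references(message: str) -> List[Tuple[str, str]]:
    """Return (relation, commit id) pairs mentioned in a commit message."""
    refs = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(message):
            refs.append((kind, match.group(1)))
    return refs


# Hypothetical commit message with made-up commit ids.
MESSAGE = (
    'Revert "net: fix foo"\n\n'
    "This reverts commit 1111111111111111111111111111111111111111.\n\n"
    'Fixes: 2222222222ab ("net: fix foo")\n'
    "(cherry picked from commit 3333333abcdef)\n"
)
REFS = find_references(MESSAGE)

if __name__ == "__main__":
    for kind, commit_id in REFS:
        print(kind, commit_id)
```

Free-form mentions ("fixes the bug introduced in X") would need looser patterns and would produce more false positives.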
Researcher states:
“40% of pull requests are not merged”
● Based on simply querying GHTorrent data
● But it ignores what really happens
● Many pull requests are merged without being
marked as merged in github
● GHTorrent data has many potential threats to
validity
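To see how much the "marked as merged" flag can undercount, here is a rough sketch that treats a pull request as merged if it is flagged, if any of its commits appears in the base branch, or if a base-branch commit message closes it. These three heuristics, the `PullRequest` record, and the sample data are assumptions made for illustration; they are not the method behind the figures on these slides.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PullRequest:
    number: int
    marked_merged: bool
    commit_ids: List[str]


def looks_merged(pr: PullRequest, base_commits: Dict[str, str]) -> bool:
    """base_commits maps commit id -> commit message on the base branch."""
    if pr.marked_merged:
        return True
    # Heuristic 1: one of the PR's commits is already in the base branch.
    if any(commit_id in base_commits for commit_id in pr.commit_ids):
        return True
    # Heuristic 2: a base-branch commit message closes this PR by number.
    closes = re.compile(
        rf"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#{pr.number}\b",
        re.IGNORECASE,
    )
    return any(closes.search(message) for message in base_commits.values())


def merged_fraction(prs: List[PullRequest], base_commits: Dict[str, str],
                    use_heuristics: bool) -> float:
    """Fraction of PRs counted as merged, with or without the heuristics."""
    if use_heuristics:
        merged = sum(looks_merged(pr, base_commits) for pr in prs)
    else:
        merged = sum(pr.marked_merged for pr in prs)
    return merged / len(prs)


# Hypothetical base branch and pull requests.
BASE = {
    "aaa": "Add feature",
    "bbb": "Apply patch by hand, closes #7",
    "ccc": "Unrelated cleanup",
}
PRS = [
    PullRequest(1, True, ["zzz"]),
    PullRequest(2, False, ["aaa"]),
    PullRequest(7, False, ["yyy"]),
    PullRequest(70, False, ["xxx"]),
]

if __name__ == "__main__":
    print(merged_fraction(PRS, BASE, use_heuristics=False))
    print(merged_fraction(PRS, BASE, use_heuristics=True))
```

In this toy data the flag alone reports a quarter of the pull requests as merged, while the heuristics find three quarters; the real gap depends on each project's workflow.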
What is github used for?
"I store my presentations in github. I don't
need a USB stick anymore!"
Are there potential threats to validity for studies
that assume github is about software engineering
only?
Methodology
● Data sources:
– Surveys
– Sampling of repositories
● Mixed methods:
– Quantitative, and
– Qualitative
I. A repository is not necessarily a project
II. Most projects have few commits
III. Most projects are inactive
IV. A large proportion of repositories are not for software
engineering
V. More than two thirds of projects are personal
VI. Only a fraction of repos use pull requests
VII. If the commits in a pull-request are reworked, github only
records the resulting patch
VIII. Most pull-requests appear as non-merged, even though
they were merged
IX. Many active projects do not conduct all their software
development activity in github
Uses:
Most projects are inactive
Social?
67% of projects are personal repos
95% have 3 or fewer committers
Self contained?
“Any serious project would have to have some
separate infrastructure - mailing lists, forums, irc
channels and their archives, build farms, etc. [...]
Thus while GitHub and all other project hosts are
used for collaboration, they are not and can not
be a complete solution.”
Others are already using github's
information to reach conclusions!
the open source report card
http://osrc.dfm.io/dmgerman/
how are github users collaborating?
How does github support
collaboration?
● Methodology:
– Survey
● 240 responses (24% response rate)
– Interviews
● 35 interviews from survey respondents
– 71% professional developers
– 11% managers
– 9% students
– 9% interns
● Approximately 1hr each
Survey: why do you use github?
Code centric collaboration
Themes: focus
● Simple tools
– git branching/merging
– github features seem to be enough for most
● Pull requests and issue tracking
● Focused interaction
– code-centric, focused communication
– asynchronous and unobtrusive
Focus: independence
● Decentralized work:
– git allows them to work independently
– yet they have visibility of what others do
● Low need for management:
– Need for a clear process (the workflow)
– They shy away from rigid management and team
structure
– Team managers recognize this
– Managers should be educated on using git/github
Focus: Exposure
● Easy contribution process
– Fork and potentially contribute without
pre-authorization
● Peer pressure
– Developers are conscious that their code is
readily visible to others
– Adoption of small, frequent contributions
OSS mentality
● At the operational level
– the nature of the work allows independence and
self-organization.
– developers are familiar with the idea of working this way and
share the mentality behind it.
● developers are self-driven
● share the mentality of
– self-organizing,
– minimizing communication and coordination needs,
– having ownership of code, and
– operating on a meritocratic, expertise-based model
The github ecosystem
The Github Ecosystem
● github is creating an ecosystem of
proprietary, cloud enabled applications
for software development teams
– Service integration
– JSON API
● Asana, Campfire, Lighthouse, Jira, Travis,
Trello, etc.
Conclusions
● git and github are promoting the use of the pull-request
workflow
– small, independent contributions
– that can be reviewed before integration
● Effectively, adopting open source code practices
into their development
– Independent work
– Code reviews of contributions before they are
integrated

Weitere ähnliche Inhalte

Was ist angesagt?

Lcu14 312-Introduction to the Ecosystem day
Lcu14 312-Introduction to the Ecosystem day Lcu14 312-Introduction to the Ecosystem day
Lcu14 312-Introduction to the Ecosystem day Linaro
 
LCA13: Upstreaming 101
LCA13: Upstreaming 101LCA13: Upstreaming 101
LCA13: Upstreaming 101Linaro
 
How do Centralized and Distributed Version Control Systems Impact Software Ch...
How do Centralized and Distributed Version Control Systems Impact Software Ch...How do Centralized and Distributed Version Control Systems Impact Software Ch...
How do Centralized and Distributed Version Control Systems Impact Software Ch...Caius Brindescu
 
Introduction to License Compliance and My research (D. German)
Introduction to License Compliance and My research (D. German)Introduction to License Compliance and My research (D. German)
Introduction to License Compliance and My research (D. German)dmgerman
 
LCA13: Why I Don't Want Your Code
LCA13: Why I Don't Want Your CodeLCA13: Why I Don't Want Your Code
LCA13: Why I Don't Want Your CodeLinaro
 
ESE 2010: Using Git in Eclipse
ESE 2010: Using Git in EclipseESE 2010: Using Git in Eclipse
ESE 2010: Using Git in EclipseChris Aniszczyk
 
Version Control Systems -- Git -- Part I
Version Control Systems -- Git -- Part IVersion Control Systems -- Git -- Part I
Version Control Systems -- Git -- Part ISergey Aganezov
 
Git Tutorial
Git Tutorial Git Tutorial
Git Tutorial Ahmed Taha
 
12 tricks to avoid hackers breaks your CI / CD
12 tricks to avoid hackers breaks your  CI / CD12 tricks to avoid hackers breaks your  CI / CD
12 tricks to avoid hackers breaks your CI / CDDaniel Garcia (a.k.a cr0hn)
 
Effective Git with Eclipse
Effective Git with EclipseEffective Git with Eclipse
Effective Git with EclipseChris Aniszczyk
 
Git in gear: How to track changes, travel back in time, and code nicely with ...
Git in gear: How to track changes, travel back in time, and code nicely with ...Git in gear: How to track changes, travel back in time, and code nicely with ...
Git in gear: How to track changes, travel back in time, and code nicely with ...fureigh
 
Rooted con 2020 - from the heaven to hell in the CI - CD
Rooted con 2020 - from the heaven to hell in the CI - CDRooted con 2020 - from the heaven to hell in the CI - CD
Rooted con 2020 - from the heaven to hell in the CI - CDDaniel Garcia (a.k.a cr0hn)
 
FTP Commando to Git Hero - WordCamp Denver 2013
FTP Commando to Git Hero - WordCamp Denver 2013FTP Commando to Git Hero - WordCamp Denver 2013
FTP Commando to Git Hero - WordCamp Denver 2013Jeremy Green
 
EclipseCon 2010 talk: Towards contributors heaven
EclipseCon 2010 talk: Towards contributors heavenEclipseCon 2010 talk: Towards contributors heaven
EclipseCon 2010 talk: Towards contributors heavenmsohn
 
Using Git Inside Eclipse, Pushing/Cloning from GitHub
Using Git Inside Eclipse, Pushing/Cloning from GitHubUsing Git Inside Eclipse, Pushing/Cloning from GitHub
Using Git Inside Eclipse, Pushing/Cloning from GitHubAboutHydrology Slides
 
JenkinsPy workshop
JenkinsPy workshop JenkinsPy workshop
JenkinsPy workshop Haifa Ftirich
 
EclipseCon 2010 tutorial: Understanding git at Eclipse
EclipseCon 2010 tutorial: Understanding git at EclipseEclipseCon 2010 tutorial: Understanding git at Eclipse
EclipseCon 2010 tutorial: Understanding git at Eclipsemsohn
 
Chapter 8 security tools ii
Chapter 8   security tools iiChapter 8   security tools ii
Chapter 8 security tools iiSyaiful Ahdan
 
Linker namespace upload
Linker namespace   uploadLinker namespace   upload
Linker namespace uploadBin Yang
 
Understanding and Using Git at Eclipse
Understanding and Using Git at EclipseUnderstanding and Using Git at Eclipse
Understanding and Using Git at EclipseChris Aniszczyk
 

Was ist angesagt? (20)

Lcu14 312-Introduction to the Ecosystem day
Lcu14 312-Introduction to the Ecosystem day Lcu14 312-Introduction to the Ecosystem day
Lcu14 312-Introduction to the Ecosystem day
 
LCA13: Upstreaming 101
LCA13: Upstreaming 101LCA13: Upstreaming 101
LCA13: Upstreaming 101
 
How do Centralized and Distributed Version Control Systems Impact Software Ch...
How do Centralized and Distributed Version Control Systems Impact Software Ch...How do Centralized and Distributed Version Control Systems Impact Software Ch...
How do Centralized and Distributed Version Control Systems Impact Software Ch...
 
Introduction to License Compliance and My research (D. German)
Introduction to License Compliance and My research (D. German)Introduction to License Compliance and My research (D. German)
Introduction to License Compliance and My research (D. German)
 
LCA13: Why I Don't Want Your Code
LCA13: Why I Don't Want Your CodeLCA13: Why I Don't Want Your Code
LCA13: Why I Don't Want Your Code
 
ESE 2010: Using Git in Eclipse
ESE 2010: Using Git in EclipseESE 2010: Using Git in Eclipse
ESE 2010: Using Git in Eclipse
 
Version Control Systems -- Git -- Part I
Version Control Systems -- Git -- Part IVersion Control Systems -- Git -- Part I
Version Control Systems -- Git -- Part I
 
Git Tutorial
Git Tutorial Git Tutorial
Git Tutorial
 
12 tricks to avoid hackers breaks your CI / CD
12 tricks to avoid hackers breaks your  CI / CD12 tricks to avoid hackers breaks your  CI / CD
12 tricks to avoid hackers breaks your CI / CD
 
Effective Git with Eclipse
Effective Git with EclipseEffective Git with Eclipse
Effective Git with Eclipse
 
Git in gear: How to track changes, travel back in time, and code nicely with ...
Git in gear: How to track changes, travel back in time, and code nicely with ...Git in gear: How to track changes, travel back in time, and code nicely with ...
Git in gear: How to track changes, travel back in time, and code nicely with ...
 
Rooted con 2020 - from the heaven to hell in the CI - CD
Rooted con 2020 - from the heaven to hell in the CI - CDRooted con 2020 - from the heaven to hell in the CI - CD
Rooted con 2020 - from the heaven to hell in the CI - CD
 
FTP Commando to Git Hero - WordCamp Denver 2013
FTP Commando to Git Hero - WordCamp Denver 2013FTP Commando to Git Hero - WordCamp Denver 2013
FTP Commando to Git Hero - WordCamp Denver 2013
 
EclipseCon 2010 talk: Towards contributors heaven
EclipseCon 2010 talk: Towards contributors heavenEclipseCon 2010 talk: Towards contributors heaven
EclipseCon 2010 talk: Towards contributors heaven
 
Using Git Inside Eclipse, Pushing/Cloning from GitHub
Using Git Inside Eclipse, Pushing/Cloning from GitHubUsing Git Inside Eclipse, Pushing/Cloning from GitHub
Using Git Inside Eclipse, Pushing/Cloning from GitHub
 
JenkinsPy workshop
JenkinsPy workshop JenkinsPy workshop
JenkinsPy workshop
 
EclipseCon 2010 tutorial: Understanding git at Eclipse
EclipseCon 2010 tutorial: Understanding git at EclipseEclipseCon 2010 tutorial: Understanding git at Eclipse
EclipseCon 2010 tutorial: Understanding git at Eclipse
 
Chapter 8 security tools ii
Chapter 8   security tools iiChapter 8   security tools ii
Chapter 8 security tools ii
 
Linker namespace upload
Linker namespace   uploadLinker namespace   upload
Linker namespace upload
 
Understanding and Using Git at Eclipse
Understanding and Using Git at EclipseUnderstanding and Using Git at Eclipse
Understanding and Using Git at Eclipse
 

Andere mochten auch

Git/GitHub
Git/GitHubGit/GitHub
Git/GitHubMicrosoft
 
Git pavel grushetsky
Git pavel grushetskyGit pavel grushetsky
Git pavel grushetskyInna Kravchenko
 
A Business Case for Git - Tim Pettersen
A Business Case for Git - Tim PettersenA Business Case for Git - Tim Pettersen
A Business Case for Git - Tim PettersenAtlassian
 
Code Management Workshop
Code Management WorkshopCode Management Workshop
Code Management WorkshopSameh El-Ashry
 
Becoming a Git Master
Becoming a Git MasterBecoming a Git Master
Becoming a Git MasterNicola Paolucci
 
Git case of the week4212.
Git case of the week4212.Git case of the week4212.
Git case of the week4212.Shaikhani.
 
Subversion to Git Migration
Subversion to Git MigrationSubversion to Git Migration
Subversion to Git MigrationManish Chakravarty
 
Becoming a Git Master - Nicola Paolucci
Becoming a Git Master - Nicola PaolucciBecoming a Git Master - Nicola Paolucci
Becoming a Git Master - Nicola PaolucciAtlassian
 
SCM case study of Marico
SCM case study of MaricoSCM case study of Marico
SCM case study of MaricoAbhinandan Mohanty
 
Git Cards - Powerpoint Format
Git Cards - Powerpoint FormatGit Cards - Powerpoint Format
Git Cards - Powerpoint FormatAdam Lowe
 
Git from SVN
Git from SVNGit from SVN
Git from SVNJustin Yoo
 

Andere mochten auch (13)

Git representation
Git representationGit representation
Git representation
 
Git/GitHub
Git/GitHubGit/GitHub
Git/GitHub
 
Git pavel grushetsky
Git pavel grushetskyGit pavel grushetsky
Git pavel grushetsky
 
A Business Case for Git - Tim Pettersen
A Business Case for Git - Tim PettersenA Business Case for Git - Tim Pettersen
A Business Case for Git - Tim Pettersen
 
Code Management Workshop
Code Management WorkshopCode Management Workshop
Code Management Workshop
 
Git. Transition.
Git. Transition.Git. Transition.
Git. Transition.
 
Becoming a Git Master
Becoming a Git MasterBecoming a Git Master
Becoming a Git Master
 
Git case of the week4212.
Git case of the week4212.Git case of the week4212.
Git case of the week4212.
 
Subversion to Git Migration
Subversion to Git MigrationSubversion to Git Migration
Subversion to Git Migration
 
Becoming a Git Master - Nicola Paolucci
Becoming a Git Master - Nicola PaolucciBecoming a Git Master - Nicola Paolucci
Becoming a Git Master - Nicola Paolucci
 
SCM case study of Marico
SCM case study of MaricoSCM case study of Marico
SCM case study of Marico
 
Git Cards - Powerpoint Format
Git Cards - Powerpoint FormatGit Cards - Powerpoint Format
Git Cards - Powerpoint Format
 
Git from SVN
Git from SVNGit from SVN
Git from SVN
 

Ă„hnlich wie The adoption of FOSS workfows in commercial software development: the case of git and github

Intro to open source - 101 presentation
Intro to open source - 101 presentationIntro to open source - 101 presentation
Intro to open source - 101 presentationJavier Perez
 
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...WSO2
 
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...sparkfabrik
 
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...Lounge47
 
Hackaton for health 2015 - Sharing the Code we Make
Hackaton for health 2015 - Sharing the Code we MakeHackaton for health 2015 - Sharing the Code we Make
Hackaton for health 2015 - Sharing the Code we Makeesben1962
 
Making software development processes to work for you
Making software development processes to work for youMaking software development processes to work for you
Making software development processes to work for youAmbientia
 
Primeros pasos del Software Libre en infraestructura civil Civil Infrastructu...
Primeros pasos del Software Libre en infraestructura civil Civil Infrastructu...Primeros pasos del Software Libre en infraestructura civil Civil Infrastructu...
Primeros pasos del Software Libre en infraestructura civil Civil Infrastructu...Agustin Benito Bethencourt
 
OpenChain Webinar #58 - FOSS License Management through aliens4friends in Ecl...
OpenChain Webinar #58 - FOSS License Management through aliens4friends in Ecl...OpenChain Webinar #58 - FOSS License Management through aliens4friends in Ecl...
OpenChain Webinar #58 - FOSS License Management through aliens4friends in Ecl...Shane Coughlan
 
Long Term Support the Eclipse Way
Long Term Support the Eclipse WayLong Term Support the Eclipse Way
Long Term Support the Eclipse WayRalph Mueller
 
Open Source Governance v2.5
Open Source Governance v2.5Open Source Governance v2.5
Open Source Governance v2.5Inria
 
Best practices for using open source software in the enterprise
Best practices for using open source software in the enterpriseBest practices for using open source software in the enterprise
Best practices for using open source software in the enterpriseMarcel de Vries
 
The Evolving Role of Build Engineering in Managing Open Source
The Evolving Role of Build Engineering in Managing Open SourceThe Evolving Role of Build Engineering in Managing Open Source
The Evolving Role of Build Engineering in Managing Open SourceDevOps.com
 
All You need to Know about Secure Coding with Open Source Software
All You need to Know about Secure Coding with Open Source SoftwareAll You need to Know about Secure Coding with Open Source Software
All You need to Know about Secure Coding with Open Source SoftwareJavier Perez
 
Version Control, Writers, and Workflows
Version Control, Writers, and WorkflowsVersion Control, Writers, and Workflows
Version Control, Writers, and Workflowsstc-siliconvalley
 
SFO15-TR1: The Philosophy of Open Source Development
SFO15-TR1: The Philosophy of Open Source DevelopmentSFO15-TR1: The Philosophy of Open Source Development
SFO15-TR1: The Philosophy of Open Source DevelopmentLinaro
 
Research software identification - Catherine Jones
Research software identification - Catherine JonesResearch software identification - Catherine Jones
Research software identification - Catherine JonesJisc RDM
 
Case study
Case studyCase study
Case studykaran saini
 
Start your open source project
Start your open source projectStart your open source project
Start your open source projectAhmed Othman
 

Ă„hnlich wie The adoption of FOSS workfows in commercial software development: the case of git and github (20)

Tracing the evolution - Open source & Embedded systems
Tracing the evolution - Open source & Embedded systemsTracing the evolution - Open source & Embedded systems
Tracing the evolution - Open source & Embedded systems
 
Intro to open source - 101 presentation
Intro to open source - 101 presentationIntro to open source - 101 presentation
Intro to open source - 101 presentation
 
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...
[Workshop] Building an Integration Agile Digital Enterprise with Open Source ...
 
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...
Drupal Dev Days Vienna 2023 - What is the secure software supply chain and th...
 
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...
Tracing The Evolution Open Source & Embedded Systems - Mr. Jayakumar Balasubr...
 
Hackaton for health 2015 - Sharing the Code we Make
Hackaton for health 2015 - Sharing the Code we MakeHackaton for health 2015 - Sharing the Code we Make
Hackaton for health 2015 - Sharing the Code we Make
 
Making software development processes to work for you
Making software development processes to work for youMaking software development processes to work for you
Making software development processes to work for you
 
The adoption of FOSS workflows in commercial software development: the case of git and github

  • 1. The adoption of FOSS workflows in commercial software development: the case of git and github Daniel M German University of Victoria Canada
  • 2.
  • 3. Open Source is everywhere
  • 4. On SSL and Heartbleed “[Heartbleed] is a software flaw that has left up to two-thirds of the world’s websites vulnerable to attack by hackers.” – The Economist
  • 5. “There is no such thing as bad publicity except your own obituary.” – Brendan Behan
  • 6. “Most open-source software – and Open SSL is no exception – is produced voluntarily by people who are not paid for creating it. They do it for love, professional pride or as a way of demonstrating technical virtuosity. And mostly they do it in their spare time.” – John Naughton, The Observer/The Guardian, 'Heartbleed' bug can't be simply blamed on coders, April 13, 2014
  • 7. “Responsible corporate use of open-source software should therefore involve some measure of reciprocity: a corporation that benefits hugely from such software ought to put something back, either in the form of financial support for a particular open-source project, or – better still – by encouraging its own software people to contribute to the project.”
  • 8. “Much of the invisible backbone of websites from Google to Amazon to the Federal Bureau of Investigation was built by volunteer programmers in what is known as the open-source community.”
  • 9. “... volunteers, connected over the Internet, work together to build free software, to maintain and improve it and to look for bugs. Ideally, they check one another’s work in a peer review system similar to that found in science.”
  • 10. Linus's Law: “Given Enough Eyeballs, all Bugs are Shallow” Eric Raymond, The Cathedral and the Bazaar
  • 11. In the case of Heartbleed “There weren't enough eyeballs” - Eric Raymond
  • 12. ● Code was created by a grad student ● Reviewed by S. Henson, core developer of OpenSSL ● Included in OpenSSL in the Spring 2011 ● Not discovered for 3 years! Budget of OpenSSL: – US$2,000 for 2013
  • 13. the OpenSSL problem ● important infrastructure projects that are run by small teams of volunteers ● on April 24, the Linux Foundation announces the “Core Infrastructure Initiative” to address it
  • 14. Core Infrastructure Initiative ● Funded by: – Amazon, Cisco, Dell, Facebook, Fujitsu, Google, IBM, Intel, Microsoft, NetApp, Rackspace, Qualcomm, VMware and The Linux Foundation ● Funding to core projects: – Fellowships to core developers – as well as other resources to assist the project in improving its security, enabling outside reviews, and improving responsiveness to patch requests.
  • 15. What is FOSS development? ● Most important feature of FOSS – its free or open source license ● License – Guarantees code is available to others to reuse – Becomes a social contract among participants
  • 16. What is OSS development? ● Most frequently defined as: – Self organized teams developing software without a central authority ● Code is open for review – and reuse!!! ● Anybody can participate
  • 17. What makes OSS development possible? ● Teams of self-organized developers and contributors ● The Internet ● A common toolkit ● Version control systems
  • 18. Teams ● Come from all sectors: – Professionals and hobbyists – Paid and volunteers – Novices and Experienced – High-school students to PhDs – All over the world!!! ● Highly motivated!
  • 19. Common Toolkit ● To be able to collaborate you need a common set of tools – Programming languages ● gcc, perl, python, java, ruby, lua, php... – Editors and IDEs ● Emacs, vim, Eclipse, Netbeans... – Libraries ● boost, maven, cpan, Pypi... – Infrastructure ● Make, ant, cmake, bugzilla, etc. – Hosting infrastructure ● Sourceforge, Google Code, github, bitbucket ● They must be available at zero cost to anybody
  • 20. FOSS Toolkit ● I posit that one of the biggest influences of FOSS on the practice of Software Development is the wide use of FOSS tools for the development of software – Most implementations of popular programming languages today are open source – FOSS Editors and IDEs are widely used too
  • 21. Free Software Foundation ● The FSF had to bootstrap the development of the OSS toolkit – To build an Operating System you need a compiler – To build a compiler you need an editor, but to build an editor you need a compiler – gcc, emacs, bintools (ls, echo, cat, etc.), etc
  • 22. Richard Stallman Created the legal and technical infrastructure for Free and Open Source software
  • 24. Need for Code Reviews ● Many FOSS teams discovered that to ship good quality software they needed to review the source code
  • 25. Fagan Code Inspections ● Code reviews performed at specific stages of development ● Effective, but not widely used
  • 26. Open Source style Code Reviews ● Fagan inspections were unfeasible – Required participants to be in the same room ● Instead, code reviews started to be incremental – Rather than reviewing the whole, review the delta (the patch)
  • 28. the spectrum of Code Reviews
  • 29. code reviews in FOSS (1) early, frequent reviews (2) of small, independent, complete contributions (3) that are broadcast to a large group of stakeholders, but only reviewed by a small set of self-selected experts (4) resulting in an efficient and effective peer review technique. - Peter Rigby
  • 31.
  • 32.
  • 33.
  • 34.
  • 36. Version Control Systems ● At the beginning, FOSS used tar files in USENET – the FSF would ship physical tapes! ● Today, version control systems are the norm – Centralized or Distributed ● FOSS has a continuous and proven track record of innovation in version control systems – FOSS democratized VC
  • 37. On Version Control ● The VC is the circulatory system of software development ● It brings the code to all stakeholders ● A contribution is a patch – one or more commits
  • 38. the patch ● the patch should be reviewed ● most VCs don't support reviewing of patches
  • 39. the patch and its review ● Two models: – Commit Then Review ● Review the code after it has been integrated or – Review Then Commit (RTC) ● Review the patch before it is integrated
  • 40. Linux ● Linux incorporated RTC early in its process ● Linus needed integration of Review process with VC ● No FOSS VC did it – he turned to bitkeeper
  • 41. Bitkeeper and Linux ● Symbiotic relationship – Free (as in beer) licenses to linux developers with one big condition ● User should not develop competing tools – Bitkeeper rapidly improved Linux integration process ● simplified integration of reviewed code – Bitkeeper was probably influenced by Linus's workflow – in 2005 bitkeeper revokes its license to Linux developers
  • 42. Git ● Many other distributed version control systems before it ● What makes it special? – Many features, but especially: ● Pull-requests ● git incorporates code review process with a distributed version control system – Even via email patches
  • 43. How is distributed version control software being used?
  • 44. Git ● Software engineers are moving towards git – And other DVCs ● Github a major reason
  • 45. The Promise of Git From: http://thkoch2001.github.io/whygitisbetter/
  • 46.
  • 47. Challenge 1 ● Personal repos are beyond reach ● Local commits might never be observable
  • 48. “History is written by the victors” Challenge 2: History
  • 50. Save history before it is lost!
  • 51. Super-repository ● Collection of repositories cloned (recursively) from the same repo – At least one per developer ● In their personal computer – At least one public repository ● The blessed – In git, no way to trace them
  • 52. Moving commits across the superRepo
      Method   How it works
      Push     Done at source; needs write access to destination
      Pull     Done at destination; needs read access to source
      Email    Source creates a patch and mails it; recipient applies it
  • 54. Can we learn from Linux?
  • 55. Life of a Patch in Linux
  • 56. Continuous Mining of Linux ● Linux has no centralized logging – Nobody really knows what the superRepo is – Commits flow without any event broadcasting mechanism ● How do we find the activity? – Repos – Commits
  • 57. Semiautomatic Process ● Every 3 hrs, ask every repo – What new commits do you have? – What commits did you delete? – Automatically resolve propagations ● Commits might propagate before we scan ● Daily: – Are commits in repo by unknown committers? ● Answer: – is there a new repo? or is committer new to repo?
  • 58. Implementation ● Running since Nov. 2011 – Currently scans 650 repos every 3 hrs – Retrieved ● 2.3 million commits (compared to 400k in Linus repo) ● 109 million records in propagation table <commit-id, added|deleted, repo, when>
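
The periodic scan described in slides 57 and 58 can be sketched as a set difference between two successive snapshots of each repository. This is only an illustration: the record fields follow the `<commit-id, added|deleted, repo, when>` tuple from the slide, while the names `PropagationRecord`, `diff_scan` and `scan_all`, and the in-memory `state` dictionary, are assumptions and not the author's implementation.

```python
from datetime import datetime
from typing import Dict, List, NamedTuple, Set


class PropagationRecord(NamedTuple):
    """One row of the propagation table: <commit-id, added|deleted, repo, when>."""
    commit_id: str
    event: str  # "added" or "deleted"
    repo: str
    when: datetime


def diff_scan(repo: str, previous: Set[str], current: Set[str],
              when: datetime) -> List[PropagationRecord]:
    """Compare two scans of one repo and emit added/deleted records."""
    added = [PropagationRecord(c, "added", repo, when)
             for c in sorted(current - previous)]
    deleted = [PropagationRecord(c, "deleted", repo, when)
               for c in sorted(previous - current)]
    return added + deleted


def scan_all(state: Dict[str, Set[str]], snapshots: Dict[str, Set[str]],
             when: datetime) -> List[PropagationRecord]:
    """Run one scan over every repo and update the remembered state in place.

    `state` holds the commit ids seen in the previous scan (hypothetical storage);
    `snapshots` holds the commit ids each repo reports now.
    """
    records: List[PropagationRecord] = []
    for repo in sorted(snapshots):
        current = snapshots[repo]
        records.extend(diff_scan(repo, state.get(repo, set()), current, when))
        state[repo] = set(current)
    return records
```

A repo that appears in `snapshots` for the first time reports all of its commits as added, which is one way a newly discovered repository would show up in the table.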
  • 59.
      Metric                             Snapshot (Linus)   Continuous
      No. of repos                       1                  479
      Commits                            64k                533k
      Non-merge commits                  59k                485k
      Unique non-merges                  58k                135k
      % unique non-merges                98.9%              27.9%
      Non-merges that reached blessed                       43.1%
      Different author emails            3434               5646
      Different authors                  2883               4575
      Different committer emails         283                1185
      Different committers               245                1058
  • 60. Commits vs Patches ● Commit ids are insufficient to track patches ● Large amount of work not reaching blessed
  • 61. Arrival of Commits at Blessed
  • 62. Arrival of Commits at Blessed... ● We can classify patches as a new feature or bug-fix
  • 63. The Latency Time of Authorship Time of Commit
  • 66. ● Large ecosystem of repositories – Producers – Consumers
  • 68. Linux Dashboard ● We asked two linux maintainers: – Can this info be useful? ● Answer: – “Yes” … but not for what we expected...
  • 69. Tracking commits in Linux ● Need to track patches, not commits – Particularly important in consumer repositories – Need to cross-reference commits ● What commits contain the same patch? – Some repos track commits from blessed via cherry-picking ● Commit ids are useless ● So they annotate log with the origin commit id
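
One way to ask "what commits contain the same patch?" is to fingerprint the change itself rather than the commit. The sketch below hashes only the added and removed lines of a unified diff, ignoring whitespace, hunk positions and context; it is similar in spirit to `git patch-id` but is not that tool's algorithm, and the function name `patch_fingerprint` is an assumption made for illustration.

```python
import hashlib
import re


def patch_fingerprint(diff_text: str) -> str:
    """Return a hash that depends only on the lines a unified diff adds or removes.

    Hunk headers, index lines and context lines are ignored, so the same change
    rebased or cherry-picked onto a different base yields the same fingerprint.
    Limitation of this sketch: a removed line whose content starts with "--"
    looks like a file header ("---") and is skipped; file names are ignored too.
    """
    digest = hashlib.sha1()
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file header lines, not content
        if line.startswith(("+", "-")):
            # Keep the +/- marker, drop all whitespace from the content.
            normalized = line[0] + re.sub(r"\s+", "", line[1:])
            digest.update(normalized.encode("utf-8") + b"\n")
    return digest.hexdigest()
```

Two commits with different ids but equal fingerprints are candidates for being the same patch; a real tool would still need to confirm the match, since unrelated changes can touch identical lines.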
  • 70. Linux Commits Dashboard ● Where is my commit? – My original commit, has it reached Linus? ● What was merged? – What commits were merged at once by Linus? ● What commits are related to this one? – Same patch ● Rebasing ● Cherry picking – Mentioned in a commit ● This commit fixes bug introduced in X ● This commit reverts commit X ● http://o.cs.uvic.ca:20810/perl/cid.pl?cid=70cb8bb0d365f0bc8b20fa67347caf9598a4674e
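
The "mentioned in a commit" relations can be approximated by scanning commit messages for commit ids. The sketch below assumes three message conventions: the line appended by `git cherry-pick -x`, the default body written by `git revert`, and the `Fixes:` trailer used in Linux kernel commits. The function name `find_references` is hypothetical, and the dashboard on the slide may well use different rules.

```python
import re
from typing import Dict, List

# An abbreviated or full hexadecimal commit id.
_SHA = r"([0-9a-f]{7,40})"

# Assumed message conventions (see the lead-in); each captures the referenced id.
_PATTERNS = {
    "cherry-pick": re.compile(r"\(cherry picked from commit " + _SHA + r"\)"),
    "revert": re.compile(r"This reverts commit " + _SHA),
    "fixes": re.compile(r"^Fixes:\s*" + _SHA, re.MULTILINE),
}


def find_references(message: str) -> Dict[str, List[str]]:
    """Map each relation kind to the commit ids a commit message mentions."""
    return {kind: pattern.findall(message) for kind, pattern in _PATTERNS.items()}
```

Free-form mentions such as "fixes the bug introduced in X" would need looser matching than this, at the cost of more false positives.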
  • 71.
  • 72.
  • 73.
  • 74. Researcher states: “40% of pull requests are not merged” ● Based on simply querying ghtorrent data ● But it ignores what really happens ● Many pull requests are merged without being marked as merged in github ● Ghtorrent data has many potential threats to validity
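
To illustrate why the merged flag alone undercounts, the sketch below treats a pull request as effectively merged if it is flagged as merged, if any of its commit ids appears on the main branch, or if a main-branch commit message closes it with one of GitHub's closing keywords ("closes #12", "fixes #12", and so on). The data shapes and the name `effectively_merged` are assumptions for illustration; the slides do not describe the heuristics actually used, and this sketch would still miss pull requests whose commits were reworked before integration.

```python
import re
from typing import Dict, Iterable

# GitHub closing keywords followed by an issue or pull request number.
_CLOSES = re.compile(r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
                     re.IGNORECASE)


def effectively_merged(pr_number: int, flagged_merged: bool,
                       pr_commits: Iterable[str],
                       main_commits: Dict[str, str]) -> bool:
    """Decide whether a pull request reached the main branch.

    `main_commits` maps commit id -> commit message for the main branch
    (a hypothetical, pre-extracted structure).
    """
    if flagged_merged:
        return True
    # The same commit ids landed on main, e.g. via a manual merge or push.
    if any(sha in main_commits for sha in pr_commits):
        return True
    # A commit on main names the pull request with a closing keyword.
    for message in main_commits.values():
        if str(pr_number) in _CLOSES.findall(message):
            return True
    return False
```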
  • 75. What is github used for?
  • 76. "I store my presentations in github. I don't need a USB stick anymore!"
  • 77.
  • 78.
  • 79. Are there potential threats to validity for studies that assume github is about software engineering only?
  • 80. Methodology ● Data sources: – Surveys – Sampling of repositories ● Mixed methods: – Quantitative, and – Qualitative
  • 81.
  • 82. I. A repository is not necessarily a project II. Most projects have few commits III. Most projects are inactive IV. A large proportion of repositories are not for software engineering V. More than two thirds of projects are personal VI. Only a fraction of repos use pull requests VII. If the commits in a pull-request are reworked, github only records the resulting patch VIII. Most pull-requests appear as non-merged, even though they were merged IX. Many active projects do not conduct all their software development activity in github
  • 83. Uses:
  • 84. Most projects are inactive
  • 85. Social? 67% of projects are personal repos; 95% have 3 or fewer committers
  • 86. Self contained? “Any serious project would have to have some separate infrastructure - mailing lists, forums, irc channels and their archives, build farms, etc. [...] Thus while GitHub and all other project hosts are used for collaboration, they are not and can not be a complete solution.”
  • 87. Others are already using github's information to reach conclusions!
  • 88. the open source report card http://osrc.dfm.io/dmgerman/
  • 89.
  • 90. how are github users collaborating?
  • 91. How does github support collaboration? ● Methodology: – Survey ● 240 responses (24% response rate) – Interviews ● 35 interviews from survey respondents – 71% professional developers – 11% managers – 9% students – 9% interns ● Approximately 1hr each
  • 92. Survey: why do you use github?
  • 94. Themes: focus ● Simple tools – git branching/merging – github features seem to be enough for most ● Pull requests and issue tracking ● Focused interaction – code-centric, focused communication – asynchronous and unobtrusive
  • 95. Focus: independence ● Decentralized work: – git allows them to work independently – yet they have visibility of what others do ● Low need for management: – Need for a clear process (the workflow) – They shy away from rigid management and team structure – Team managers recognize this – Managers should be educated on using git/github
  • 96. Focus: Exposure ● Easy contribution process – Fork and potentially contribute without pre-authorization ● Peer pressure – Developers are conscious that their code is readily visible to others – Adoption of small, frequent contributions
  • 97. OSS mentality ● At the operational level – the nature of the work allows independence and self-organization. – developers are familiar with the idea of working this way and share the mentality behind it. ● developers are self-driven ● share the mentality of – self-organizing, – minimizing communication and coordination needs, – having ownership of code, and – operating on a meritocratic, expertise-based model
  • 99. The Github Ecosystem ● github is creating an ecosystem of proprietary, cloud-enabled applications for software development teams – Service integration – JSON API ● Asana, Campfire, Lighthouse, Jira, Travis, Trello, etc, etc.
  • 100.
  • 101. Conclusions ● git and github are promoting the use of the pull-request workflow – small, independent contributions – that can be reviewed before integration ● Effectively, adopting open source code practices into their development – Independent work – Code reviews of contributions before they are integrated