Using Git/Gerrit and Jenkins to Manage the Code Review Process


ESC-4024


Presenters : Marc Karasek & Phil Hord

Code Review – What is it and why do we do it?

The idea of the lone ranger programmer, cranking out code in his cube/office, is a nice romantic idea. In
reality it tends to produce code that is obfuscated and unmaintainable. Having a code review process as
part of your development flow, however, leads to more maintainable code.

The ‘ideal’ code review system:

    1.   Web interface – Allow access from multiple development sites.
    2.   Allows pre-commit code reviews
    3.   Can handle a large number of repositories
    4.   Inline comments and block comments
    5.   Integration with a build server.
    6.   Review Process Workflow that can be integrated into the development process –
         Developer does not have to do anything “extra” to start review process.

Let’s take each of these in order:


Web Interface
Today we have development teams spread around the world. The old adage that “the sun never sets on the
British Empire” could be applied to some of our current development teams. With developers not
always located in the same geographical area or time zone, it becomes important to have a web interface
so that code review does not depend on sitting around a table. A developer can
submit code for review, and a teammate can review it on arriving at the office.


Pre-commit code reviews
One of the biggest problems facing code review is how to satisfy two requirements at once: keeping the
code under SCM, and not impacting the current code base with unreviewed code. There are
many ways to implement this; keeping a separate (sandbox) repository for untested/unreviewed code
and submitting patchset(s) for changes into an SCM are a couple of them. The problem is that most
of these methods add overhead to your development process: either maintaining two repos, one for
production and one for development, or adding steps to the development process to create the
patchset for a change to be reviewed.


Can handle a large number of repositories
Development teams today work on multiple projects; each one normally has its own code base that
needs to be maintained. Being able to maintain a large number of different repositories, while not a
major issue with SCM systems today, is worth mentioning. It can become an issue depending on how the SCM
stores the repository and how much space it takes on the server.


Inline comments and block comments
It is important to allow reviewers not only to comment on the change as a whole (a block comment), but also to add
comments inline in the patchset/code that is being reviewed. Think of a block comment as a global comment on the
change, “The commit message needs to have some more verbiage added to describe the change better”,
versus an inline comment local to the code, “This variable is being used in file ABCD.c, check this file to make
sure we do not have an issue.” Both types of comments, inline and block, should be part of the code
review history/process.


Integration with a build server
Projects that share code across platforms need to be able to cross-check common code for multiple
build targets. Having a build server that can do the ‘grunt’ work of building multiple targets for a code
base puts a check in place that is not dependent on a developer doing the builds. With some projects
having many targets, a build server helps to automate and standardize the process.


Review Process Workflow that can be integrated into the development
process.
The trick is to integrate the code review so it is a part of your ‘normal’ code development process. If
there is any “exception” path that allows engineers to bypass code review for emergencies, it will
become the normal path. From the developer’s point of view, the code review process should have a
minimal impact on the development process. The best case is that the developer’s normal check-in/commit
process for submitting code into the SCM is the code review process.


Current Processes

Most code review systems/processes generally fall into one of three models:

   1. Code is checked into a temporary holding branch for review. Once it has been reviewed, it is
      then merged into a master/release branch. This merge may be done by either the
      original developer or a dedicated build/repository manager.
   2. Code is kept locally on the developer’s machine; it is posted/emailed for review. Once it has
      been reviewed, it is the responsibility of the developer to merge it into the release/master
      branch.
   3. Separate branches are maintained for release and development. The development branch is
      never guaranteed to build but always has the latest and greatest in it. Code may be checked into
      this branch with no review. Once checked in, reviewers are notified and provided a link to the
      commit for review.

Each of these processes has its good and bad points. What all of them lack is a way to automate the
review process. This includes:

   1. being able to cherry-pick/pull a patchset to a local repository for review/testing
   2. reviewing the changes without pulling the code down to your local machine
   3. reviewing the history of a change
          a. how many times it has been through the review process
          b. what other reviewers’ comments are


Let us see how the above processes stack up against the ‘ideal’ code
review system.


Web Interface
All of the above could have some kind of web interface for accessing the code under review. This could
be anything from a patchset sent via email to a web-based GUI. Regardless of the method, this adds an
extra step in the development process. The engineer has to package the changes into a patchset, and
then either send it to an email list or post it to a web site. This adds time to the development process
and does not allow good tracking of review changes. The normal process would be for the developer to
receive feedback, generate a new patchset and then send/post this new change. There is no explicit
link between the old and new changes.


Pre-commit code reviews
Only some of the above handle this requirement: #1 and #3. For these two the code is checked into a
holding area/development branch for review, prior to being merged over to the release/master. #2
fails this requirement, as the change only lives on the developer’s machine, and if that machine has an
‘accident’ then the changes are lost.

Even the ones that meet this requirement have problems. As with the previous requirement, this adds to
the development process. The code needs to be merged over after review. This is handled either by
the developer or by a dedicated build manager. In the end it is a manual step that adds time and takes
up resources.


Can handle a large number of repositories
Most modern SCM systems handle a large number of repositories. This impacts the review process very little and is
best left for a separate discussion.


Inline comments and block comments
Most current review processes fail this requirement. Being able to view other reviewers’ comments on
a file or about the overall change is an invaluable resource that helps to streamline the review process.
Being able to review past comments for a change also leads to shorter review time. No one gets it right
the first time; that is why we do code review.


Integration with a build server
This is normally a manual step in the review process, where a developer has to submit a job to the
build server. At best it is somewhat automated, in a nightly or weekly build that pulls in all currently
submitted changes and attempts to build them.

Where this fails is that, of the three processes, only #1 above, where the change is contained in its own
holding branch, can be built on its own. For #3, the development branch is never guaranteed to be buildable, so
most of the time this adds time to the process. Someone has to go find out why the nightly/weekly
development build fails, inform the engineer who submitted the code, etc. For #2, there is no way for
the build server to get the code, as it is on the developer’s machine.


Review Process Workflow that can be integrated into the development
process.
Each of the three adds additional steps to the process. For developers it is a
multistep process to get their code submitted. They have to learn a ‘new’ process and how to use this
process in their development. For example: how to properly generate the patchset so that it can be
reviewed by the team, or how to package their changes to submit them through a web interface for
review.


Introducing : Git / Gerrit / Jenkins

Using git as an SCM with gerrit as a frontend addresses most of the above requirements. Adding Jenkins
as a build/integration server covers the requirements that git/gerrit alone does not.


Web Interface
Gerrit provides a web interface that allows code review, patchset generation, cherry-picking, etc. of
patchsets that have been submitted for review. Access to this web interface and the underlying
repositories can be restricted so that developers only have access to the projects that they are
working on.

It allows for a custom view of the patchset under review. A reviewer can choose to view any number of
lines that surround the change, up to the whole file. This allows each reviewer to view as much
information as they need, without having to check out any code.
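
When a reviewer does want a patchset locally for testing or cherry-picking, Gerrit publishes each patchset under
its own git ref. The sketch below builds that ref name and the matching fetch command. It assumes Gerrit's usual
layout of refs/changes/<last two digits of the change number>/<change number>/<patchset number>; the function
names and the "origin" remote are hypothetical, chosen only for illustration.

```python
from typing import List


def change_ref(change_number: int, patchset: int) -> str:
    """Return the ref under which one patchset of a change is assumed to be published."""
    if change_number < 1 or patchset < 1:
        raise ValueError("change and patchset numbers start at 1")
    # The first path component is the change number modulo 100, zero-padded.
    shard = f"{change_number % 100:02d}"
    return f"refs/changes/{shard}/{change_number}/{patchset}"


def fetch_command(remote: str, change_number: int, patchset: int) -> List[str]:
    """Build the argument list for fetching one patchset from a remote."""
    return ["git", "fetch", remote, change_ref(change_number, patchset)]


if __name__ == "__main__":
    # Hypothetical change 1234, patchset 5, fetched from a remote named "origin".
    print(" ".join(fetch_command("origin", 1234, 5)))
```

After the fetch, the patchset is available as FETCH_HEAD and can be checked out or cherry-picked like any other
commit.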


Pre-commit code reviews
This one item alone makes git/gerrit worth using. Using gerrit as a frontend provides a ‘standard’ git interface to
the developers. They push their code to the git server: no special check-in process, no special software
to install. The developer just pushes their code to a special ref, “refs/for/<branch>”, that gerrit understands.
Gerrit then takes the changes, creates a patchset from them and posts it to its web interface for review.
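
As a minimal sketch of that push, the helper below builds the git command line that sends the current commit to
refs/for/<branch>. The helper name and the default remote name "origin" are assumptions for illustration; only the
refs/for/<branch> target comes from the text above.

```python
from typing import List


def review_push_command(branch: str, remote: str = "origin") -> List[str]:
    """Build the git push that uploads HEAD for review against the given branch."""
    if not branch or branch.startswith("refs/"):
        raise ValueError("pass a plain branch name such as 'master'")
    # HEAD:refs/for/<branch> pushes the local HEAD commit to gerrit's review ref.
    return ["git", "push", remote, f"HEAD:refs/for/{branch}"]


if __name__ == "__main__":
    print(" ".join(review_push_command("master")))
```

The resulting list can be handed to subprocess.run() from inside a working copy; from the developer's side it is
an ordinary git push.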




The patchset is ‘held’ in gerrit until the code has been reviewed. It can then be submitted into the git
repository. The patchset can be updated, abandoned, resurrected, etc., all without impacting the git
repository that it has been pushed to. This allows changesets to be in review and pending without
impacting the code base. The patchset can also be updated by the developer based on comments
during review. They make the requested changes, amend the same commit and push it to the git server again.
Gerrit sees that this is a new patchset based on a previous one and adds it to the review as patchset<x>.
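
The paper does not say how gerrit links the new push to the earlier one. In a typical Gerrit setup the link is a
Change-Id footer in the commit message (an "I" followed by 40 hex digits), which stays the same when the commit is
amended. The sketch below illustrates that matching idea under this assumption; it is not Gerrit's own code, and
the function names are hypothetical.

```python
import re
from typing import Optional

# A Change-Id footer line: "Change-Id: I" followed by 40 lowercase hex digits.
CHANGE_ID = re.compile(r"^Change-Id: (I[0-9a-f]{40})\s*$", re.MULTILINE)


def change_id(commit_message: str) -> Optional[str]:
    """Return the last Change-Id footer found in a commit message, if any."""
    matches = CHANGE_ID.findall(commit_message)
    return matches[-1] if matches else None


def is_new_patchset(old_message: str, new_message: str) -> bool:
    """True when both commits carry the same Change-Id, i.e. belong to one review."""
    old_id = change_id(old_message)
    return old_id is not None and old_id == change_id(new_message)


if __name__ == "__main__":
    footer = "Change-Id: I" + "a1b2c3d4e5" * 4
    first = "Fix buffer overrun\n\n" + footer + "\n"
    amended = "Fix buffer overrun in parser\n\nAddress review comments.\n\n" + footer + "\n"
    print(is_new_patchset(first, amended))
```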


Can handle a large number of repositories
All modern SCM systems can handle multiple repositories. Where git stands out though is in the size of
the repository and how it stores the files.

For example, the Mozilla repository is reported to be almost 12 GB when stored in SVN using the fsfs
backend. Previously, the fsfs backend also required over 240,000 files in one directory to record all
240,000 commits made over the 10-year project history. The exact same history is stored in git in only
two files totaling just over 420 MB. This means that SVN requires roughly 30x the disk space to store the same
history.
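
The ratio can be checked with a few lines of arithmetic. The sketch below uses the two reported figures, rounded
to exactly 12 GB and 420 MB and treating a gigabyte as 1024 MB; both roundings are assumptions, since the text
says "almost" and "just over".

```python
def size_ratio(svn_gb: float, git_mb: float, mb_per_gb: int = 1024) -> float:
    """Return how many times larger the SVN repository is than the git one."""
    return (svn_gb * mb_per_gb) / git_mb


if __name__ == "__main__":
    ratio = size_ratio(svn_gb=12, git_mb=420)
    print(f"SVN needs about {ratio:.1f}x the space of git")
```

With these inputs the result comes out a little under 30, which is consistent with the approximate figure quoted
above.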

One of the reasons for the smaller repo size is that an SVN working directory always contains two copies
of each file: one for the user to actually work with and another hidden in .svn/ to aid operations such as
status, diff and commit. In contrast a git working directory requires only one small index file that stores
about 100 bytes of data per tracked file. On projects with a large number of files this can be a
substantial difference in the disk space required per working copy.
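
To make the working-copy difference concrete, the sketch below compares the two layouts for a hypothetical project.
It models only what the paragraph above describes: SVN keeps a second pristine copy of every file, while git adds
about 100 bytes of index data per tracked file. The project size and file count are invented for illustration, and
the space taken by the repository history itself is deliberately left out.

```python
from typing import Tuple

INDEX_BYTES_PER_FILE = 100  # approximate per-file index cost quoted in the text


def working_copy_bytes(source_bytes: int, tracked_files: int) -> Tuple[int, int]:
    """Return (svn_bytes, git_bytes) needed for one working copy, history excluded."""
    svn_bytes = 2 * source_bytes  # working file plus pristine copy under .svn/
    git_bytes = source_bytes + tracked_files * INDEX_BYTES_PER_FILE
    return svn_bytes, git_bytes


if __name__ == "__main__":
    # Hypothetical project: 50,000 files totalling 500 MB of source.
    svn, git = working_copy_bytes(500 * 1024 * 1024, 50_000)
    print(f"svn: {svn / 2**20:.1f} MB, git: {git / 2**20:.1f} MB")
```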

This same comparison can be made between git and cvs, where a 3x improvement in disk space usage
has been seen.

A side effect of how git manages its repository is that each time you clone a repository locally you get
the full repository. All the history, etc. is cloned to the local machine from the server. This allows for
developers to work on code and switch between branches, search history, etc. without having to be
physically attached to the ‘central’ SCM.


Inline comments and block comments
Gerrit allows the reviewer(s) to enter both inline and block comments on any patchset they are
reviewing. It also keeps a history of the patchset as it goes through the review process. This gives the
reviewer/developer the ability to access the past history of comments on the change.


Integration with a build server
This is where the three amigos meet. Jenkins (the build server) can monitor and build
against a gerrit/git SCM system through its Gerrit Trigger plugin. This allows automated builds to be triggered
when a patchset is pushed to gerrit for review. The developer does not have to do anything special to trigger
a build; it happens automatically based on the patchset and the branch it is being pushed to in gerrit/git.

This can be used to build a set group of targets based on a given branch, or all of the targets that a given
project builds for.
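
The decision described above, which targets to build for which patchset, can be sketched as a small event filter.
This is not how the Jenkins plugin is implemented; it only illustrates the idea. The event shape (a JSON object
with a "type" of "patchset-created" and a "change" carrying the "branch") follows the events Gerrit streams to
listeners as far as we know and should be treated as an assumption, and the branch and target names are
hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical mapping from branch name to the build targets that cover it.
TARGETS_BY_BRANCH: Dict[str, List[str]] = {
    "master": ["arm-release", "x86-release", "x86-debug"],
    "develop": ["x86-debug"],
}


def builds_for_event(raw_event: str) -> List[str]:
    """Return the targets to build for one review event, or [] if none apply."""
    event = json.loads(raw_event)
    # Only a newly uploaded patchset should start a build.
    if event.get("type") != "patchset-created":
        return []
    branch = event.get("change", {}).get("branch", "")
    return list(TARGETS_BY_BRANCH.get(branch, []))


if __name__ == "__main__":
    sample = json.dumps({
        "type": "patchset-created",
        "change": {"project": "firmware", "branch": "master"},
        "patchSet": {"number": 2},
    })
    print(builds_for_event(sample))
```

Keeping the branch-to-target table in one place means that adding a target for a branch does not change anything
on the developer's side of the workflow.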


Review Process Workflow that can be integrated into the development
process.
This is where the rubber hits the road. Using gerrit/git allows the review process to be fully integrated
into the development process. The developer does not have to learn any new process; they just push
their changes to git and gerrit takes care of the magic.

The developer pushes their code to the git server: no special check-in process, no special software to
install. The code is pushed to a special ref, “refs/for/<branch>”, that gerrit understands. Gerrit then
takes the changes, creates a patchset from them and posts it to its web interface for review. It then
emails everyone on the review list that a new review is in their queue. When the reviewers log
into gerrit, they see the patchset they have been asked to review in their queue.


Reference Links :

     http://git-scm.com/



     https://code.google.com/p/gerrit/



     http://jenkins-ci.org/

     https://wiki.jenkins-ci.org/display/JENKINS/Gerrit+Trigger



     http://hudson-ci.org/
