1 Copyright ©2015 CollabNet, Inc. All Rights Reserved.
Google Gerrit User Summit 2015
How to Properly Tune and Size your Gerrit Backend
Johannes Nicolai
Director of Engineering, CollabNet
Thursday, June 2, 2016
The CollabNet Gerrit Team
Berlin Hackathon
I want you for
the Gerrit Hackathon
In Spring 2016!
TeamForge – A Full Development, Delivery and Collaboration System
[Diagram: lifecycle stages plan, code, review, build, test, release, deploy, operate, monitor]
Governance, traceability, and IP security across tools, assets, processes, and teams
Scalability
CollabNet TeamForge – Integrated Tools
http://blogs.collab.net/git
CollabNet Customers & Some Gerrit Stats
• Some of our customers have
– More than 5 million Git fetch requests daily
– More than 10 Gerrit master servers
– More than 40 different “geographies” and replication servers
– More than 100,000 active developers
– More than 10 TB of source code
– More than 20,000 repositories
– More than 1000 CI servers
Customer industries: Financial; Healthcare; Global Services; Technology, Software and IoT; Government and Aerospace
Git Sizing / Performance Tuning - FAQ
• How many servers will I need?
• Which cloning protocols to offer?
• How to set those gazillion gerrit.config options?
• How many CPUs and how much RAM will I need?
• What the heck is pack size?
• How often should you run garbage collection?
• Does it make any difference whether I go with a native Git or JGit based backend?
• How do you handle hundreds of polling CI users without compromising performance for your human end users?
• What about clustering and replication?
It Depends …
One Size Fits All?
One Size does not fit all.
Status Quo on Gerrit Performance Tuning Info
• https://code.google.com/p/gerrit/wiki/Scaling
• Gerrit Mailing list
• Tech Talks
• Some very generic, mentioning only upper limits
– “some larger installations use 48 cores”
– “at least one has 1 TB RAM”
• Others very specific, requiring a test environment with production load
– “Your luck may vary with tweaking your jvm gc parameters. You may find that increasing the size of the young generation may help drastically reduce the amount of gc thrashing your server performs.”
• Depend on hard-to-measure metrics
– # parallel fetch requests
Typical ops persona
• Jack of all trades
• Responsible for dozens of applications
• No Gerrit expert knowledge
• No Java expert knowledge
• Basic Git knowledge
• No access to special HW
• Limited test bed (not same load pattern as on production)
• No access to Gerrit multi master / GFS technology
• No overview of their entire user base (tens of thousands of developers in different geographies)
Challenge
• Tuning advice that is
– actionable
– not “one size fits all”
– targeted at ops people with no expert Gerrit / JVM knowledge
– uses only easy-to-measure factors for its recommendations
– does not require special HW or test beds
– does not depend on proprietary Gerrit extensions/technology
• Keep motivation up to go through all of this
Disclaimer
• Some advice shown next is debatable & oversimplified
• Eager for your feedback in Q&A session / lunch break
• However, all recommendations have been verified at our customers and in our performance lab
Motivation (TeamForge == Gerrit 2.10.6)
Numbers from http://bit.ly/1WxSiw9
Gerrit Performance Tuning in 5 Steps
1. Get your numbers
2. Size your hardware
3. Tune your gerrit.config
4. Configure Garbage collection
5. Deal with heavy CI load
S M L
1. Get Your Numbers
Step 1: Get your numbers
Number Of Users
• The number of users is only an indirect factor for Gerrit tuning, as most Git operations are done completely offline.
• The more users you have, the more repositories and push/fetch requests you will probably encounter.
• The majority of load is typically caused by build systems (CI). The biggest enterprise instance we have seen has 15k active users.
Step 1: Get your numbers
Number Of Repositories
• The number of repositories (Gerrit projects) determines how much disk space you need.
• We have seen instances with more than 10k repositories but would not recommend more than 2500 per server.
Step 1: Get your numbers
Protocol
• ssh allows you to use public key cryptography, which is stronger than passwords.
• ssh is recommended for CI users as this allows push based notifications (see step 5).
• http(s) seems to perform better if the majority of the operation time is the connection request itself (not much data transferred, no heavy IO).
• Hybrid approaches are possible.
ssh vs https
Step 1: Get your numbers
Repository Size
• Repository size determines the amount of storage you need on disk. In addition, it influences the memory needed during a clone request, as pack files have to be loaded and streamed.
• The largest repository on disk should still fit in 1/4 of your heap.
• The more repository data has to be processed, the longer garbage collection across all projects will take.
• Gerrit can handle at least 1 TB of total repository data easily.
Step 1: Get your numbers
What are the fetch/pull requests and how many will I have per day?
• Fetch requests (git-upload-pack): git fetch, git pull, git clone
• Push requests (git-receive-pack): git push
How to count #fetch requests per day:
fgrep "git-upload-pack" sshd_log | wc -l
+
fgrep "git-upload-pack" httpd_log | wc -l
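The two fgrep counts can also be summed in one step. The following is a minimal Python sketch, assuming each log file passed in covers exactly one day (for example a daily-rotated sshd_log and httpd_log). The function names are illustrative and not part of Gerrit.

```python
def count_requests(log_path, needle="git-upload-pack"):
    """Count lines containing needle, like `fgrep needle file | wc -l`."""
    count = 0
    # errors="replace" keeps odd bytes in access logs from aborting the count
    with open(log_path, "r", errors="replace") as handle:
        for line in handle:
            if needle in line:
                count += 1
    return count


def daily_fetch_requests(log_paths):
    """Sum the fetch request counts over all given log files."""
    return sum(count_requests(path) for path in log_paths)
```

A call such as `daily_fetch_requests(["logs/sshd_log", "logs/httpd_log"])` (paths are hypothetical) yields the number to compare against the sizing steps that follow.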
Step 1: Get your numbers
Number Of Push Requests
• In most enterprise settings, push requests contribute less than one percent to the total number of operations. Because of this, their number can typically be neglected.
Step 1: Get your numbers
Number Of Fetch Requests
• This is probably the most important tuning factor. To improve throughput, fetch requests should be handled in parallel, but parallel cloning needs CPUs as well as memory.
• A Gerrit server optimized for heavy load (32 cores, 32 GB RAM) can handle about 1M fetch requests per day, processing up to 50 in parallel.
2. Size Your Hardware
Step 2: Size your hardware
Size  Requests/day  Cores  RAM
S     100k          4      4 GB
M     500k          16     16 GB
L     1M            32     32 GB
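The S/M/L sizing can be turned into a small lookup. This is a sketch only: treating each requests/day figure as an upper bound for its size is an assumption made here, and the function name is illustrative.

```python
# (size, max requests per day, cores, RAM in GB), taken from the sizing slide
SIZES = [
    ("S", 100_000, 4, 4),
    ("M", 500_000, 16, 16),
    ("L", 1_000_000, 32, 32),
]


def pick_size(requests_per_day):
    """Return (size, cores, ram_gb) for the smallest size covering the load.

    Returns None when the load exceeds size L, in which case another
    server is the next step.
    """
    for name, limit, cores, ram_gb in SIZES:
        if requests_per_day <= limit:
            return name, cores, ram_gb
    return None
```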
Step 2: Size your hardware
Number of Servers
• Whenever vertical scaling is not cost efficient any more (> size L), we recommend setting up another server.
• If the number of repositories exceeds 2500, a new server should be used as well, or reviews will get painfully slow.
• Use Gerrit's replication feature to sync repository content and permissions to servers in different geographies if the network is the limiting factor.
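As a rough illustration of the two limits above (size L load and 2500 repositories per server), the sketch below estimates a server count. Scaling both limits linearly is an assumption made for this example, not a claim from the talk, and the names are hypothetical.

```python
import math

MAX_REQUESTS_PER_SERVER = 1_000_000  # size L: about 1M fetch requests per day
MAX_REPOS_PER_SERVER = 2500          # recommended repository limit per server


def servers_needed(requests_per_day, repositories):
    """Estimate the server count from load and repository count (assumed linear)."""
    by_load = math.ceil(requests_per_day / MAX_REQUESTS_PER_SERVER)
    by_repos = math.ceil(repositories / MAX_REPOS_PER_SERVER)
    # Whichever limit is hit first decides; there is always at least one server
    return max(1, by_load, by_repos)
```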
Step 2: Size your hardware
Network
• The higher the network bandwidth, the less time it takes to fetch and push repositories. Depending on the average Git repository size and number of parallel requests, network connectivity can become the primary bottleneck.
• Most enterprises have Gigabit connections.
Step 2: Size your hardware
Disk Storage
• Storage needs are determined by the Git repository sizes.
• Fast storage (SSDs) really pays off, as git fetch, push and gc are all IO heavy.
3. Tune Your gerrit.config
Step 3: Tune your gerrit.config
receive.timeout
• Timeout to process incoming changes and update refs and Gerrit changes
• Default is 2 min
• Recommended values: S = 4 min, M = 4 min, L = 4 min
Why ssh thread pooling is a good thing
Step 3: Tune your gerrit.config
sshd.threads
• Threads to process ssh requests, limiting the number of possible parallel clones/pushes
• sshd.batchThreads will be deducted from this number
• Defaults to 1.5 * <#Cores>
• Recommend $\lim_{x \to \pi/4} [\sec(x)/\sin(x)]$ * <#Cores> = 2 * <#Cores>
• Recommended values: S = 8, M = 32, L = 64
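For readers who want to check the tongue-in-cheek limit in the sshd.threads recommendation, here is a short worked evaluation. It only confirms that the factor is 2, which matches the S/M/L values of 8, 32 and 64 threads for 4, 16 and 32 cores.

```latex
\[
\lim_{x \to \pi/4} \frac{\sec x}{\sin x}
  = \frac{1}{\cos(\pi/4)\,\sin(\pi/4)}
  = \frac{1}{\tfrac{\sqrt{2}}{2} \cdot \tfrac{\sqrt{2}}{2}}
  = \frac{1}{1/2}
  = 2
\]
```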
Step 3: Tune your gerrit.config
httpd.maxThreads
• Threads to process http clone/push requests and review related activities
• Default is 25
• Recommended values: S = 25, M = 50, L = 100
Step 3: Tune your gerrit.config
database.poolLimit
• DB connections for Gerrit
• A fetch/push request or a review action can consume multiple connections
• Recommend setting this to at least sshd.threads + httpd.maxThreads
• Default is 8
• Recommended values: S = 50, M = 150, L = 250
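The "at least sshd.threads + httpd.maxThreads" rule can be checked against the S/M/L values from this deck. The following is a small sketch with illustrative helper names. It only shows how much headroom each recommended poolLimit leaves above that minimum.

```python
# Recommended values per size, copied from the slides in this deck
RECOMMENDED = {
    "S": {"sshd.threads": 8, "httpd.maxThreads": 25, "database.poolLimit": 50},
    "M": {"sshd.threads": 32, "httpd.maxThreads": 50, "database.poolLimit": 150},
    "L": {"sshd.threads": 64, "httpd.maxThreads": 100, "database.poolLimit": 250},
}


def min_pool_limit(sshd_threads, httpd_max_threads):
    """Lower bound for database.poolLimit according to the rule above."""
    return sshd_threads + httpd_max_threads


def pool_headroom(size):
    """Connections left above the lower bound for a given T-shirt size."""
    cfg = RECOMMENDED[size]
    minimum = min_pool_limit(cfg["sshd.threads"], cfg["httpd.maxThreads"])
    return cfg["database.poolLimit"] - minimum
```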
Step 3: Tune your gerrit.config
database.poolMaxIdle
• Maximum number of idle DB connections kept in the pool; surplus idle connections are closed
• As the DB pool size is typically increased from its default value, this parameter should be increased too
• Default is 4
• Recommended values: S = 16, M = 16, L = 16
Step 3: Tune your gerrit.config
container.heapLimit
• Java heap used for Gerrit. The more repository data Gerrit can cache in memory, the better
• Recommend a heap of at least <#Cores> GB for Gerrit
• The largest repository on disk should still fit in ¼ of your heap. In our experience, 32 GB per 1M daily requests is common
• Recommended values: S = 4g, M = 16g, L = 32g
Step 3: Tune your gerrit.config
core.packedGitLimit
• Maximum cache size to store Git pack files in memory
• Default 10 MB is way too small if you frequently clone large repositories and like to cache their data
• Recommend ¼ of your heap size
• Recommended values: S = 1g, M = 4g, L = 8g
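The "¼ of your heap" rule applies both to core.packedGitLimit and to the largest repository on disk. The sketch below derives both from a heap setting such as "16g", assuming the k/m/g suffixes are multiples of 1024 as in Git configuration files. All function names are illustrative.

```python
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '16g' or '8k' into bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def format_size(num_bytes):
    """Render bytes with the largest suffix that divides them evenly."""
    for suffix in ("g", "m", "k"):
        unit = UNITS[suffix]
        if num_bytes >= unit and num_bytes % unit == 0:
            return f"{num_bytes // unit}{suffix}"
    return str(num_bytes)


def packed_git_limit(heap_limit):
    """Quarter of the heap, as recommended for core.packedGitLimit."""
    return format_size(parse_size(heap_limit) // 4)


def largest_repo_fits(repo_bytes, heap_limit):
    """True if the largest repository still fits in a quarter of the heap."""
    return repo_bytes <= parse_size(heap_limit) // 4
```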
Step 3: Tune your gerrit.config
core.packedGitWindowSize
• Number of bytes of a pack file to load into memory in a single read operation
• 16k is a common choice
• Default is 8k
• Recommended values: S = 8k, M = 16k, L = 16k
Step 3: Tune your gerrit.config
core.packedGitOpenFiles
• Maximum number of pack files to have open at once
• A value that is too small can cause repository corruption during gc
• If you increase this to a larger setting, you may also need to adjust the ulimit on file descriptors for the host JVM, as Gerrit needs additional file descriptors for network sockets and other repository data manipulation
• Default is 128
• Recommended values: S = 1024, M = 2048, L = 4096
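To see whether a planned core.packedGitOpenFiles value leaves room under the file descriptor ulimit, the sketch below compares it with a soft limit. This is a sketch under stated assumptions: when no limit is passed in, it reads the limit of the current Python process through the Unix-only `resource` module, so it reflects the Gerrit JVM only if run as the same user with the same limits. The function name is illustrative.

```python
import resource  # Unix-only standard library module


def fd_headroom(packed_git_open_files, soft_limit=None):
    """Descriptors left for sockets and other files after the pack files.

    Returns None when the soft limit is unlimited. A negative result means
    the setting alone already exceeds the limit.
    """
    if soft_limit is None:
        soft_limit, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return None
    return soft_limit - packed_git_open_files
```

How much headroom Gerrit needs beyond the pack files is not stated in the slides, so the number is only a starting point for a judgement call.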
4. Configure Garbage Collection
Step 4: Configure garbage collection (~gerrit/.gitconfig)
gc.interval
• Determines how often Gerrit garbage collection (JGit gc) is run across all repositories
• Running JGit gc frequently is crucial for good fetch/push performance as well as a smooth source code browsing experience
• JGit gc is more efficient than command line git garbage collection and causes fewer problems with Gerrit running in parallel
• Parameters to control JGit gc's resource consumption are in ~gerrit/.gitconfig. Don't forget to set gc.startTime for the initial garbage collection time
• Recommended values: S = 1 week, M = 3 days, L = 1 day
Step 4: Configure garbage collection (~gerrit/.gitconfig)
pack.threads
• Threads used for Gerrit (JGit) garbage collection
• ¼ <#Cores> is a common choice
• Recommended values: S = 1, M = 4, L = 8
Step 4: Configure garbage collection (~gerrit/.gitconfig)
pack.windowMemory
• Use this setting to control how much memory (Java heap) is used for Gerrit garbage collection (JGit gc)
• ¼ of the configured Java heap is a common choice
• Recommended values: S = 1g, M = 4g, L = 8g
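The two "¼" rules for JGit gc can be derived from the core count and heap size and written out as a `[pack]` section for ~gerrit/.gitconfig. This is a minimal sketch with illustrative function names. Rounding down with a floor of 1 is an assumption made here for small machines.

```python
def gc_settings(cores, heap_gb):
    """Derive pack.threads and pack.windowMemory from the quarter rules."""
    return {
        "pack.threads": max(1, cores // 4),
        "pack.windowMemory": f"{max(1, heap_gb // 4)}g",
    }


def render_gitconfig(settings):
    """Render the pack.* settings as a git-config style [pack] section."""
    lines = ["[pack]"]
    for key, value in settings.items():
        name = key.split(".", 1)[1]  # drop the leading 'pack.'
        lines.append(f"\t{name} = {value}")
    return "\n".join(lines) + "\n"
```

For a size M server, `render_gitconfig(gc_settings(16, 16))` produces a section with `threads = 4` and `windowMemory = 4g`, matching the values on the slides.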
5. Deal With Heavy CI load
Step 5: Deal with heavy CI load: Push vs Poll
Notify your CI via push-based notification (stream-events) instead of polling
[Diagram: frequent polling (CI servers repeatedly ask "update?") vs. push-based notification (Gerrit announces "update!")]
Use Jenkins Gerrit Trigger Plugin
Use Jenkins Gerrit Trigger Plugin: Replication Config
Step 5: Deal with heavy CI load: Segregation
Mark CI users as BATCH users and have a separate thread pool
• CI users: resource starvation
• CI users with BATCH group: no resource starvation
Step 5: Deal with heavy CI load
sshd.batchThreads
• Threads reserved for users in a Gerrit group with the BATCH capability
• This allows separating CI users causing heavy load from human users by using different thread pools
• Recommend setting it; interactive users then have <sshd.threads> - <sshd.batchThreads> threads
• This can improve clone/push performance for human users significantly
• Default is 0
• Recommended values: S = 2, M = 4, L = 8
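Since sshd.batchThreads is deducted from sshd.threads, it is worth checking how many threads remain for human users. Here is a small sketch with an illustrative function name, using the S/M/L values from this deck in the comments.

```python
def interactive_threads(sshd_threads, batch_threads):
    """Threads left for interactive (human) users after the batch pool."""
    if batch_threads > sshd_threads:
        raise ValueError("sshd.batchThreads cannot exceed sshd.threads")
    return sshd_threads - batch_threads


# Per size, with the recommended values from the slides:
# S: 8 - 2, M: 32 - 4, L: 64 - 8
```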
Step 5: Deal with heavy CI load
sshd.commandStartThreads
• Threads used to process incoming ssh connection requests
• This setting should only be adjusted if you have CI systems that create bursts of connection requests in parallel. Especially in AOSP build environments, increasing this value helped reduce the average wait queue size
• Default is 2
• Recommended values: S = 2, M = 3, L = 5
Step 5: Deal with heavy CI load: Replication
Step 5: Deal with Heavy CI load (replication.config)
remote.NAME.timeout
• Seconds to wait for a network read or write to complete before giving up
• Especially in WAN environments, don’t let this clog your replication queue
• Default was 0 (unlimited)
• Recommended values: S = 30, M = 45, L = 60
Step 5: Deal with Heavy CI load (replication.config)
remote.NAME.threads
• Number of worker threads to dedicate to pushing to the repositories described by this remote
• The more threads, the lower the chance of getting clogged by one problematic repository
• Default is 1
• Recommended values: S = 2, M = 4, L = 8
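The two replication settings can be rendered as a remote section for replication.config. The sketch below uses the S/M/L values from this deck. The remote name and URL in the usage note are hypothetical placeholders, and the helper name is illustrative.

```python
# timeout (seconds) and threads per size, copied from the slides
REPLICATION = {
    "S": {"timeout": 30, "threads": 2},
    "M": {"timeout": 45, "threads": 4},
    "L": {"timeout": 60, "threads": 8},
}


def render_remote(name, url, size):
    """Render a [remote "NAME"] section with the recommended values."""
    values = REPLICATION[size]
    lines = [
        f'[remote "{name}"]',
        f"\turl = {url}",
        f"\ttimeout = {values['timeout']}",
        f"\tthreads = {values['threads']}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_remote("mirror", "gerrit2@mirror.example.com:/git/${name}.git", "M")` yields a section with `timeout = 45` and `threads = 4`. The host and path are made up for illustration.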
Follow Up Actions
• If you like our Cheat Sheet, share it: http://bit.ly/1kmpO7V
• Come up with an official “Gerrit T-Shirt Sizing” Approach
• Provide sample configurations for different T-Shirt sizes
• Adjust gerrit.config default options if completely off even for
small load
Summing up gerrit.config options

Section    Option               Gerrit default  S     M     L
sshd       threads              1.5*<core>      8     32    64
sshd       batchThreads         0               2     4     8
sshd       commandStartThreads  2               2     3     5
httpd      maxThreads           25              25    50    100
database   poolLimit            8               50    150   250
database   poolMaxIdle          4               16    16    16
core       packedGitLimit       10m             1g    4g    8g
core       packedGitWindowSize  8k              8k    16k   16k
core       packedGitOpenFiles   128             1024  2048  4096
container  heapLimit            -               4g    16g   32g
receive    timeout              2min            4min  4min  4min
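The summary values can be kept in one place and rendered into a gerrit.config fragment per T-shirt size. This is a minimal sketch, not an official sample configuration. The data is copied from the summary in this deck, and the function name is illustrative.

```python
SIZES = {
    "S": {
        "sshd": {"threads": 8, "batchThreads": 2, "commandStartThreads": 2},
        "httpd": {"maxThreads": 25},
        "database": {"poolLimit": 50, "poolMaxIdle": 16},
        "core": {"packedGitLimit": "1g", "packedGitWindowSize": "8k",
                 "packedGitOpenFiles": 1024},
        "container": {"heapLimit": "4g"},
        "receive": {"timeout": "4 min"},
    },
    "M": {
        "sshd": {"threads": 32, "batchThreads": 4, "commandStartThreads": 3},
        "httpd": {"maxThreads": 50},
        "database": {"poolLimit": 150, "poolMaxIdle": 16},
        "core": {"packedGitLimit": "4g", "packedGitWindowSize": "16k",
                 "packedGitOpenFiles": 2048},
        "container": {"heapLimit": "16g"},
        "receive": {"timeout": "4 min"},
    },
    "L": {
        "sshd": {"threads": 64, "batchThreads": 8, "commandStartThreads": 5},
        "httpd": {"maxThreads": 100},
        "database": {"poolLimit": 250, "poolMaxIdle": 16},
        "core": {"packedGitLimit": "8g", "packedGitWindowSize": "16k",
                 "packedGitOpenFiles": 4096},
        "container": {"heapLimit": "32g"},
        "receive": {"timeout": "4 min"},
    },
}


def render_gerrit_config(size):
    """Render the recommended options for one size as git-config text."""
    lines = []
    for section, options in SIZES[size].items():
        lines.append(f"[{section}]")
        for name, value in options.items():
            lines.append(f"\t{name} = {value}")
    return "\n".join(lines) + "\n"
```

The output is meant to be merged by hand into an existing gerrit.config rather than to replace it, since a real file carries many settings not covered here.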
Questions?
Johannes Nicolai
jnicolai@collab.net
www.collab.net
+1-650-228-2500
+1-888-778-9793
blogs.collab.net
twitter.com/collabnet
www.facebook.com/collabnet
www.linkedin.com/company/collabnet-inc

ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

Gerrit Performance Tuning Talk at Google Summit

  • 1. Copyright ©2015 CollabNet, Inc. All Rights Reserved. Google Gerrit User Summit 2015: How to Properly Tune and Size your Gerrit Backend. Johannes Nicolai, Director of Engineering, CollabNet. Thursday, June 2, 2016
  • 2. The CollabNet Gerrit Team
  • 3. Berlin Hackathon
  • 4. I want you for the Gerrit Hackathon in Spring 2016!
  • 5. TeamForge – A Full Development, Delivery and Collaboration System
      – Diagram labels: plan, code, review, build, test, release, deploy, operate, monitor
      – Governance, traceability, and IP security across tools, assets, processes, and teams
      – Scalability
  • 6. CollabNet TeamForge – Integrated Tools
  • 7. http://blogs.collab.net/git
  • 8. CollabNet Customers & Some Gerrit Stats
      – Some of our customers have
          – More than 5 million Git fetch requests daily
          – More than 10 Gerrit master servers
          – More than 40 different “geographies” and replication servers
          – More than 100,000 active developers
          – More than 10 TB of source code
          – More than 20,000 repositories
          – More than 1000 CI servers
      – Customer sectors: Financial; Healthcare; Global Services; Technology, Software and IoT; Government and Aerospace
  • 9. Git Sizing / Performance Tuning - FAQ
      – How many servers will I need?
      – Which cloning protocols to offer?
      – How to set those gazillion gerrit.config options?
      – How many CPUs and how much RAM will I need?
      – What the heck is pack size?
      – How often should you run garbage collection?
      – Does it make any difference whether I go with a native Git or JGit based backend?
      – How do you handle hundreds of polling CI users without compromising performance for your human end users?
      – What about clustering and replication?
  • 10. It Depends … One Size Fits All?
  • 11. One Size does not fit all.
  • 12. Status Quo on Gerrit Performance Tuning Info
      – https://code.google.com/p/gerrit/wiki/Scaling
      – Gerrit Mailing list
      – Tech Talks
      – Some very generic, mentioning only upper limits
          – “some larger installations use 48 cores”
          – “at least one has 1 TB RAM”
      – Other very specific, requiring test env with production load
          – “Your luck may vary with tweaking your jvm gc parameters. You may find that increasing the size of the young generation may help drastically reduce the amount of gc thrashing your server performs.”
      – Depend on hard to measure metrics
          – # parallel fetch requests
  • 13. Typical ops persona
      – Jack of all trades
      – Responsible for dozens of applications
      – No Gerrit expert knowledge
      – No Java expert knowledge
      – Basic Git knowledge
      – No access to special HW
      – Limited test bed (not same load pattern as on production)
      – No access to Gerrit multi master / GFS technology
      – No overview about all their user base (tens of thousands of developers in different geographies)
  • 14. Challenge
      – Tuning advice that is
          – actionable
          – not “one size fits all”
          – targeted at ops people with no expert Gerrit / JVM knowledge
          – only uses easy to measure factors for its recommendations
          – does not require special HW or test beds
          – not depending on proprietary Gerrit extensions/technology
      – Keep Motivation up to go through all of this
  • 15. Disclaimer
      – Some advice shown next is debatable & over simplified
      – Eager for your feedback in Q&A session / lunch break
      – However all recommendations have been verified at our customers & our performance lab
  • 16. Motivation (TeamForge == Gerrit 2.10.6). Numbers from http://bit.ly/1WxSiw9
  • 17. Motivation
  • 18. Motivation
  • 19. Gerrit Performance Tuning in 5 Steps (with T-shirt sizes S, M, L)
      1. Get your numbers
      2. Size your hardware
      3. Tune your gerrit.config
      4. Configure Garbage collection
      5. Deal with heavy CI load
  • 20. 1. Get Your Numbers
  • 21. Step 1: Get your numbers (Number of Users)
      – The number of users is only an indirect factor for Gerrit tuning, as most Git operations are done completely offline.
      – The more users you have, the more repositories and push/fetch requests you will probably encounter.
      – The majority of load is typically caused by build systems (CI). The biggest enterprise instance we have seen has 15k active users.
  • 22. Step 1: Get your numbers (Number of Repositories)
      – The number of repositories (Gerrit projects) determines how much disk space you need.
      – We have seen instances with more than 10k repositories but would not recommend more than 2500 per server.
  • 23. Step 1: Get your numbers (Protocol)
      – ssh allows you to use public key cryptography, which is stronger than passwords.
      – ssh is recommended for CI users as this allows push based notifications (see step 5).
      – http(s) seems to perform better if most of the operation time is spent on the connection request itself (not much data transferred, no heavy IO).
      – Hybrid approaches are possible.
  • 24. ssh vs https
  • 25. ssh vs https
  • 26. Step 1: Get your numbers (Repository Size)
      – Repository size determines the amount of storage you need on disk. In addition, it influences the memory needed during a clone request, as pack files have to be loaded and streamed.
      – The largest repository on disk should still fit in 1/4 of your heap.
      – Garbage collection across all projects takes longer the more repository data has to be processed.
      – Gerrit can handle at least 1 TB of total repository data easily.
  • 27. Step 1: Get your numbers (What are the fetch/pull requests and how many will I have per day?)
      – Fetch requests (git-upload-pack): git fetch, git pull, git clone
      – Push requests (git-receive-pack): git push
      – How to count #fetch requests per day: add the results of fgrep "git-upload-pack" sshd_log | wc -l and fgrep "git-upload-pack" httpd_log | wc -l
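The counting recipe on slide 27 can also be sketched in Python. This is only an illustration of the same idea (count log lines that mention `git-upload-pack`); the helper names `count_requests` and `count_in_files` are hypothetical, and it assumes, as the slide does, that each matching log line corresponds to one request and that the logs passed in cover a single day.

```python
from typing import Iterable

FETCH_MARKER = "git-upload-pack"   # logged for git fetch / git pull / git clone
PUSH_MARKER = "git-receive-pack"   # logged for git push


def count_requests(lines: Iterable[str], marker: str = FETCH_MARKER) -> int:
    """Count lines containing the marker, like `fgrep marker | wc -l`."""
    return sum(1 for line in lines if marker in line)


def count_in_files(paths: Iterable[str], marker: str = FETCH_MARKER) -> int:
    """Sum matching lines over several log files (for example sshd_log and httpd_log)."""
    total = 0
    for path in paths:
        # errors="replace" keeps the scan going if a log contains odd bytes
        with open(path, encoding="utf-8", errors="replace") as handle:
            total += count_requests(handle, marker)
    return total
```

For example, `count_in_files(["sshd_log", "httpd_log"])` would give the combined fetch count, and passing `PUSH_MARKER` would count pushes instead.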
  • 28. Step 1: Get your numbers (Number of Push Requests)
      – In most enterprise settings, push requests contribute less than one percent of the total number of operations. Because of this, their number can typically be neglected.
  • 29. Step 1: Get your numbers (Number of Fetch Requests)
      – This is probably the most important tuning factor. To improve throughput, fetch requests should be handled in parallel, but parallel cloning needs CPUs as well as memory.
      – A Gerrit server optimized for heavy load (32 cores, 32 GB RAM) can handle about 1M fetch requests per day, processing up to 50 in parallel.
  • 30. 2. Size Your Hardware (S, M, L)
  • 31. Step 2: Size your hardware
      – S: 100k requests/day, 4 cores, 4 GB RAM
      – M: 500k requests/day, 16 cores, 16 GB RAM
      – L: 1M requests/day, 32 cores, 32 GB RAM
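As a rough sketch, the T-shirt table on slide 31 can be turned into a lookup. The names `Size`, `SIZES` and `pick_size` are hypothetical, and treating each requests/day figure as an upper bound for its size is an assumption; the talk only lists the three data points.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Size:
    name: str
    max_requests_per_day: int  # assumption: the slide's figure is treated as a ceiling
    cores: int
    ram_gb: int


# Values taken from slide 31
SIZES = (
    Size("S", 100_000, 4, 4),
    Size("M", 500_000, 16, 16),
    Size("L", 1_000_000, 32, 32),
)


def pick_size(requests_per_day: int) -> Optional[Size]:
    """Return the smallest size covering the load, or None if it exceeds size L."""
    for size in SIZES:
        if requests_per_day <= size.max_requests_per_day:
            return size
    return None
```

A `None` result corresponds to the "> size L" case, for which the talk recommends setting up another server.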
  • 32. Step 2: Size your hardware (Number of Servers)
      – Whenever vertical scaling (a bigger single server) is no longer cost efficient (> size L), we recommend setting up another server.
      – If the number of repositories exceeds 2500, a new server should be used as well, or reviews will get painfully slow.
      – Use Gerrit's replication feature to sync repository content and permissions to servers in different geographies if the network is the limiting factor.
  • 33. Step 2: Size your hardware (Network)
      – The higher the network bandwidth, the less time it takes to fetch and push repositories. Depending on the average Git repository size and the number of parallel requests, network connectivity can become the primary bottleneck.
      – Most enterprises have Gigabit connections.
  • 34. Step 2: Size your hardware (Disk Storage)
      – Storage needs are determined by the Git repository sizes.
      – Fast storage (SSDs) really pays off, as git fetch, push and gc are all IO heavy.
  • 35. 3. Tune Your gerrit.config
  • 36. Step 3: Tune your gerrit.config (receive.timeout)
      – Timeout to process incoming changes and update refs and Gerrit changes
      – Default: 2 min
      – Recommended: S = 4 min, M = 4 min, L = 4 min
  • 37. Why ssh thread pooling is a good thing
  • 38. Step 3: Tune your gerrit.config (sshd.threads)
      – Threads to process ssh requests, limiting the number of possible parallel clones/pushes
      – sshd.batchThreads will be deducted from this number
      – Defaults to 1.5 * <#Cores>
      – Recommended: lim_{x→π/4} [sec(x)/sin(x)] * <#Cores> = 2 * <#Cores>
      – Recommended: S = 8, M = 32, L = 64
  • 39. Step 3: Tune your gerrit.config (httpd.maxThreads)
      – Threads to process http clone/push requests and review related activities
      – Default: 25
      – Recommended: S = 25, M = 50, L = 100
  • 40. Step 3: Tune your gerrit.config (database.poolLimit)
      – DB connections for Gerrit
      – A fetch/push request or a review action can consume multiple connections
      – Set it to at least sshd.threads + httpd.maxThreads
      – Default: 8
      – Recommended: S = 50, M = 150, L = 250
  • 41. Step 3: Tune your gerrit.config (database.poolMaxIdle)
      – Maximum number of idle DB connections kept in the pool; surplus idle connections are closed
      – As the DB pool size is typically increased from its default value, this parameter should be increased too
      – Default: 4
      – Recommended: S = 16, M = 16, L = 16
  • 42. Step 3: Tune your gerrit.config (container.heapLimit)
      – Java heap used for Gerrit. The more repository data Gerrit can cache in memory, the better
      – Allocate at least <#Cores> GB of heap to Gerrit
      – The largest repository on disk should still fit in ¼ of your heap. In our experience, 32 GB per 1M daily requests is pretty common
      – Recommended: S = 4g, M = 16g, L = 32g
  • 43. Step 3: Tune your gerrit.config (core.packedGitLimit)
      – Maximum cache size to store Git pack files in memory
      – The default of 10 MB is far too small if you frequently clone large repositories and want to cache their data
      – Recommended: ¼ of your heap size
      – Recommended: S = 1g, M = 4g, L = 8g
  • 44. Step 3: Tune your gerrit.config (core.packedGitWindowSize)
      – Number of bytes of a pack file to load into memory in a single read operation
      – 16k is a common choice
      – Default: 8k
      – Recommended: S = 8k, M = 16k, L = 16k
  • 45. Step 3: Tune your gerrit.config (core.packedGitOpenFiles)
      – Maximum number of pack files to have open at once
      – A value that is too small can cause repository corruption during gc
      – If you increase this to a larger setting, you may also need to adjust the ulimit on file descriptors for the host JVM, as Gerrit needs additional file descriptors for network sockets and other repository data manipulation
      – Default: 128
      – Recommended: S = 1024, M = 2048, L = 4096
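The Step 3 rules of thumb that depend on the core count can be sketched as a small calculation. This is a hedged illustration, not part of the talk: `derive_step3_settings` and `largest_repo_fits` are hypothetical names, the results are lower bounds rather than the talk's recommended S/M/L values (which are often higher), and sizes are expressed in megabytes to avoid rounding down to zero on small machines.

```python
from typing import Dict, Union


def derive_step3_settings(cores: int, http_max_threads: int = 25) -> Dict[str, Union[int, str]]:
    """Apply the Step 3 rules of thumb to a core count (illustrative lower bounds)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    ssh_threads = 2 * cores        # slide 38: 2 * <#Cores>
    heap_mb = cores * 1024         # slide 42: at least <#Cores> GB of heap
    return {
        "sshd.threads": ssh_threads,
        "httpd.maxThreads": http_max_threads,
        # slide 40: at least sshd.threads + httpd.maxThreads
        "database.poolLimit": ssh_threads + http_max_threads,
        "container.heapLimit": f"{heap_mb}m",
        # slide 43: a quarter of the heap
        "core.packedGitLimit": f"{heap_mb // 4}m",
    }


def largest_repo_fits(largest_repo_mb: int, heap_mb: int) -> bool:
    """Slides 26 and 42: the largest repository should fit in 1/4 of the heap."""
    return largest_repo_mb * 4 <= heap_mb
```

For a 4-core server this yields 8 ssh threads, a 4096m heap and a 1024m pack cache, which lines up with the S column; the computed database.poolLimit of 33 is below the talk's value of 50, which leaves extra headroom.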
  • 46. 4. Configure Garbage Collection
  • 47. Step 4: Configure garbage collection (~gerrit/.gitconfig) (gc.interval)
      – Determines how often Gerrit garbage collection (JGit gc) is run across all repositories
      – Running JGit gc frequently is crucial for good fetch/push performance as well as a smooth source code browsing experience
      – JGit gc is more efficient than command line git garbage collection and causes fewer problems with Gerrit running in parallel
      – Parameters to control JGit gc's resource consumption are in ~gerrit/.gitconfig
      – Don't forget to set gc.startTime for the initial garbage collection time
      – Recommended: S = 1 week, M = 3 days, L = 1 day
  • 48. Step 4: Configure garbage collection (~gerrit/.gitconfig) (pack.threads)
      – Threads used for Gerrit (JGit) garbage collection
      – ¼ * <#Cores> is a common choice
      – Recommended: S = 1, M = 4, L = 8
  • 49. Step 4: Configure garbage collection (~gerrit/.gitconfig) (pack.windowMemory)
      – Use this setting to control how much memory (Java heap) is used for Gerrit garbage collection (JGit gc)
      – ¼ of the configured Java heap is a common choice
      – Recommended: S = 1g, M = 4g, L = 8g
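The two "¼" rules from Step 4 can be written out as a short calculation. This is a minimal sketch under stated assumptions: `gc_settings` is a hypothetical helper, and clamping pack.threads to at least 1 on machines with fewer than four cores is an assumption the slides do not spell out.

```python
from typing import Dict, Union


def gc_settings(cores: int, heap_gb: int) -> Dict[str, Union[int, str]]:
    """Derive JGit gc settings from the 1/4 rules of thumb (slides 48 and 49)."""
    if cores < 1 or heap_gb < 1:
        raise ValueError("cores and heap_gb must be at least 1")
    return {
        # slide 48: a quarter of the cores; never fewer than one thread (assumption)
        "pack.threads": max(1, cores // 4),
        # slide 49: a quarter of the configured Java heap, expressed in megabytes
        "pack.windowMemory": f"{heap_gb * 1024 // 4}m",
    }
```

With the S, M and L hardware from Step 2 this gives 1, 4 and 8 threads and 1024m, 4096m and 8192m of window memory, matching the recommended values.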
  • 50. 5. Deal With Heavy CI Load
  • 51. Step 5: Deal with heavy CI load: Push vs Poll
      – Notify your CI push based (stream-events) instead of polling
      – Frequent polling: CI servers keep asking "update?"
      – Push based notification: Gerrit tells CI servers "update!"
  • 52. Use Jenkins Gerrit Trigger Plugin
  • 53. Use Jenkins Gerrit Trigger Plugin: Replication Config
  • 54. Step 5: Deal with heavy CI load: Segregation
      – Mark CI users as BATCH users and give them a separate thread pool
      – CI users sharing the pool: resource starvation
      – CI users in a BATCH group: no resource starvation
  • 55. Step 5: Deal with heavy CI load (sshd.batchThreads)
      – Threads reserved for users in a Gerrit group with the BATCH capability
      – This separates CI users causing heavy load from human users by putting them in different thread pools
      – Interactive users are left with <sshd.threads> - <sshd.batchThreads> threads
      – This can improve clone/push performance for human users significantly
      – Default: 0
      – Recommended: S = 2, M = 4, L = 8
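The split between the two ssh thread pools on slide 55 is simple arithmetic; a minimal sketch follows. `split_ssh_threads` is a hypothetical helper, and rejecting a batch pool that leaves no interactive threads is an added sanity check rather than something the talk states.

```python
from typing import Tuple


def split_ssh_threads(sshd_threads: int, batch_threads: int) -> Tuple[int, int]:
    """Return (interactive, batch) thread counts: interactive = sshd.threads - sshd.batchThreads."""
    if batch_threads < 0:
        raise ValueError("sshd.batchThreads must not be negative")
    if batch_threads >= sshd_threads:
        # assumption: at least one thread should remain for human users
        raise ValueError("sshd.batchThreads must be smaller than sshd.threads")
    return sshd_threads - batch_threads, batch_threads
```

With the recommended values, size S (8 threads, 2 batch) leaves 6 interactive threads and size L (64 threads, 8 batch) leaves 56.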
  • 56. Step 5: Deal with heavy CI load (sshd.commandStartThreads)
      – Threads used to process incoming ssh connection requests
      – This setting should only be adjusted if you have CI systems that create a burst of connection requests in parallel. Especially in AOSP build environments, increasing this value helped reduce the average wait queue size
      – Default: 2
      – Recommended: S = 2, M = 3, L = 5
  • 57. Step 5: Deal with heavy CI load: Replication
  • 58. Step 5: Deal with heavy CI load (replication.config) (remote.NAME.timeout)
      – Seconds to wait for a network read or write to complete before giving up
      – Especially in WAN environments, don’t let this clog your replication queue
      – Default was 0 (unlimited)
      – Recommended: S = 30, M = 45, L = 60
  • 59. Step 5: Deal with heavy CI load (replication.config) (remote.NAME.threads)
      – Number of worker threads to dedicate to pushing to the repositories described by this remote
      – The more threads, the lower the chance of the queue getting clogged by one problematic repository
      – Default: 1
      – Recommended: S = 2, M = 4, L = 8
  • 60. Follow Up Actions
      – If you like our Cheat Sheet, share it: http://bit.ly/1kmpO7V
      – Come up with an official “Gerrit T-Shirt Sizing” Approach
      – Provide sample configurations for different T-Shirt sizes
      – Adjust gerrit.config default options if completely off even for small load
  • 61. Summing up gerrit.config options
      Section    Option               Gerrit default   S      M      L
      sshd       threads              1.5*<core>       8      32     64
      sshd       batchThreads         0                2      4      8
      sshd       commandStartThreads  2                2      3      5
      httpd      maxThreads           25               25     50     100
      database   poolLimit            8                50     150    250
      database   poolMaxIdle          4                16     16     16
      core       packedGitLimit       10m              1g     4g     8g
      core       packedGitWindowSize  8k               8k     16k    16k
      core       packedGitOpenFiles   128              1024   2048   4096
      container  heapLimit            -                4g     16g    32g
      receive    timeout              2min             4min   4min   4min
  • 62. Questions? Johannes Nicolai, jnicolai@collab.net, www.collab.net, +1-650-228-2500, +1-888-778-9793, blogs.collab.net, twitter.com/collabnet, www.facebook.com/collabnet, www.linkedin.com/company/collabnet-inc