Design Choices of Golang
for High Scalability
SeongJae Park <sj38.park@gmail.com>
This work by SeongJae Park is licensed under the Creative
Commons Attribution-ShareAlike 3.0 Unported License. To
view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/3.0/.
These slides were presented during
GDG Seoul Meetup 201709
(https://www.meetup.com/GDG-Seoul/events/242054608/)
Nice To Meet You
SeongJae Park
sj38.park@gmail.com
Part-time Linux kernel programmer at KOSSLAB
What Makes Golang So Special on Multicore?
● People say Go is a good choice for high performance and scalability
● Why is scalability so important?
● Why are existing solutions not sufficient?
● What makes Go so special for these problems?
● TL;DR: Goroutines, dynamic stack management, and the integrated poller
DISCLAIMER: This talk is based on Dave Cheney’s OSCON 2015 presentation
(http://cdn.oreillystatic.com/en/assets/1/event/129/High%20performance%20servers%20without%20the%20event%20loop%20Presentation.pdf)
Why Scalability?
A long time ago, in a galaxy far, far away...
Moore’s Law
https://www.karlrupp.net/wp-content/uploads/2015/06/35years.png
(Figure series: number of transistors, single-thread performance, clock speed, power in watts, number of cores)
● Law: the number of transistors per square inch doubles roughly every two years (often quoted as 18 months)
● CPU vendors used the law to raise CPU clock speed; all that programmers
needed for better performance was the patience to wait for the free lunch
● However, CPU clock speed stopped increasing over a decade ago
Why No Clock Speed?
● Electrons move between transistors on every clock tick
(Clock speed is analogous to how fast the switch in the circuit diagram below is turned on and off)
● Moving anything requires energy; here, electrical energy is used
● Some of that electrical energy is not converted into kinetic energy but leaks out
as heat, so the temperature rises
● High temperature damages the CPU
● In short, increasing clock speed amplifies power consumption, heat
dissipation, and CPU damage
http://fourthgradespace.weebly.com/uploads/1/3/3/9/13397069/2935717_orig.jpg
https://i.ytimg.com/vi/9S9vP2inD_U/maxresdefault.jpg
Moore’s Law Is Still There, Vendors Have Changed
● At the same clock speed, two 0.5-square-inch processors would consume about
as much power as a single 1-square-inch processor
(The total distance electrons move per clock would be similar)
● Vendors therefore now prefer to supply multi-core processors
http://happierhuman.wpengine.netdna-cdn.com/wp-content/uploads/2012/11/One-cookie-vs-two-cookies.jpg
Parallelism is Not Free
● A multi-core system cannot help programs that have no concurrency
● Merely increasing concurrency does not guarantee a proportional speedup;
clumsy concurrency control can even make things worse on multi-core
● Go has made important design choices for highly scalable concurrency
control. The remainder of this talk describes some of those choices
https://img.devrant.io/devrant/rant/r_373632_a3SmV.jpg
Context Management
Process? Thread? Goroutine!
Resource Sharing and Context
● Concurrent tasks share processors and memory
(Number of tasks is usually larger than number of processors)
● To pause and resume an execution, the context of the task must be managed
○ Context in this context: pointer to the next instruction, stack frames, data in registers, ...
https://headguruteacher.files.wordpress.com/2017/05/x20142711071202qitokro-s8uda-pagespeed-ic-afnisfpvf0.jpg?w=640
Process: Analogous to a room for lease
● Abstraction of an execution of given program
● Process context switching requires many expensive operations
○ Choosing the process to run next and managing waiting / pending processes
○ Backing up all current CPU registers and restoring the registers of the next process from its last backup
○ Flushing the virtual memory mapping cache (TLB)
○ All of the above must run in the operating system kernel, which means a transition
between user mode and kernel mode
https://www.youtube.com/watch?v=4OclkGRLuxw
Thread: a.k.a Light-Weight Process
● Threads are similar to processes, but they share an address space
● Because the address space is shared, a thread context is smaller than a process
context; threads are faster than processes to create and to switch
● Context switch overhead still exists
https://www.topdraw.com/assets/uploads/2015/04/standing-desk.jpg
Goroutine
● Not thread, not coroutine, goroutine.
● Go’s main primitive for concurrent task execution
● Designed to keep context overhead to a minimum
http://edinburghopendata.info/wp-content/uploads/2015/05/141107-hackathon_18_d893499f2c13fe1fa05bd46252246b1e.jpg
Goroutine: Co-operative scheduling
● Cooperative scheduling minimizes the number of context switches itself
● Goroutines switch context only in well-defined situations
○ Channel send / receive operations
○ `go` statements
○ Blocking system calls (file or network I/O)
○ Garbage collection
● If goroutines do not cooperate, starvation is possible; Go 1.14 and later mitigate this with asynchronous preemption
(https://gist.github.com/sjp38/dcdb6295e10f1cfe919b)
https://renegadeinc.com/wp-content/uploads/2016/05/RInc-Cooperation-1969.jpg
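As a rough illustration of the hand-off points listed on this slide, the sketch below bounces a counter between two goroutines over unbuffered channels. Every send and receive is a point where the running goroutine can be parked and the other one resumed. The function name `pingPong` and the channel names are made up for this example and do not come from the talk.

```go
package main

import "fmt"

// pingPong bounces a counter between the caller and a helper goroutine.
// Each channel operation is a well-defined point where the Go scheduler
// may park the current goroutine and run the other one.
func pingPong(rounds int) int {
	ping := make(chan int) // unbuffered: a send blocks until the peer receives
	pong := make(chan int)

	go func() {
		for v := range ping {
			pong <- v + 1 // hand control back to the caller
		}
		close(pong)
	}()

	counter := 0
	for i := 0; i < rounds; i++ {
		ping <- counter         // caller is parked until the helper receives
		counter = <-pong + 1    // caller is parked until the helper replies
	}
	close(ping)
	<-pong // wait until the helper has closed pong and finished
	return counter
}

func main() {
	// Each round adds 1 in the helper and 1 in the caller.
	fmt.Println(pingPong(1000))
}
```

No locks or explicit yields are needed here; the channel operations alone decide when each goroutine runs.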
Goroutine: Minimized Context
● For processes and threads, the kernel must back up and restore all
registers because it does not know which registers are actually in use
● The Go compiler emits code that checks which registers are actually in use and backs up
only those at each context switch
https://i.pinimg.com/originals/c3/38/5f/c3385f909b2d2c36877f7ad02f841471.jpg
http://www.cohoots.info/wp-content/uploads/2017/07/coworking-space-Co-Hoots.jpg
Goroutine: User-space scheduling
● M goroutines are multiplexed onto N kernel threads by the user-space Go runtime
scheduler
● Switching between goroutines needs no transition between user mode and kernel mode
https://image.slidesharecdn.com/realtime-linux-140810101151-phpapp02/95/making-linux-do-hard-realtime-74-638.jpg?cb=1429570932
Goroutine: Minimized Context Switch Overhead
● Minimize the number of context switches
● Minimize the size of the context
● No transition between user mode and kernel mode at all
● As a result, tens of thousands of goroutines in a single process are the norm
https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_SHARE.png
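To make the "tens of thousands of goroutines" claim concrete, here is a minimal sketch that parks a large number of goroutines at once and then releases them. The helper name `spawnParked` and the count of 50,000 are assumptions chosen for illustration; the measured number comes from `runtime.NumGoroutine`, which also counts the calling goroutine.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnParked starts n goroutines that all block on the same channel,
// records how many goroutines exist while they are parked, then releases
// them. Goroutine i adds i to total, so total ends up as 1+2+...+n.
func spawnParked(n int) (alive int, total int) {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)
	start := make(chan struct{})

	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			<-start // parked here until start is closed
			mu.Lock()
			total += v
			mu.Unlock()
		}(i)
	}

	alive = runtime.NumGoroutine() // all n goroutines still exist at this point
	close(start)                   // wake every parked goroutine
	wg.Wait()
	return alive, total
}

func main() {
	alive, total := spawnParked(50000)
	fmt.Println("goroutines alive while parked:", alive)
	fmt.Println("sum computed by goroutines:", total)
}
```

The parked goroutines cost only their small stacks and bookkeeping in the runtime; no kernel thread is tied to each of them.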
Stack Management
Finding optimal size of stack
● The stack is the storage for a task’s call frames
○ Each call frame stores where to return, parameters, and local variables
● It must not overlap with the stacks of other concurrent tasks
(Diagram: a stack laid out from high to low addresses; a stack frame holds parameters, return address, and local variables, and is delimited by the stack frame pointer and the stack pointer; the stack grows downward)
Stack Management of Threads
● Threads allocate a fixed-size stack when they are created
● By default, 2 MiB on Linux/x86-32. With the NPTL implementation of the pthreads
library, the stack size can be specified at thread creation time
● A stack size that is too large can limit the number of concurrent threads
http://docs.roguewave.com/legacy-hpp/thrug/images/stackallocation.gif
Stack Management of Goroutines
● The compiler knows how much stack each function requires
● A goroutine starts with a very small stack
● On each function call, Go checks whether the current stack can accommodate
the function’s stack requirement; if the current stack is not sufficient,
the stack is grown
● The stack can be shrunk, too
● As a result, each goroutine keeps only as much stack as it needs, which allows
the maximum number of concurrent goroutines
func f() {
	g()
}

go func() {
	f()
}()
Compiler: f() requires 1 KiB of stack, g() requires 1.5 KiB of stack
Goroutine starts with a 2 KiB stack
f() will use 1 KiB. The current stack (2 KiB free) is enough
g() will use 1.5 KiB. The current stack (1 KiB free) is not enough. Allocate a bigger stack!
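The growth described above can be observed indirectly. In the sketch below, a goroutine recurses 100,000 levels deep, which needs far more stack than the small initial allocation mentioned on the slide, yet it completes because the runtime enlarges the stack on demand. The names `depth` and `deepInGoroutine` are hypothetical helpers written for this example.

```go
package main

import "fmt"

// depth recurses n levels and returns n. The addition happens after the
// recursive call returns, so every level keeps its own frame on the stack.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + depth(n-1)
}

// deepInGoroutine runs depth(n) in a fresh goroutine, which starts with a
// small stack that the runtime grows as the recursion gets deeper.
func deepInGoroutine(n int) int {
	done := make(chan int)
	go func() {
		done <- depth(n)
	}()
	return <-done
}

func main() {
	fmt.Println(deepInGoroutine(100000))
}
```

With a fixed-size thread stack, the programmer would have to pick the size up front; here the goroutine pays for deep stacks only when it actually needs them.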
C10K Problem without Event Loop
Events? Threads? Goroutines and Integrated Poller!
C10K Problem
● How to hold 10,000 concurrent sessions
● 10,000 threads for 10,000 sessions would incur high overhead
● Event loops usually result in complex callback spaghetti code
https://www.youtube.com/watch?v=SgjAv1TnS5k
Integrated Poller: Goroutines Allocation
● Allocate 10,000 goroutines for 10,000 concurrent sessions;
don’t worry, goroutine creation is fast enough, and
tens of thousands of goroutines in a single process is the norm
● Goroutines waiting for events are simply scheduled out;
the Go scheduler does not need to increase the number of threads under the hood because
most goroutines are scheduled out while waiting for slow events to complete
https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_MIC_DROP.png https://github.com/ashleymcnamara/gophers/blob/master/DRAWING_GOPHER.png
Integrated Poller: Polling and Scheduling
● The Go runtime uses select / kqueue / epoll / IOCP to find out which sockets are
ready, on behalf of the goroutines waiting on them
● Because the runtime knows which goroutine is waiting for each socket, it puts that
goroutine back on the same CPU as soon as the socket is ready
● In short, waiting for events and waking up the appropriate goroutine is delegated to
the Go runtime
● As a result, gophers get a simple programming model with reasonable
context management overhead
https://talks.golang.org/2012/waza.slide#22
Conclusion
● Go is special on multi-core systems owing to its clever design choices
● Goroutines are very cheap and fast in terms of context management
● Dynamic stack sizing for goroutines allows more concurrency
● The integrated poller in Go gives gophers the benefits of both threads and event
loops without their drawbacks
https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_LEARN.png
Thank You

More Related Content

What's hot

OPTEE on QEMU - Build Tutorial
OPTEE on QEMU - Build TutorialOPTEE on QEMU - Build Tutorial
OPTEE on QEMU - Build TutorialDalton Valadares
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement Linaro
 
Rh developers fat jar smackdown
Rh developers   fat jar smackdownRh developers   fat jar smackdown
Rh developers fat jar smackdownRed Hat Developers
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthScyllaDB
 
Get Lower Latency and Higher Throughput for Java Applications
Get Lower Latency and Higher Throughput for Java ApplicationsGet Lower Latency and Higher Throughput for Java Applications
Get Lower Latency and Higher Throughput for Java ApplicationsScyllaDB
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
Whoops! I Rewrote It in Rust
Whoops! I Rewrote It in RustWhoops! I Rewrote It in Rust
Whoops! I Rewrote It in RustScyllaDB
 
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...Fwdays
 
GraalVM - OpenSlava 2019-10-18
GraalVM - OpenSlava 2019-10-18GraalVM - OpenSlava 2019-10-18
GraalVM - OpenSlava 2019-10-18Jorge Hidalgo
 
G1: To Infinity and Beyond
G1: To Infinity and BeyondG1: To Infinity and Beyond
G1: To Infinity and BeyondScyllaDB
 
GraalVM - MadridJUG 2019-10-22
GraalVM - MadridJUG 2019-10-22GraalVM - MadridJUG 2019-10-22
GraalVM - MadridJUG 2019-10-22Jorge Hidalgo
 
Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署Bo-Yi Wu
 
Make Their Short Film By Loris Greaud
Make Their Short Film By Loris GreaudMake Their Short Film By Loris Greaud
Make Their Short Film By Loris GreaudLoris Greaud
 
GraalVM - JBCNConf 2019-05-28
GraalVM - JBCNConf 2019-05-28GraalVM - JBCNConf 2019-05-28
GraalVM - JBCNConf 2019-05-28Jorge Hidalgo
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK toolsHaribabu Nandyal Padmanaban
 
Keeping Latency Low and Throughput High with Application-level Priority Manag...
Keeping Latency Low and Throughput High with Application-level Priority Manag...Keeping Latency Low and Throughput High with Application-level Priority Manag...
Keeping Latency Low and Throughput High with Application-level Priority Manag...ScyllaDB
 
Embedded Recipes 2018 - swupdate: update your embedded device - Charles-Anto...
Embedded Recipes 2018 -  swupdate: update your embedded device - Charles-Anto...Embedded Recipes 2018 -  swupdate: update your embedded device - Charles-Anto...
Embedded Recipes 2018 - swupdate: update your embedded device - Charles-Anto...Anne Nicolas
 
WebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck ThreadsWebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck ThreadsMaarten Smeets
 
Rust Is Safe. But Is It Fast?
Rust Is Safe. But Is It Fast?Rust Is Safe. But Is It Fast?
Rust Is Safe. But Is It Fast?ScyllaDB
 

What's hot (20)

OPTEE on QEMU - Build Tutorial
OPTEE on QEMU - Build TutorialOPTEE on QEMU - Build Tutorial
OPTEE on QEMU - Build Tutorial
 
BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement BUD17-218: Scheduler Load tracking update and improvement
BUD17-218: Scheduler Load tracking update and improvement
 
Rh developers fat jar smackdown
Rh developers   fat jar smackdownRh developers   fat jar smackdown
Rh developers fat jar smackdown
 
Using eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster HealthUsing eBPF to Measure the k8s Cluster Health
Using eBPF to Measure the k8s Cluster Health
 
Get Lower Latency and Higher Throughput for Java Applications
Get Lower Latency and Higher Throughput for Java ApplicationsGet Lower Latency and Higher Throughput for Java Applications
Get Lower Latency and Higher Throughput for Java Applications
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Whoops! I Rewrote It in Rust
Whoops! I Rewrote It in RustWhoops! I Rewrote It in Rust
Whoops! I Rewrote It in Rust
 
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...
Dmytro Patkovskyi "Practical tips regarding build optimization for those who ...
 
GraalVM - OpenSlava 2019-10-18
GraalVM - OpenSlava 2019-10-18GraalVM - OpenSlava 2019-10-18
GraalVM - OpenSlava 2019-10-18
 
G1: To Infinity and Beyond
G1: To Infinity and BeyondG1: To Infinity and Beyond
G1: To Infinity and Beyond
 
GraalVM - MadridJUG 2019-10-22
GraalVM - MadridJUG 2019-10-22GraalVM - MadridJUG 2019-10-22
GraalVM - MadridJUG 2019-10-22
 
Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署Drone CI/CD 自動化測試及部署
Drone CI/CD 自動化測試及部署
 
Make Their Short Film By Loris Greaud
Make Their Short Film By Loris GreaudMake Their Short Film By Loris Greaud
Make Their Short Film By Loris Greaud
 
GraalVM - JBCNConf 2019-05-28
GraalVM - JBCNConf 2019-05-28GraalVM - JBCNConf 2019-05-28
GraalVM - JBCNConf 2019-05-28
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
 
Keeping Latency Low and Throughput High with Application-level Priority Manag...
Keeping Latency Low and Throughput High with Application-level Priority Manag...Keeping Latency Low and Throughput High with Application-level Priority Manag...
Keeping Latency Low and Throughput High with Application-level Priority Manag...
 
Embedded Recipes 2018 - swupdate: update your embedded device - Charles-Anto...
Embedded Recipes 2018 -  swupdate: update your embedded device - Charles-Anto...Embedded Recipes 2018 -  swupdate: update your embedded device - Charles-Anto...
Embedded Recipes 2018 - swupdate: update your embedded device - Charles-Anto...
 
Docker & ci
Docker & ciDocker & ci
Docker & ci
 
WebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck ThreadsWebLogic Stability; Detect and Analyse Stuck Threads
WebLogic Stability; Detect and Analyse Stuck Threads
 
Rust Is Safe. But Is It Fast?
Rust Is Safe. But Is It Fast?Rust Is Safe. But Is It Fast?
Rust Is Safe. But Is It Fast?
 

Similar to Design choices of golang for high scalability

Improve the deployment process step by step
Improve the deployment process step by stepImprove the deployment process step by step
Improve the deployment process step by stepDaniel Fahlke
 
Vietnam qa meetup
Vietnam qa meetupVietnam qa meetup
Vietnam qa meetupSyam Sasi
 
Concurrency - Why it's hard ?
Concurrency - Why it's hard ?Concurrency - Why it's hard ?
Concurrency - Why it's hard ?Ramith Jayasinghe
 
SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0Mike Belshe
 
Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cAjith Narayanan
 
Rally--OpenStack Benchmarking at Scale
Rally--OpenStack Benchmarking at ScaleRally--OpenStack Benchmarking at Scale
Rally--OpenStack Benchmarking at ScaleMirantis
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesAlexander Penev
 
From nothing to a video under 2 seconds / Mikhail Sychev (YouTube)
From nothing to a video under 2 seconds / Mikhail Sychev  (YouTube)From nothing to a video under 2 seconds / Mikhail Sychev  (YouTube)
From nothing to a video under 2 seconds / Mikhail Sychev (YouTube)Ontico
 
Cloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesCloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesLars Rosenquist
 
Cloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesCloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesLars Rosenquist
 
Stor c gregynog colloquium
Stor c   gregynog colloquiumStor c   gregynog colloquium
Stor c gregynog colloquiumgregynog
 
Web performance optimization - MercadoLibre
Web performance optimization - MercadoLibreWeb performance optimization - MercadoLibre
Web performance optimization - MercadoLibrePablo Moretti
 
Benchmarking for HTTP/2
Benchmarking for HTTP/2Benchmarking for HTTP/2
Benchmarking for HTTP/2Kit Chan
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkFlink Forward
 
The Dark Side Of Go -- Go runtime related problems in TiDB in production
The Dark Side Of Go -- Go runtime related problems in TiDB  in productionThe Dark Side Of Go -- Go runtime related problems in TiDB  in production
The Dark Side Of Go -- Go runtime related problems in TiDB in productionPingCAP
 
Async Web Frameworks in Python
Async Web Frameworks in PythonAsync Web Frameworks in Python
Async Web Frameworks in PythonRyan Johnson
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3LibbySchulze
 

Similar to Design choices of golang for high scalability (20)

Improve the deployment process step by step
Improve the deployment process step by stepImprove the deployment process step by step
Improve the deployment process step by step
 
Vietnam qa meetup
Vietnam qa meetupVietnam qa meetup
Vietnam qa meetup
 
Why Concurrency is hard ?
Why Concurrency is hard ?Why Concurrency is hard ?
Why Concurrency is hard ?
 
Concurrency - Why it's hard ?
Concurrency - Why it's hard ?Concurrency - Why it's hard ?
Concurrency - Why it's hard ?
 
Cinder Updates - Liberty Edition
Cinder Updates - Liberty Edition Cinder Updates - Liberty Edition
Cinder Updates - Liberty Edition
 
SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0
 
Performance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12cPerformance Tuning Oracle Weblogic Server 12c
Performance Tuning Oracle Weblogic Server 12c
 
Hot deploy
Hot deployHot deploy
Hot deploy
 
Rally--OpenStack Benchmarking at Scale
Rally--OpenStack Benchmarking at ScaleRally--OpenStack Benchmarking at Scale
Rally--OpenStack Benchmarking at Scale
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 
From nothing to a video under 2 seconds / Mikhail Sychev (YouTube)
From nothing to a video under 2 seconds / Mikhail Sychev  (YouTube)From nothing to a video under 2 seconds / Mikhail Sychev  (YouTube)
From nothing to a video under 2 seconds / Mikhail Sychev (YouTube)
 
Cloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesCloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud Pipelines
 
Cloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud PipelinesCloud Native CI/CD with Spring Cloud Pipelines
Cloud Native CI/CD with Spring Cloud Pipelines
 
Stor c gregynog colloquium
Stor c   gregynog colloquiumStor c   gregynog colloquium
Stor c gregynog colloquium
 
Web performance optimization - MercadoLibre
Web performance optimization - MercadoLibreWeb performance optimization - MercadoLibre
Web performance optimization - MercadoLibre
 
Benchmarking for HTTP/2
Benchmarking for HTTP/2Benchmarking for HTTP/2
Benchmarking for HTTP/2
 
Where is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
 
The Dark Side Of Go -- Go runtime related problems in TiDB in production
The Dark Side Of Go -- Go runtime related problems in TiDB  in productionThe Dark Side Of Go -- Go runtime related problems in TiDB  in production
The Dark Side Of Go -- Go runtime related problems in TiDB in production
 
Async Web Frameworks in Python
Async Web Frameworks in PythonAsync Web Frameworks in Python
Async Web Frameworks in Python
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 

Design choices of golang for high scalability

  • 1. Design Choices of Golang for High Scalability SeongJae Park <sj38.park@gmail.com>
  • 2. This work by SeongJae Park is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/.
  • 3. These slides were presented during GDG Seoul Meetup 201709 (https://www.meetup.com/GDG-Seoul/events/242054608/)
  • 4. Nice To Meet You SeongJae Park sj38.park@gmail.com Part-time Linux kernel programmer at KOSSLAB
  • 5. What Makes Golang So Special on Multicore? ● People say Go is a good choice for high performance and scalability ● Why is scalability so important? ● Why are existing solutions not sufficient? ● What makes Go so special for these problems? ● TL;DR: Goroutines, dynamic stack management, and the integrated poller DISCLAIMER: This talk is based on Dave Cheney’s OSCON15 presentation (http://cdn.oreillystatic.com/en/assets/1/event/129/High%20performance%20servers%20without%20the%20event%20loop%20Presentation.pdf)
  • 6. Why Scalability? A long time ago, in a galaxy far, far away...
  • 10. Moore’s Law ● Law: the number of transistors per square inch doubles roughly every 18 months ● CPU vendors used the law to raise CPU clock speed; the only thing programmers needed for better performance was the patience to wait for the free lunch ● However, CPU clock speed stopped increasing over a decade ago (Chart: https://www.karlrupp.net/wp-content/uploads/2015/06/35years.png, plotting number of transistors, single-thread performance, clock speed, power in watts, and number of cores)
  • 16. Why No Clock Speed? ● Electrons move between transistors on every clock (clock speed is analogous to the switch on/off speed in the circuit diagram below) ● Moving a thing requires energy; here we use electrical energy ● Some of the electrical energy is not converted to kinetic energy and is lost as heat, so the temperature rises ● High temperature damages the CPU ● In short, increasing clock speed amplifies power consumption, heat dissipation, and CPU damage http://fourthgradespace.weebly.com/uploads/1/3/3/9/13397069/2935717_orig.jpg https://i.ytimg.com/vi/9S9vP2inD_U/maxresdefault.jpg
  • 17. Moore’s Law Is Still There, Vendors Have Changed ● At the same clock speed, two 0.5-square-inch processors would consume about as much power as a single 1-square-inch processor (the total distance electrons move per clock would be similar) ● Vendors therefore now prefer to ship multi-core processors http://happierhuman.wpengine.netdna-cdn.com/wp-content/uploads/2012/11/One-cookie-vs-two-cookies.jpg
  • 18. Parallelism is Not Free ● A multi-core system cannot help programs that have no concurrency ● Just increasing concurrency does not guarantee proportional speedup; clumsy concurrency control can make things even worse on multi-core ● Go has made important design choices for highly scalable concurrency control. The remainder of this talk describes some of them https://img.devrant.io/devrant/rant/r_373632_a3SmV.jpg
  • 20. Resource Sharing and Context ● Concurrent tasks share processors and memory (the number of tasks is usually larger than the number of processors) ● To pause and resume a task, its context must be saved and restored ○ Context in this context: pointer to the next instruction, stack frames, data in registers, ... https://headguruteacher.files.wordpress.com/2017/05/x20142711071202qitokro-s8uda-pagespeed-ic-afnisfpvf0.jpg?w=640
  • 25. Process: Analogous to a room for lease ● Abstraction of an execution of a given program ● Process context switching requires many expensive operations ○ Finding the process to run next and managing waiting / pending processes ○ Saving all current CPU registers and restoring all CPU registers from the last save of the next process ○ Flushing the virtual memory mapping cache (TLB) ○ All of the above run in the operating system kernel, which means a switch between user mode and kernel mode https://www.youtube.com/watch?v=4OclkGRLuxw
  • 26. Thread: a.k.a. Light-Weight Process ● Threads are similar to processes, but they share an address space ● Because of address space sharing, a thread context is smaller than a process context; threads are faster than processes to create and to switch ● Context switch overhead still exists https://www.topdraw.com/assets/uploads/2015/04/standing-desk.jpg
  • 27. Goroutine ● Not thread, not coroutine, goroutine. ● Major primitive of Go for concurrent task execution ● Designed to keep context overhead minimal http://edinburghopendata.info/wp-content/uploads/2015/05/141107-hackathon_18_d893499f2c13fe1fa05bd46252246b1e.jpg
  • 33. Goroutine: Co-operative scheduling ● Cooperative scheduling minimizes context switching itself ● Goroutines switch context only in well-defined situations ○ Channel send / receive operation ○ `go` statement ○ Blocking system calls (file or network I/O) ○ Garbage collection ● If goroutines are not cooperative, starvation is possible (https://gist.github.com/sjp38/dcdb6295e10f1cfe919b) https://renegadeinc.com/wp-content/uploads/2016/05/RInc-Cooperation-1969.jpg
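The switch points listed on slide 33 can be illustrated with a minimal sketch. The function name `pingPong` and the message strings are made up for this example. Two goroutines hand a counter back and forth over unbuffered channels, so every send and receive is a point where the scheduler can park one goroutine and resume the other. Note that Go releases since 1.14 can also preempt goroutines asynchronously, so the starvation concern on the slide applies mainly to older runtimes.

```go
package main

import "fmt"

// pingPong bounces a counter between the calling goroutine and a worker
// goroutine over unbuffered channels. Each send or receive blocks until the
// other side is ready, which gives the scheduler a well-defined switch point.
func pingPong(rounds int) []string {
	ping := make(chan int)
	pong := make(chan int)
	var events []string

	go func() {
		for n := range ping {
			events = append(events, fmt.Sprintf("worker got %d", n))
			pong <- n + 1
		}
		close(pong)
	}()

	n := 0
	for i := 0; i < rounds; i++ {
		ping <- n // hand control to the worker
		n = <-pong // wait until the worker hands it back
		events = append(events, fmt.Sprintf("main got %d", n))
	}
	close(ping)
	<-pong // returns once the worker has closed pong and is exiting
	return events
}

func main() {
	for _, e := range pingPong(3) {
		fmt.Println(e)
	}
}
```

The channel operations also order the appends to `events`, so the slice needs no extra locking in this sketch.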
  • 35. Goroutine: Minimized Context ● For processes and threads, the kernel must save and restore all registers because it does not know which registers are actually in use ● The Go compiler emits code that checks which registers are actually in use and saves only those at every context switch https://i.pinimg.com/originals/c3/38/5f/c3385f909b2d2c36877f7ad02f841471.jpg http://www.cohoots.info/wp-content/uploads/2017/07/coworking-space-Co-Hoots.jpg
  • 36. Goroutine: User-space scheduling ● M goroutines are multiplexed onto N kernel threads by the user-space Go runtime scheduler ● No transition between user mode and kernel mode https://image.slidesharecdn.com/realtime-linux-140810101151-phpapp02/95/making-linux-do-hard-realtime-74-638.jpg?cb=1429570932
  • 37. Goroutine: Minimized Context Switch Overhead ● Minimize context switching ● Minimize the size of the context ● No transition between user mode and kernel mode at all ● As a result, tens of thousands of goroutines in a single process are the norm https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_SHARE.png
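As a rough illustration of "tens of thousands of goroutines in a single process", the sketch below starts one goroutine per input value and combines the results. The function name `sumSquares` and the workload are assumptions chosen only to keep the example checkable; the `sync` and `runtime` calls are from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares starts one goroutine per value in 1..n and adds up the squares.
// The runtime multiplexes all of these goroutines onto a small number of
// kernel threads.
func sumSquares(n int) int64 {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int64
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			sq := v * v
			mu.Lock() // protect the shared total
			total += sq
			mu.Unlock()
		}(int64(i))
	}
	wg.Wait()
	return total
}

func main() {
	// GOMAXPROCS(0) reports, without changing it, how many threads may
	// execute Go code simultaneously.
	fmt.Println("threads running Go code at once:", runtime.GOMAXPROCS(0))
	fmt.Println("sum of squares 1..10000:", sumSquares(10000))
}
```

Creating the same number of kernel threads would typically be far more expensive, which is the contrast slides 26 to 37 draw.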
  • 39. Stack ● The stack stores a task’s call frames ○ Each call frame stores the return address, parameters, and local variables ● It must not overlap with another concurrent task’s stack (Diagram: a stack frame holding parameters, return address, and local variables, delimited by the stack frame pointer and the stack pointer; the stack grows downward from high to low addresses)
  • 42. Stack Management of Threads ● Threads allocate a fixed-size stack when created ● By default, 2 MiB on Linux/x86-32. With the NPTL implementation of the pthreads library, the stack size can be specified at thread creation time ● Too large a stack size can limit the number of concurrent threads http://docs.roguewave.com/legacy-hpp/thrug/images/stackallocation.gif
  • 47. Stack Management of Goroutines ● The compiler knows how much stack a given function requires ● A goroutine starts with a very small stack ● Just before a function call, Go checks whether the current stack can accommodate the function’s stack requirement; if it cannot, Go increases the stack size ● The stack can be shrunk, too ● As a result, goroutines keep only as much stack as they need, which allows the maximum number of concurrent goroutines ● Example: `func f() { g() }` `go func() { f() }()` ○ Compiler: f() requires 1 KiB of stack, g() requires 1.5 KiB ○ The goroutine starts with a 2 KiB stack ○ f() will use 1 KiB; the current stack (2 KiB free) is enough ○ g() will use 1.5 KiB; the current stack (1 KiB free) is not enough. Allocate a bigger stack!
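The effect of growable stacks can be observed with a small sketch. The names `depth` and `deepCall` are invented for this example, and the exact initial stack size is a runtime implementation detail that this code does not depend on. A goroutine recurses far deeper than any few-KiB stack could hold, and the call still succeeds because the runtime enlarges the stack on demand.

```go
package main

import "fmt"

// depth recurses n levels deep and returns n. Every frame keeps a small
// local buffer, so the total stack demand is far larger than a goroutine's
// small initial stack.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	var pad [64]byte
	pad[n%len(pad)] = 1
	return depth(n-1) + int(pad[n%len(pad)])
}

// deepCall runs depth inside a fresh goroutine, which starts with a small
// stack, and returns the result through a channel.
func deepCall(levels int) int {
	result := make(chan int)
	go func() { result <- depth(levels) }()
	return <-result
}

func main() {
	fmt.Println("recursion depth reached:", deepCall(100000))
}
```

With a fixed-size thread stack, the same recursion would have to fit the size chosen at creation time or the thread would overflow its stack.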
  • 48. C10K Problem without an Event Loop ● Events? Threads? Goroutines and the Integrated Poller!
  • 49. C10K Problem ● How to hold 10,000 concurrent sessions ● 10,000 threads for 10,000 sessions would incur high overhead ● An event loop usually results in complex callback spaghetti code https://www.youtube.com/watch?v=SgjAv1TnS5k
  • 50. Integrated Poller: Goroutines Allocation ● Allocate 10,000 goroutines for 10,000 concurrent sessions; don’t worry, goroutine creation is fast enough, and tens of thousands of goroutines in a single process are the norm ● Goroutines waiting for events are simply scheduled out ● The Go scheduler does not need to increase the number of threads under the hood, because most goroutines are scheduled out while they wait for slow events to complete https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_MIC_DROP.png https://github.com/ashleymcnamara/gophers/blob/master/DRAWING_GOPHER.png
  • 51. Integrated Poller: Polling and Scheduling ● The Go runtime, rather than the goroutine that owns the socket, uses select / kqueue / epoll / IOCP to learn which socket is ready ● Because the runtime knows which goroutine is waiting for the socket, it puts the goroutine back on the same CPU as soon as the socket is ready ● In short, waiting for events and waking up the appropriate goroutine is delegated to the Go runtime ● As a result, gophers can enjoy a simple programming model and appropriate context management overhead https://talks.golang.org/2012/waza.slide#22
  • 52. Conclusion ● Go is special on multi-core systems owing to its clever design choices ● Goroutines are very cheap and fast in terms of context management ● Dynamic stack management of goroutines allows more concurrency ● The integrated poller in Go gives gophers only the benefits of threads and event loops https://github.com/ashleymcnamara/gophers/blob/master/GOPHER_LEARN.png
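The programming model described on slides 50 and 51 can be sketched as a goroutine-per-connection echo server. This is an illustration under stated assumptions, not code from the talk: the names `serve` and `roundTrips`, the `echo:` reply format, and the loopback address are all chosen for this example, and the session count is kept at 100 so it stays within common file descriptor limits. The handler is plain blocking code; waiting on the socket is left to the runtime.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sync"
)

// serve accepts connections until the listener is closed and gives each
// connection its own goroutine. There are no callbacks: a handler that is
// waiting for input is simply parked until its socket is ready.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			sc := bufio.NewScanner(c)
			for sc.Scan() {
				fmt.Fprintf(c, "echo: %s\n", sc.Text())
			}
		}(conn)
	}
}

// roundTrips opens n concurrent client sessions against a local server and
// returns how many of them received the expected reply.
func roundTrips(n int) (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go serve(ln)

	var (
		wg sync.WaitGroup
		mu sync.Mutex
		ok int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			c, err := net.Dial("tcp", ln.Addr().String())
			if err != nil {
				return
			}
			defer c.Close()
			fmt.Fprintf(c, "session %d\n", id)
			reply, err := bufio.NewReader(c).ReadString('\n')
			if err == nil && reply == fmt.Sprintf("echo: session %d\n", id) {
				mu.Lock()
				ok++
				mu.Unlock()
			}
		}(i)
	}
	wg.Wait()
	return ok, nil
}

func main() {
	ok, err := roundTrips(100)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Printf("%d of 100 sessions echoed correctly\n", ok)
}
```

Each handler reads like single-threaded code, which is the simplicity the slide contrasts with event-loop callbacks.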