Linux locking mechanisms
Mark Veltzer
veltzer@gnu.org
Who am I?
● Linux kernel hacker
● Current maintainer of gnu grep(1)
● Free Source evangelist
● CTO of Hinbit
● Political philosopher (check out my book “שלטון ההמון” at book stores near you...)
● Jazz piano player
Why locking?
● To avoid race conditions when accessing shared memory.
● In user space these occur because of pre-emption (which is driven by
timer interrupts) and because of multi-core execution.
● In kernel space these occur because of interrupts in general and
because of multi-core execution.
● Locking is not the only way to avoid such race conditions
● But this presentation is about locking and only about locking...
● In general locking is bad because it blocks your program from
executing and so slows it down.
● Avoid it when you can.
Avoiding locking - techniques
● Have each thread/CPU have its own data.
● Use atomic operations (hardware) instead of locking (software).
● Lock free programming.
● RCU/COW.
● Readers/Writer locks.
● Not using the shared memory model but rather the actor model
for multi-processing/multi-threading.
● And many more techniques.
● Alas, we are here to talk about locking.
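As a taste of the "atomic operations instead of locking" bullet above, here is a minimal sketch using C11 atomics. The names (counter, worker, run_atomic_counter) and the thread and iteration counts are made up for illustration; they are not from the talk.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NUM_THREADS 4
#define INCREMENTS 100000

/* Shared counter updated with a hardware atomic instruction, no lock. */
static atomic_long counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

/* Starts the workers, waits for them and returns the final count. */
long run_atomic_counter(void)
{
    pthread_t threads[NUM_THREADS];

    atomic_store(&counter, 0);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&counter);
}
```

Build with something like gcc -pthread. No thread ever blocks here, which is the whole point: the read-modify-write is done by the hardware in one indivisible step.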
User space vs kernel space locking
● Are completely different
● Different mechanisms, different performance
considerations, different API
● But ultimately they work in concert.
User space locking
User space locking mechanisms
● Are not allowed to block interrupts. Ever!
● This is derived from the definition of what a secure operating system
is.
● If you have code in the kernel you can expose an API to user space
to block and allow interrupts.
● This is considered a bad idea.
● First of all because it allows user space bugs to lock up your
system.
● Second because it interferes with other kernel mechanisms (like
watchdogs, RCU and more).
● DON'T DO IT!
User space locking primitives
● pthread Spin lock
● Futex
● pthread mutex
● pthread Readers/writer lock.
● POSIX semaphore
● SYS V semaphore
User space spin lock - intro
● Is implemented as a simple TAS/CAS loop with CPU
relaxing and memory barrier.
● Pure user space implementation.
● DOES NOT DISABLE INTERRUPTS!
● Did I mention that it DOES NOT DISABLE
INTERRUPTS?!?
● It is interesting to note that IT DOES NOT DISABLE
INTERRUPTS.
● And finally note that NO INTERRUPTS ARE DISABLED
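The TAS loop described above can be sketched in a few lines of C11. This is an illustration only, not the glibc implementation of pthread_spin_lock(3) (which is the API you should actually use); tas_lock_t, cpu_relax and the demo names are hypothetical. And no, it does not disable interrupts.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_ITERATIONS 50000

/* Hypothetical minimal spin lock: a set flag means "locked". */
typedef struct {
    atomic_flag locked;
} tas_lock_t;

static tas_lock_t demo_lock = { ATOMIC_FLAG_INIT };
static long protected_counter; /* plain variable, guarded by demo_lock */

static void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause(); /* hint to the CPU that we are busy-waiting */
#endif
}

static void tas_lock(tas_lock_t *l)
{
    /* test-and-set returns the old value: keep looping while it was set.
     * Acquire ordering is the memory barrier on the lock side. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        cpu_relax();
}

static void tas_unlock(tas_lock_t *l)
{
    /* Release ordering is the memory barrier on the unlock side. */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static void *spin_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < SPIN_ITERATIONS; i++) {
        tas_lock(&demo_lock);
        protected_counter++;
        tas_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs two contending threads and returns the final counter value. */
long run_spin_demo(void)
{
    pthread_t a, b;
    int rc;

    protected_counter = 0;
    rc = pthread_create(&a, NULL, spin_worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, spin_worker, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return protected_counter;
}
```

Everything happens in user space: there is not a single system call in the lock or unlock path, and a waiter burns CPU until the holder lets go.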
User space spin lock - issues
● The API is straightforward.
● The problem with this API is that IT DOES NOT
DISABLE INTERRUPTS
● This means that you may end up spinning for a
whole time slice (~1ms) if the two racing contexts are
on the same core.
● This may also happen if the two contexts are on different
cores but the one holding the lock is pre-empted by some other context.
● This is really bad.
User space spin lock – when to use?
● Use only when the two racing contexts are
running on two different cores and are the
highest priority contexts on these two cores.
● Usually this is only fulfilled on a dedicated RT
patched Linux system.
● Otherwise you get periodic spinning episodes.
● Kapish?!?
Futex – Fast user space locking
● The idea is to avoid trips to the kernel in the non
contended case.
● A mutex built half in user space and half in kernel
space.
● State of the lock is in user space.
● Wait list is in kernel space.
● Allows locking/unlocking without calling into the kernel in
the non contended case.
● A Masterpiece of Linux engineering!
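To make the "state in user space, wait list in kernel space" split concrete, here is a minimal sketch of a futex-based mutex. It follows the well-known three-state scheme (0 unlocked, 1 locked, 2 locked with possible waiters); it is not what glibc actually ships, and futex_mutex_t, futex_op and the demo names are hypothetical. It assumes Linux and that atomic_uint has the same layout as the 32-bit word futex(2) expects, which holds with gcc and clang on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked and somebody may be waiting */
typedef struct {
    atomic_uint state;
} futex_mutex_t;

/* Thin wrapper: glibc has no futex() function, so use syscall(2). */
static long futex_op(atomic_uint *addr, int op, unsigned int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex_t *m)
{
    unsigned int c = 0;

    /* Fast path: 0 -> 1 entirely in user space. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Slow path: mark the lock contended and sleep in the kernel. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if the state is still 2 when the kernel checks. */
        futex_op(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex_t *m)
{
    /* Fast path: 1 -> 0 means nobody was waiting, no system call. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex_op(&m->state, FUTEX_WAKE_PRIVATE, 1);
    }
}

static futex_mutex_t demo_mutex; /* zero-initialized: unlocked */
static long futex_counter;       /* guarded by demo_mutex */

static void *futex_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        futex_mutex_lock(&demo_mutex);
        futex_counter++;
        futex_mutex_unlock(&demo_mutex);
    }
    return NULL;
}

/* Runs four contending threads and returns the final counter value. */
long run_futex_demo(void)
{
    pthread_t t[4];

    futex_counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, futex_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return futex_counter;
}
```

Note where the system calls are: only in the slow paths. An uncontended lock/unlock pair is two atomic instructions and never enters the kernel.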
What happens when you die with a lock held?
● Here are some suggestions:
– OS does nothing → deadlocks
– OS releases the lock → other contexts die because
of inconsistent data
– OS releases the lock and notifies the next context
locking the lock that the previous owner died →
This is what Linux does.
● This feature of locks is called robustness.
Linux has no threads
● Do you remember that Linux has no concept of a “thread”?
● Threads are just processes which happen to share a lot of
memory created with the clone(2) system call.
● Don't tell this to user space developers in your company (they
tend to freak out about this).
● This means that every locking mechanism in Linux can be
used for multi-processing as well as for multi-threading.
● This is why futexes were made robust.
● Futexes are robust by doing a postmortem on dead processes and
examining the locks they leave behind in order to unlock them and
mark them as suspicious.
pthread_mutex
● Is nowadays just a wrapper for a futex.
● Could be used between processes (strange, but oh so true).
● Could be made robust using pthread_mutexattr_setrobust(3), which is
standardized in POSIX.1-2008 but long lacked a Linux man page.
● I found the documentation for this API on MSDN, of all
places… :)
● Supports recursion, the priority inheritance and priority ceiling
protocols, sharing between processes and more.
● Makes lousy coffee, though...
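Here is a minimal sketch of the robustness story using the standard pthread calls pthread_mutexattr_setrobust(3) and pthread_mutex_consistent(3). A thread stands in for the dying process; the function and variable names are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust_mutex;

/* Locks the mutex and exits without unlocking: dies with the lock held. */
static void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust_mutex);
    return NULL;
}

/* Returns the result of the first lock attempt after the owner died. */
int lock_after_owner_death(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust_mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    rc = pthread_create(&t, NULL, die_holding_lock, NULL);
    assert(rc == 0);
    pthread_join(t, NULL);

    rc = pthread_mutex_lock(&robust_mutex);
    if (rc == EOWNERDEAD) {
        /* We now hold the lock, but the protected data is suspicious.
         * Repair it here, then declare the mutex usable again. */
        pthread_mutex_consistent(&robust_mutex);
    }
    pthread_mutex_unlock(&robust_mutex);
    pthread_mutex_destroy(&robust_mutex);
    return rc;
}
```

Without the robust attribute the second pthread_mutex_lock() would simply hang forever, which is the "OS does nothing" option from the earlier slide.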
Pthread readers/writer lock
● Is based on the futex.
● This means good performance.
● Standard, feature poor implementation.
● Build your own if you need more features.
● Could be used to synchronize processes and
threads.
POSIX semaphores
● Based on the futex.
● Again, good performance.
● Use this and not the Sys V version unless
you need the Sys V particular features.
● Could be used to synchronize both processes
and threads.
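A minimal sketch with an unnamed POSIX semaphore (sem_init(3), sem_wait(3), sem_post(3)) used as a real counting semaphore: at most two threads are let into a section at a time. The permit count, thread count and names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>

#define SEM_PERMITS 2
#define SEM_WORKERS 6
#define SEM_ROUNDS 1000

static sem_t permits;
static atomic_int inside;     /* threads currently in the section */
static atomic_int max_inside; /* highest value of "inside" ever seen */

static void *sem_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < SEM_ROUNDS; i++) {
        /* Take a permit; retry if a signal interrupted the wait. */
        while (sem_wait(&permits) == -1 && errno == EINTR)
            ;
        int n = atomic_fetch_add(&inside, 1) + 1;
        int prev = atomic_load(&max_inside);
        /* Record a new maximum; on failure prev is reloaded for us. */
        while (n > prev &&
               !atomic_compare_exchange_weak(&max_inside, &prev, n))
            ;
        atomic_fetch_sub(&inside, 1);
        sem_post(&permits); /* give the permit back */
    }
    return NULL;
}

/* Returns the largest number of threads seen inside at once. */
int run_semaphore_demo(void)
{
    pthread_t t[SEM_WORKERS];
    int rc;

    atomic_store(&inside, 0);
    atomic_store(&max_inside, 0);
    rc = sem_init(&permits, 0, SEM_PERMITS); /* 0: shared between threads */
    assert(rc == 0);
    for (int i = 0; i < SEM_WORKERS; i++) {
        rc = pthread_create(&t[i], NULL, sem_worker, NULL);
        assert(rc == 0);
    }
    (void)rc;
    for (int i = 0; i < SEM_WORKERS; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&permits);
    return atomic_load(&max_inside);
}
```

Passing a non-zero second argument to sem_init() and placing the semaphore in shared memory is how you would share it between processes.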
Sys V semaphore
● Reminder: Sys V is AT&T's version of UNIX
dating to circa 1983. In that version important
APIs like this one were first introduced into the
UNIX world.
● Sys V semaphores are, however, crap.
● This is because they always go to the kernel.
Even in the non contended case.
● Do not use. Use POSIX semaphores instead.
Kernel locking
Kernel locking primitives
● Mutex
● Spinlock (3 types)
● Semaphore
● RW semaphores
Mutexes
● Go to sleep when finding the lock locked.
● This means they can only be used in contexts where you are
allowed to go to sleep.
● This means passive (user) context, kernel thread or workqueue
context, and threaded IRQ context (?!?).
● Not allowed in IRQ handlers or tasklets.
● Has 3 modes: interruptible, killable and uninterruptible.
● Prefer interruptible (or at least killable) as much as possible: with
uninterruptible waits, bugs in kernel code may leave you with non
killable processes.
● Support priority inheritance under the RT patch.
Spin locks
● Most common kernel locking primitive.
● Are divided into 3 types: regular, BH and IRQ.
● Regular spin locks just disable pre-emption on the current CPU (in addition
to being a spin lock).
● BH turn off Bottom half mechanisms (including tasklets) on the current
CPU (in addition to being a spin lock).
● IRQ ones turn off interrupts on the local CPU (in addition to being a spin
lock).
● Sleeping, waiting or doing heavy computation with spin locks held is
considered reason for being banned from LKML.
● Turn into Mutexes under the RT patch and then support priority
inheritance.
Spin lock (irq version)
● Turning off IRQs is quite fast (CLI and STI, which clear
and set the IF flag, are really fast on Intel).
● Very brutal as it increases latency in real time
implementations.
● Try not to access data structures from IRQ
context so you won't have to use this.
● However, this is still one of the most common
locking primitives
When to use each?
● Passive vs Passive
– Use a mutex in interruptible mode (you are allowed to sleep in both).
– Or semaphore in interruptible mode.
– Or a regular spin lock. You are in no danger of spinning for long since pre-emption is disabled on the CPU holding the lock.
Interrupts and tasklets may still come in, but these are quick.
● Passive vs BH
– Use spin lock bh (spin_lock_bh()).
● Passive vs IRQ
– Use spin lock irq.
– The irq part prevents races on the current CPU.
– The spin lock part prevents races with other CPUs.
● BH vs BH
– Use a regular spin lock.
● BH vs IRQ
– Use spin lock irq.
● IRQ vs IRQ
– Use spin lock irq.
● See Rusty Russell's “Unreliable Guide to Locking”.
Semaphores
● semaphore.h
● Usually used as a mutex and not as a semaphore.
● The up and down methods do not accept a count; they always
increase and decrease by 1.
● Ticket/permit count can be determined at creation time.
● Semaphores do not offer priority inheritance. Even under the
RT patch.
● This means that any system call that uses this is unfit to be
used in the critical path of a real time application.
● 3 modes of operation (like the mutex).
RW semaphores
● rwsem.h
● Offer better performance when readers
outnumber writers.
● Again, does not support priority inheritance. Even
under the RT patch.
● Famous is current->mm->mmap_sem (renamed mmap_lock in Linux 5.8),
which protects each process's virtual memory description.
● Good reason not to use malloc(3) in real time
systems.
RW lock
● rwlock.h
● By Ingo Molnar (author of the real time patch).
● Supports priority inheritance (under the RT patch, where it becomes a sleeping lock).
● Use this instead of RW semaphores.
● Again, gives better performance when
readers outnumber writers.
The RT patch
● Runs all irq handlers in their own threads with
other interrupts enabled.
● Turns all spinlocks into mutexes to reduce
latency and allow high priority tasks to get the
CPU ASAP.
● If you really need a spin lock you can use the
raw spinlock API which will give you a true
spinlock even under the RT patch.

 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
WSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AIWSO2CON 2024 Slides - Unlocking Value with AI
WSO2CON 2024 Slides - Unlocking Value with AI
 

Linux Locking Mechanisms

  • 1. Linux locking mechanisms Mark Veltzer veltzer@gnu.org
  • 2. Who am I? ● Linux kernel hacker ● Current maintainer of gnu grep(1) ● Free Source evangelist ● CTO of Hinbit ● Political philosopher (checkout my book “‫ןוטלשלטון‬ ‫”ההמון‬ at book stores near you...) ● Jazz piano player
  • 3. Why locking? ● To avoid race conditions in accessing shared memory. ● These occur because of: user space pre-emption, which is based on timer interrupts (userspace), multi-core (userspace), interrupts in general (kernelspace), multi-core (kernelspace). ● Locking is not the only way to avoid such race conditions. ● But this presentation is about locking and only about locking... ● In general locking is bad because it blocks your program from executing and so slows it down. ● Avoid it when you can.
  • 4. Avoiding locking - techniques ● Give each thread/CPU its own data. ● Use atomic operations (hardware) instead of locking (software). ● Lock free programming. ● RCU/COW. ● Readers/Writer locks. ● Not using the shared memory model but rather the actor model for multi-processing/multi-threading. ● And many more techniques. ● Alas, we are here to talk about locking.
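To illustrate the "atomic operations instead of locking" bullet, here is a minimal sketch in C11. The names `hits`, `hit_worker` and `count_with_atomics` are made up for this example; the only assumption is a toolchain with `<stdatomic.h>` and pthreads.

```c
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16

static atomic_long hits;   /* shared counter, updated without any lock */

static void *hit_worker(void *arg)
{
    long iterations = *(const long *)arg;

    /* Each increment is a single atomic read-modify-write in hardware. */
    for (long i = 0; i < iterations; i++)
        atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
    return NULL;
}

/* Start nthreads workers; return the final count, or -1 on error. */
long count_with_atomics(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    atomic_store(&hits, 0);
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, hit_worker, &iterations) == 0)
        started++;
    /* pthread_join orders the workers' updates before the final load. */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == nthreads ? atomic_load(&hits) : -1;
}
```

Relaxed ordering is enough here because only the counter itself is shared; if the counter guarded other data, stronger ordering would be needed.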
  • 5. User space vs kernel space locking ● Are completely different. ● Different mechanisms, different performance considerations, different API. ● But ultimately they work in concert.
  • 7. User space locking mechanisms ● Are not allowed to block interrupts. Ever! ● This is derived from the definition of what a secure operating system is. ● If you have code in the kernel you can expose an API to user space to block and allow interrupts. ● This is considered a bad idea. ● First of all because it allows user space bugs to lock up your system. ● Second because it interferes with other kernel mechanisms (like watchdogs, RCU and more). ● DON'T DO IT!
  • 8. User space locking primitives ● pthread spin lock ● Futex ● pthread mutex ● pthread readers/writer lock ● POSIX semaphore ● Sys V semaphore
  • 9. User space spin lock - intro ● Is implemented as a simple TAS/CAS loop with CPU relaxing and memory barrier. ● Pure user space implementation. ● DOES NOT DISABLE INTERRUPTS! ● Did I mention that it DOES NOT DISABLE INTERRUPTS?!? ● It is interesting to note that IT DOES NOT DISABLE INTERRUPTS. ● And finally note that NO INTERRUPTS ARE DISABLED
  • 10. User space spin lock - issues ● The API is straightforward. ● The problem with this API is that IT DOES NOT DISABLE INTERRUPTS. ● This means that you may end up spinning for a whole time slice (~1ms) if the two racing contexts are on the same core. ● This may also happen if the two contexts are on different cores but the one holding the lock is pre-empted by some other context. ● This is really bad.
  • 11. User space spin lock – when to use? ● Use only when the two racing contexts are running on two different cores and are the highest priority contexts on these two cores. ● Usually this is only fulfilled on a dedicated RT patched Linux system. ● Otherwise you get periodic spinning episodes. ● Kapish?!?
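The "TAS loop with CPU relaxing and memory barrier" described above can be sketched roughly as follows. This is not the glibc `pthread_spin_lock` implementation, just an illustration of the idea; `tas_lock`, `tas_acquire`, `tas_release` and `count_with_spinlock` are hypothetical names, and the pause hint is only emitted on x86.

```c
#include <pthread.h>
#include <stdatomic.h>

typedef struct { atomic_flag held; } tas_lock;

static tas_lock spin = { ATOMIC_FLAG_INIT };
static long spin_protected;   /* plain variable guarded by `spin` */

static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();   /* tell the CPU we are busy-waiting */
#endif
}

static void tas_acquire(tas_lock *l)
{
    /* Test-and-set until the flag was previously clear. Acquire ordering
     * is the barrier that keeps the critical section after the lock. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        cpu_relax();
}

static void tas_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *spin_worker(void *arg)
{
    long iterations = *(const long *)arg;

    for (long i = 0; i < iterations; i++) {
        tas_acquire(&spin);
        spin_protected++;
        tas_release(&spin);
    }
    return NULL;
}

/* Two threads race on one counter; returns the final value or -1. */
long count_with_spinlock(long iterations)
{
    pthread_t a, b;

    spin_protected = 0;
    if (pthread_create(&a, NULL, spin_worker, &iterations) != 0)
        return -1;
    if (pthread_create(&b, NULL, spin_worker, &iterations) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return spin_protected;
}
```

Nothing here touches interrupts or the scheduler, so if the lock holder is pre-empted the other thread simply burns its time slice in `tas_acquire`, which is exactly the issue the slides warn about.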
  • 12. Futex – Fast user space locking ● The idea is to avoid trips to the kernel in the non contended case. ● A mutex built half in user space and half in kernel space. ● State of the lock is in user space. ● Wait list is in kernel space. ● Allows locking and unlocking without calling kernel space in the non contended case. ● A Masterpiece of Linux engineering!
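The split described above (lock state in user space, wait list in the kernel) can be sketched with the raw futex(2) system call. This is a simplified three-state mutex in the style of Ulrich Drepper's "Futexes Are Tricky", not the glibc implementation; `futex_mutex` and the helper names are made up, and the sketch assumes Linux and that `_Atomic uint32_t` has the same layout as a plain 32-bit word.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked and there may be waiters */
typedef struct { _Atomic uint32_t state; } futex_mutex;

static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex *m)
{
    uint32_t c = 0;

    /* Fast path: uncontended, no system call at all. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;
    /* Slow path: mark the lock contended and sleep in the kernel. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        futex_call(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex *m)
{
    /* If the old state was 1 nobody is waiting: no system call needed. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex_call(&m->state, FUTEX_WAKE_PRIVATE, 1);
    }
}

static futex_mutex fm;
static long fm_protected;

static void *futex_worker(void *arg)
{
    long iterations = *(const long *)arg;

    for (long i = 0; i < iterations; i++) {
        futex_mutex_lock(&fm);
        fm_protected++;
        futex_mutex_unlock(&fm);
    }
    return NULL;
}

/* Four threads share one counter; returns the final value or -1. */
long count_with_futex(long iterations)
{
    pthread_t tid[4];
    int started = 0;

    fm_protected = 0;
    atomic_store(&fm.state, 0);
    while (started < 4 &&
           pthread_create(&tid[started], NULL, futex_worker, &iterations) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == 4 ? fm_protected : -1;
}
```

The kernel is entered only from the slow path of `futex_mutex_lock` and the contended branch of `futex_mutex_unlock`; a single thread locking and unlocking never leaves user space.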
  • 13. What happens when you die with a lock held? ● Here are some suggestions: – OS does nothing → deadlocks – OS releases the lock → other contexts die because of inconsistent data – OS releases the lock and notifies the next context locking the lock that the previous owner died → This is what Linux does. ● This feature of locks is called robustness.
  • 14. Linux has no threads ● Do you remember that Linux has no concept of a “thread”? ● Threads are just processes, created with the clone(2) system call, which happen to share a lot of memory. ● Don't tell this to user space developers in your company (they tend to freak out about this). ● This means that every locking mechanism in Linux can be used for multi-processing as well as for multi-threading. ● This is why futexes were made robust. ● Futexes are made robust by the kernel doing a postmortem on dead processes and examining the locks they leave behind in order to unlock them and mark them as suspicious.
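As a sketch of the "multi-processing as well as multi-threading" point, the example below places a pthread mutex in anonymous shared memory and uses it from several fork(2)ed processes. The names `shared_area` and `count_across_processes` are hypothetical; the sketch assumes Linux with `MAP_ANONYMOUS` and omits most error handling for brevity.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_PROCS 8

struct shared_area {
    pthread_mutex_t lock;   /* lives in memory shared by all children */
    long counter;
};

/* Fork nprocs children that bump one counter; return it, or -1 on error. */
long count_across_processes(int nprocs, long iterations)
{
    pid_t pid[MAX_PROCS];
    pthread_mutexattr_t attr;
    int started = 0;
    long result = -1;

    if (nprocs < 1 || nprocs > MAX_PROCS)
        return -1;

    struct shared_area *s = mmap(NULL, sizeof *s, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (s == MAP_FAILED)
        return -1;

    /* Without PROCESS_SHARED the mutex is only valid inside one process. */
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&s->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    s->counter = 0;

    for (; started < nprocs; started++) {
        pid[started] = fork();
        if (pid[started] < 0)
            break;
        if (pid[started] == 0) {            /* child process */
            for (long i = 0; i < iterations; i++) {
                pthread_mutex_lock(&s->lock);
                s->counter++;
                pthread_mutex_unlock(&s->lock);
            }
            _exit(0);
        }
    }
    for (int i = 0; i < started; i++)
        waitpid(pid[i], NULL, 0);
    if (started == nprocs)
        result = s->counter;

    pthread_mutex_destroy(&s->lock);
    munmap(s, sizeof *s);
    return result;
}
```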
  • 15. pthread_mutex ● Is nowadays just a wrapper for a futex. ● Can be used between processes (strange, but oh so true). ● Can be made robust using pthread_mutexattr_setrobust(3), which went undocumented in the Linux man pages for a long time. ● I found the documentation for this API on MSDN, of all places…:) ● Supports recursion, two types of priority inheritance, sharing between processes, priority ceiling and more. ● Makes lousy coffee, though...
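A small sketch of robustness in action: one thread takes a robust mutex and terminates without unlocking it, and the next locker is told about it with `EOWNERDEAD`. The function names are invented for the example; the pthread calls are the standard POSIX.1-2008 ones, and error checks on the setup calls are left out to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t robust_lock;

static void *die_holding_lock(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&robust_lock);
    return NULL;                 /* thread ends with the lock still held */
}

/* Returns what the survivor's pthread_mutex_lock() reported. */
int lock_after_owner_death(void)
{
    pthread_mutexattr_t attr;
    pthread_t tid;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&robust_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&tid, NULL, die_holding_lock, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);

    rc = pthread_mutex_lock(&robust_lock);
    if (rc == EOWNERDEAD) {
        /* We own the lock, but the protected data may be inconsistent.
         * After repairing it, declare the mutex usable again. */
        pthread_mutex_consistent(&robust_lock);
        pthread_mutex_unlock(&robust_lock);
    } else if (rc == 0) {
        pthread_mutex_unlock(&robust_lock);
    }
    pthread_mutex_destroy(&robust_lock);
    return rc;
}
```

With a non-robust mutex the same sequence would leave the second `pthread_mutex_lock` blocked forever, which is the "OS does nothing → deadlocks" option from the earlier slide.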
  • 16. Pthread readers/writer lock ● Is based on the futex. ● This means good performance. ● Standard, feature poor implementation. ● Build your own if you need more features. ● Could be used to synchronize processes and threads.
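A minimal usage sketch of the pthread readers/writer lock: one writer keeps two variables equal, and two readers check that they never see them differ. The names `pair_lock`, `left`, `right` and `torn_reads` are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>

static pthread_rwlock_t pair_lock = PTHREAD_RWLOCK_INITIALIZER;
static long left, right;   /* invariant: equal whenever no writer holds the lock */

static void *pair_writer(void *arg)
{
    long rounds = *(const long *)arg;

    for (long i = 0; i < rounds; i++) {
        pthread_rwlock_wrlock(&pair_lock);   /* exclusive access */
        left++;
        right++;
        pthread_rwlock_unlock(&pair_lock);
    }
    return NULL;
}

static void *pair_reader(void *arg)
{
    long rounds = *(const long *)arg;
    intptr_t torn = 0;

    for (long i = 0; i < rounds; i++) {
        pthread_rwlock_rdlock(&pair_lock);   /* shared with other readers */
        if (left != right)
            torn++;
        pthread_rwlock_unlock(&pair_lock);
    }
    return (void *)torn;
}

/* One writer, two readers; returns the number of inconsistent reads
 * observed (0 if the lock does its job), or -1 on error. */
long torn_reads(long rounds)
{
    pthread_t w, r[2];
    void *res;
    long torn = 0;

    left = right = 0;
    if (pthread_create(&w, NULL, pair_writer, &rounds) != 0)
        return -1;
    for (int i = 0; i < 2; i++) {
        if (pthread_create(&r[i], NULL, pair_reader, &rounds) != 0) {
            while (i-- > 0)
                pthread_join(r[i], NULL);
            pthread_join(w, NULL);
            return -1;
        }
    }
    for (int i = 0; i < 2; i++) {
        pthread_join(r[i], &res);
        torn += (long)(intptr_t)res;
    }
    pthread_join(w, NULL);
    return torn;
}
```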
  • 17. POSIX semaphores ● Based on the futex. ● Again, good performance. ● Use this and not the Sys V version unless you need the Sys V particular features. ● Could be used to synchronize both processes and threads.
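As a usage sketch, a POSIX semaphore initialised with N permits caps how many threads are inside a section at once. The helper names (`permits`, `sem_worker`, `peak_concurrency`) are hypothetical and the one millisecond sleep merely stands in for real work.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <time.h>

#define SEM_WORKERS 8

static sem_t permits;
static atomic_int inside, peak;

static void *sem_worker(void *unused)
{
    (void)unused;
    while (sem_wait(&permits) == -1 && errno == EINTR)
        ;                                   /* retry if a signal interrupted us */

    int now = atomic_fetch_add(&inside, 1) + 1;
    int seen = atomic_load(&peak);
    while (now > seen && !atomic_compare_exchange_weak(&peak, &seen, now))
        ;                                   /* record the highest occupancy */

    nanosleep(&(struct timespec){ .tv_nsec = 1000000 }, NULL);

    atomic_fetch_sub(&inside, 1);
    sem_post(&permits);                     /* hand the permit back */
    return NULL;
}

/* Run SEM_WORKERS threads through a section guarded by `limit` permits;
 * return the highest number seen inside at once, or -1 on error. */
int peak_concurrency(unsigned limit)
{
    pthread_t tid[SEM_WORKERS];
    int started = 0;

    if (limit == 0 || sem_init(&permits, 0, limit) != 0)
        return -1;
    atomic_store(&inside, 0);
    atomic_store(&peak, 0);
    while (started < SEM_WORKERS &&
           pthread_create(&tid[started], NULL, sem_worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    sem_destroy(&permits);
    return started == SEM_WORKERS ? atomic_load(&peak) : -1;
}
```

Passing a non-zero second argument to `sem_init` and placing the `sem_t` in shared memory would let the same semaphore synchronize processes instead of threads.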
  • 18. Sys V semaphore ● Reminder: Sys V is AT&T's version of UNIX dating to circa 1983. In that version important APIs like this one were first introduced into the UNIX world. ● Sys V semaphores are, however, crap. ● This is because they always go to the kernel. Even in the non contended case. ● Do not use. Use POSIX semaphores instead.
  • 20. Kernel locking primitives ● Mutex ● Spinlock (3 types) ● Semaphore ● RW semaphores
  • 21. Mutexes ● Go to sleep when finding the lock locked. ● This means they can only be used in contexts where you are allowed to go to sleep. ● This means passive (user) context, kernel thread or workqueue context and threaded IRQ context (?!?). ● Not allowed in IRQ handlers or tasklets. ● Have 3 modes: interruptible, killable and uninterruptible. ● Try to use interruptible as much as possible, since with uninterruptible waits a bug in kernel code may leave you with non killable processes. ● Support priority inheritance under the RT patch.
  • 22. Spin locks ● Most common kernel locking primitive. ● Are divided into 3 types: regular, BH and IRQ. ● Regular spin locks just turn off scheduling on the current CPU (in addition to being a spin lock). ● BH turn off Bottom half mechanisms (including tasklets) on the current CPU (in addition to being a spin lock). ● IRQ ones turn off interrupts on the local CPU (in addition to being a spin lock). ● Sleeping, waiting or doing heavy computation with spin locks held is considered reason for being banned from LKML. ● Turn into Mutexes under the RT patch and then support priority inheritance.
  • 23. Spin lock (irq version) ● Turning off IRQs is quite fast (CLI and STI, which clear and set the IF flag, are really fast on Intel). ● Very brutal, as it increases latency in real time implementations. ● Try not to access data structures from IRQ context so you won't have to use this. ● However, this is still one of the most common locking primitives.
  • 24. When to use each? ● Passive vs Passive – Use a mutex in interruptible mode (you are allowed to sleep in both). – Or a semaphore in interruptible mode. – Or a regular spin lock. You are in no danger of spinning for long since scheduling on the current CPU is disabled. Interrupts and tasklets may still come in, but these are quick. ● Passive vs BH – Use spin_lock_bh(). ● Passive vs IRQ – Use the irq spin lock. – The irq part prevents races on the current CPU. – The spin lock part prevents races with other CPUs. ● BH vs BH – Regular spin lock. ● BH vs IRQ – irq spin lock. ● IRQ vs IRQ – irq spin lock. ● See “Rusty Russell's Unreliable Guide to Locking”.
  • 25. Semaphores ● semaphore.h ● Usually used as a mutex and not as a semaphore. ● Up and down methods do not accept ticket number but always increase and decrease by 1. ● Ticket/permit count can be determined at creation time. ● Semaphores do not offer priority inheritance. Even under the RT patch. ● This means that any system call that uses this is unfit to be used in the critical path of a real time application. ● 3 modes of operation (like the mutex).
  • 26. RW semaphores ● rwsem.h ● Offer better performance when readers outnumber writers. ● Again, do not support priority inheritance. Even under the RT patch. ● A famous example is current->mm->mmap_sem (renamed mmap_lock in Linux 5.8), which protects each process's virtual memory description. ● Good reason not to use malloc(3) in real time systems.
  • 27. RW lock ● rwlock.h ● By Ingo Molnar (author of the real time patch). ● Supports priority inheritance. ● Use this instead of RW semaphores. ● Again, gives better performance when readers outnumber writers.
  • 28. The RT patch ● Runs all irq handlers in their own threads with other interrupts enabled. ● Turns all spinlocks into mutexes to reduce latency and allow high priority tasks to get the CPU ASAP. ● If you really need a spin lock you can use the raw spinlock API which will give you a true spinlock even under the RT patch.