SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Linux kernel TLV meetup, 28.02.2016
kerneltlv.com
Kfir Gollan
 What is the Berkeley Packet Filter?
 Writing BPF filters
 Debuging BPF
 Using BPF in user-space applications
 Advanced features of BPF
 BPF at its base is a way to perform fast packet filtering
at the kernel level.
 Filters are defined by user space
 Filters are executed in the kernel
 Invented by Steven McCanne and Van Jacobson in
1990. First publication at December 1992.
 Support for BPF in Linux was added by Jay Schulist for
the 2.5 development kernel.
 Many features were added later on.
For example JIT for BPF was added for the 3.0 kernel
(2011)
 We want to be able to filter packets at the kernel level
 Discard irrelevant packets at the kernel without copying
them to user space.
 The performance gains when using promiscuous mode
are substantial.
 We want to change filters dynamically without
recompiling the kernel or using a custom kernel
module.
 We want the filters to be architecture independent.
 BPF defines a set of operations that can be performed
on the filtered packet. Each operation gets its own
opcode.
 BPF was designed to be protocol indepented, as such it
treats the packets as raw buffers. It is up to the filter
writer to parse the needed packet headers.
 BPF is based on three building blocks:
 A: A 32 bit wide accumulator
 X: A 32 bit wide index register
 M[]: 16 x 32 bit wide misc registers aka “scratch
memory”
 LOAD: copy a value into the accumulator or index
register.
 STORE: copy either the accumulator or index
register to the scratch memory.
 ALU: perform arithmetic or logic operation on
the accumulator register.
 BRANCH: alter the flow of control.
 RETURN: terminate the filter and indicate what
portion of the packet to save.
 MISC: various operations that doesn’t match the
other types.
jf: 8jt: 8opcode: 16
k: 32
 Each instruction is represented by 64 bits.
 opcode 16 bits, indicates the instruction type.
 jt 8 bits, offset of the next instruction for true
case (“if” block) – Jump True.
 jf 8 bits, offset of the next instruction for false
case (“else” block) – Jump False.
 k 32 bits, generic data, used for various
purposes.
DescriptionInstruction
Load word into Ald
Load half-word into Aldh
Load byte into Aldb
Load word into Xldx
Load byte into Xldxb
Store A into M[]st
Store X into M[]stx
Jump to labeljmp
Jump on k == Ajeq
Jump on k > Ajgt
Jump on k >= Ajge
Jump on k & Ajset
DescriptionInstruction
A + <x>add
A - <x>sub
A * <x>mul
A / <x>div
A & <x>and
A | <x>or
A ^ <x>xor
A << <x>lsh
A >> <x>rsh
Returnret
Copy A into Xtax
Copy X into Atxa
DescriptionMode
The literal value stored in k.#k
The length of the packet.#len
The word at offset k in the scratch
memory store.
M [ k ]
The byte, half-word or word at byte
offset k in the packet.
[ k ]
The byte, half-word or word at byte
offset x+k in the packet.
[ x + k ]
Jump to label LL
Jump to lt if true, otherwise jump to l f#k, lt. lf
The index registerx
Four times the value of the low four bits
of the byte at the offset k in the packet
4 * ([k] & 0xf)
 MAC header
 IP header
TypeSource MACDestination MAC
2 bytes6 bytes6 bytes
 To catch all the IP packets over MAC we need to check
the type field in the MAC header.
ldh [12]
jeq #ETHERTYPE_IP, L1, L2
L1: ret #-1
L2: ret #0
 Load half-word (2 bytes) from offset 12 of the packet
(the type field) into A register.
 Check if A register is equal to #ETHERTYPE_IP
 If true -> return #-1
 If false -> return 0
ldh [12] ; A <= ether.type
jeq #ETHERTYPE_IP, L1, L3 ; A == #ETHERTYPE_IP ?
L1: ld [26] ; A = ip.src
and #0xffffff00 ; A = A & 0xffffff00
jeq #0x80037000, L3, L2 ; A == 128.3.112.0
L2: ret #-1
L3: ret #0
 Check if the packet type is IP
 “Remove” the lower byte of the src IP
 Check if the src IP matches 128.3.112.X
 A utility used to create bpf binary code (bpf
“assembly” compiler”).
 Part of the mainline kernel. tools/net/bpf_asm.c
 Supports two output formats
 c style output
{ 0x28, 0, 0, 0x0000000c },
{ 0x15, 0, 1, 0x00000800 },
{ 0x06, 0, 0, 0xffffffff },
{ 0x06, 0, 0, 0000000000 },
 raw output
4,40 0 0 12,21 0 1 2048,6 0 0 4294967295,6 0 0 0,
 A utility used to debug bpf filters
 Part of the mainline kernel. tools/net/bpf_dbg.c
 Main features
 pcap files as input for filters.
 bpf-asm raw output format for bpf filter definition.
 single-stepping through filters
 breakpoints
 internal status (A,X,M, PC)
 disassemble raw bpf to bpf-asm
struct sock_filter { /* Filter block */
__u16 code; /* Actual filter code */
__u8 jt; /* Jump true */
__u8 jf; /* Jump false */
__u32 k; /* Generic multiuse field */
};
 Filter block is in fact a single BPF instruction.
 Used to pass a filter specifications to the kernel.
struct sock_filter code[] = {
{ 0x28, 0, 0, 0x0000000c },
{ 0x15, 0, 8, 0x000086dd },
{ 0x30, 0, 0, 0x00000014 },
…
};
struct sock_fprog { /* Required for SO_ATTACH_FILTER. */
unsigned short len; /* Number of filter blocks */
struct sock_filter *filter; /* Filter blocks list */
};
 A parameter for setsockopt that allows to attach a filter
to a socket.
struct sock_fprog bpf = {…};
setsockopt(int fd,
SOL_SOCKET,
SO_ATTACH_FILTER,
&bpf.
sizeof(bpf));
 SO_ATTACH_FILTER
Attach a BPF filter to a socket.
Note: only a single filter can be attached at a given
time.
 SO_DETACH_FILTER
Remove the currently attached filter from the socket.
 SO_LOCK_FILTER
Lock a filter on a socket. This is useful for setting a
filter and then dropping privileges.
 Example: create a raw socket, apply a filter, lock it, drop
CAP_NET_RAW
 Choosing the correct flags to the socket is critical for
making the filter work properly.
 The filtered buffer will start in the wanted location in the net
stack.
 Selecting the socket domain
 AF_PACKET – filtering at L2 (e.g ethernet)
 AF_INET – IPv4 filtering
 AF_INET6 – IPv6 filtering
 Selecting the socket type
 SOCK_RAW – raw filtering, no headers are handled by the
kernel
 SOCK_STREAM/SOCK_DGRAM etc – headers are handled
by the kernel
 libpcap – Packet CAPture library
 Provides an easy to use api for packet filtering.
 Supports a high level filtering format which is
converted to BPF.
 pcap_compile – create a bpf filter
 pcap_setfilter – attach a filter
 Look at man 7 pcap-filter for a detailed description.
ether dst [mac address]
dst net [ip address]
dst portrange [port1]-[port2]
 user-space packet sniffing program.
 Uses bpf for kernel level filtering (based on pcap)
 Dump the generated bpf filter
 -d bpf asm format
 -dd c format
 -ddd bpf raw format
$ sudo tcpdump -d "ip and udp“
(000) ldh [12]
(001) jeq #0x800 jt 2 jf 5
(002) ldb [23]
(003) jeq #0x11 jt 4 jf 5
(004) ret #65535
(005) ret #0
 A just-in-time BPF instruction translation.
 Note that the translation is performed when attaching
the filter via bpf_jit_compile(..).
 BPF instructions are mapped directly to architecture
depended instructions.
 BPF registers are mapped to machine physical registers
 Provides a performance gain
 About 50ns per packet for simple filters (E5540 @
2.53GHz).
The more complex the filter the better performance
gains from JIT.
 Supported on x86,x86-64, powerpc. arm and more.
 Each BPF filter is verified before attaching it to a
socket.
 This is critical, the filters come from userspace!
 The following rules are enforced
 The filter must not contain references or jumps that are
out of range.
 The filter must contain only valid BPF opcodes.
 The filter must end with RET opcode.
 All jumps are forward – loops are not allowed!
 The verification is implemented at sk_chk_filter
function in net/core/filter.c (until kernel 3.16),
modified after adding seccomp.
 The Linux kernel also has a couple of BPF extensions
that are used along with the class of load instructions.
 The extensions are "overloading" the k argument with
a negative offset + a particular extension offset.
 The result of such BPF extensions are loaded into A.
DescriptionInstruction
skb->lenlen
skb->protocolproto
skb->pkt_typetype
Payload start offsetpoff
skb->dev->ifindexifidx
skb->markmark
DescriptionInstruction
skb->queue_mappingqueue
skb->hashrxhash
Prandom_u32()rand
Executing cpu idcpu
Netlink attributesnla
skb->dev->typehatype
 extended BPF is an internal mechanism that can be used
only in kernel context (not from userspace!)
 eBPF adds a set of new features:
 Increased number of registers 10 instead of 2.
 Register width increased to 64 bit
 Conditional jf/jt targets replaced with jt/fall-through
 bpf_call instruction and register passing convention for zero
overhead calls from/to other kernel functions.
 Originally designed to be a “restricted C” language that will
be architecture independent and JITed in kernel context.
 Eventually used mostly for kernel tracing.
 SECure COMPuting, or seccomp, is a security
mechanism available in the linux kernel.
 It applies BPF filtering to syscalls
 filter the syscall number & its parameters
 It can be used to limit the available syscalls
 For example strict mode allows only read, write, _exit
and sigreturn
 Uses the BPF filters, the filtered buffers are different.
 Look at man 2 seccomp for more details.
Berkeley Packet Filters

Weitere ähnliche Inhalte

Was ist angesagt?

Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux KernelAdrian Huang
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPFRogerColl2
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and moreBrendan Gregg
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionGene Chang
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File SystemAdrian Huang
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network InterfacesKernel TLV
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and ToolsBrendan Gregg
 
Understanding DPDK algorithmics
Understanding DPDK algorithmicsUnderstanding DPDK algorithmics
Understanding DPDK algorithmicsDenys Haryachyy
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Adrian Huang
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)shimosawa
 
Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelAdrian Huang
 

Was ist angesagt? (20)

Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
 
eBPF maps 101
eBPF maps 101eBPF maps 101
eBPF maps 101
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPF
 
BPF: Tracing and more
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and more
 
Linux MMAP & Ioremap introduction
Linux MMAP & Ioremap introductionLinux MMAP & Ioremap introduction
Linux MMAP & Ioremap introduction
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Linux Kernel - Virtual File System
Linux Kernel - Virtual File SystemLinux Kernel - Virtual File System
Linux Kernel - Virtual File System
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
Kernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network Interfaces
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
 
Understanding DPDK algorithmics
Understanding DPDK algorithmicsUnderstanding DPDK algorithmics
Understanding DPDK algorithmics
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
Dpdk applications
Dpdk applicationsDpdk applications
Dpdk applications
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
 
Linux Initialization Process (1)
Linux Initialization Process (1)Linux Initialization Process (1)
Linux Initialization Process (1)
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
Memory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux KernelMemory Mapping Implementation (mmap) in Linux Kernel
Memory Mapping Implementation (mmap) in Linux Kernel
 

Ähnlich wie Berkeley Packet Filters

Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCKernel TLV
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsGergely Szabó
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!Affan Syed
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPFAlex Maestretti
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWHsien-Hsin Sean Lee, Ph.D.
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28rajeshkvdn
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerMarina Kolpakova
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
 
Revelation pyconuk2016
Revelation pyconuk2016Revelation pyconuk2016
Revelation pyconuk2016Sarah Mount
 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!Linaro
 
Efficient JIT to 32-bit Arches
Efficient JIT to 32-bit ArchesEfficient JIT to 32-bit Arches
Efficient JIT to 32-bit ArchesNetronome
 
Unifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPFUnifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPFNetronome
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFBrendan Gregg
 
Replacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumReplacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumMichal Rostecki
 

Ähnlich wie Berkeley Packet Filters (20)

Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCC
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native Environments
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
 
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIWLec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
Lec15 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- EPIC VLIW
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
My seminar new 28
My seminar new 28My seminar new 28
My seminar new 28
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 
BPF - in-kernel virtual machine
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machine
 
Revelation pyconuk2016
Revelation pyconuk2016Revelation pyconuk2016
Revelation pyconuk2016
 
BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!BKK16-103 OpenCSD - Open for Business!
BKK16-103 OpenCSD - Open for Business!
 
Risc vs cisc
Risc vs ciscRisc vs cisc
Risc vs cisc
 
Sockets and Socket-Buffer
Sockets and Socket-BufferSockets and Socket-Buffer
Sockets and Socket-Buffer
 
Efficient JIT to 32-bit Arches
Efficient JIT to 32-bit ArchesEfficient JIT to 32-bit Arches
Efficient JIT to 32-bit Arches
 
Unifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPFUnifying Network Filtering Rules for the Linux Kernel with eBPF
Unifying Network Filtering Rules for the Linux Kernel with eBPF
 
Basic Linux kernel
Basic Linux kernelBasic Linux kernel
Basic Linux kernel
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
ocelot
ocelotocelot
ocelot
 
Replacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumReplacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with Cilium
 

Mehr von Kernel TLV

SGX Trusted Execution Environment
SGX Trusted Execution EnvironmentSGX Trusted Execution Environment
SGX Trusted Execution EnvironmentKernel TLV
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel TLV
 
Bypassing ASLR Exploiting CVE 2015-7545
Bypassing ASLR Exploiting CVE 2015-7545Bypassing ASLR Exploiting CVE 2015-7545
Bypassing ASLR Exploiting CVE 2015-7545Kernel TLV
 
Present Absence of Linux Filesystem Security
Present Absence of Linux Filesystem SecurityPresent Absence of Linux Filesystem Security
Present Absence of Linux Filesystem SecurityKernel TLV
 
OpenWrt From Top to Bottom
OpenWrt From Top to BottomOpenWrt From Top to Bottom
OpenWrt From Top to BottomKernel TLV
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsKernel TLV
 
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Kernel TLV
 
File Systems: Why, How and Where
File Systems: Why, How and WhereFile Systems: Why, How and Where
File Systems: Why, How and WhereKernel TLV
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptablesKernel TLV
 
KernelTLV Speaker Guidelines
KernelTLV Speaker GuidelinesKernelTLV Speaker Guidelines
KernelTLV Speaker GuidelinesKernel TLV
 
Userfaultfd: Current Features, Limitations and Future Development
Userfaultfd: Current Features, Limitations and Future DevelopmentUserfaultfd: Current Features, Limitations and Future Development
Userfaultfd: Current Features, Limitations and Future DevelopmentKernel TLV
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesKernel TLV
 
DMA Survival Guide
DMA Survival GuideDMA Survival Guide
DMA Survival GuideKernel TLV
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingKernel TLV
 
WiFi and the Beast
WiFi and the BeastWiFi and the Beast
WiFi and the BeastKernel TLV
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and DriversKernel TLV
 

Mehr von Kernel TLV (20)

DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
SGX Trusted Execution Environment
SGX Trusted Execution EnvironmentSGX Trusted Execution Environment
SGX Trusted Execution Environment
 
Fun with FUSE
Fun with FUSEFun with FUSE
Fun with FUSE
 
Kernel Proc Connector and Containers
Kernel Proc Connector and ContainersKernel Proc Connector and Containers
Kernel Proc Connector and Containers
 
Bypassing ASLR Exploiting CVE 2015-7545
Bypassing ASLR Exploiting CVE 2015-7545Bypassing ASLR Exploiting CVE 2015-7545
Bypassing ASLR Exploiting CVE 2015-7545
 
Present Absence of Linux Filesystem Security
Present Absence of Linux Filesystem SecurityPresent Absence of Linux Filesystem Security
Present Absence of Linux Filesystem Security
 
OpenWrt From Top to Bottom
OpenWrt From Top to BottomOpenWrt From Top to Bottom
OpenWrt From Top to Bottom
 
Make Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance ToolsMake Your Containers Faster: Linux Container Performance Tools
Make Your Containers Faster: Linux Container Performance Tools
 
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
 
File Systems: Why, How and Where
File Systems: Why, How and WhereFile Systems: Why, How and Where
File Systems: Why, How and Where
 
netfilter and iptables
netfilter and iptablesnetfilter and iptables
netfilter and iptables
 
KernelTLV Speaker Guidelines
KernelTLV Speaker GuidelinesKernelTLV Speaker Guidelines
KernelTLV Speaker Guidelines
 
Userfaultfd: Current Features, Limitations and Future Development
Userfaultfd: Current Features, Limitations and Future DevelopmentUserfaultfd: Current Features, Limitations and Future Development
Userfaultfd: Current Features, Limitations and Future Development
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use Cases
 
DMA Survival Guide
DMA Survival GuideDMA Survival Guide
DMA Survival Guide
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
WiFi and the Beast
WiFi and the BeastWiFi and the Beast
WiFi and the Beast
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
 

Kürzlich hochgeladen

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 

Kürzlich hochgeladen (20)

Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 

Berkeley Packet Filters

  • 1. Linux kernel TLV meetup, 28.02.2016 kerneltlv.com Kfir Gollan
  • 2.  What is the Berkeley Packet Filter?  Writing BPF filters  Debuging BPF  Using BPF in user-space applications  Advanced features of BPF
  • 3.  BPF at its base is a way to perform fast packet filtering at the kernel level.  Filters are defined by user space  Filters are executed in the kernel  Invented by Steven McCanne and Van Jacobson in 1990. First publication at December 1992.  Support for BPF in Linux was added by Jay Schulist for the 2.5 development kernel.  Many features were added later on. For example JIT for BPF was added for the 3.0 kernel (2011)
  • 4.  We want to be able to filter packets at the kernel level  Discard irrelevant packets at the kernel without copying them to user space.  The performance gains when using promiscuous mode are substantial.  We want to change filters dynamically without recompiling the kernel or using a custom kernel module.  We want the filters to be architecture independent.
  • 5.
  • 6.
  • 7.  BPF defines a set of operations that can be performed on the filtered packet. Each operation gets its own opcode.  BPF was designed to be protocol indepented, as such it treats the packets as raw buffers. It is up to the filter writer to parse the needed packet headers.  BPF is based on three building blocks:  A: A 32 bit wide accumulator  X: A 32 bit wide index register  M[]: 16 x 32 bit wide misc registers aka “scratch memory”
  • 8.  LOAD: copy a value into the accumulator or index register.  STORE: copy either the accumulator or index register to the scratch memory.  ALU: perform arithmetic or logic operation on the accumulator register.  BRANCH: alter the flow of control.  RETURN: terminate the filter and indicate what portion of the packet to save.  MISC: various operations that doesn’t match the other types.
  • 9. jf: 8jt: 8opcode: 16 k: 32  Each instruction is represented by 64 bits.  opcode 16 bits, indicates the instruction type.  jt 8 bits, offset of the next instruction for true case (“if” block) – Jump True.  jf 8 bits, offset of the next instruction for false case (“else” block) – Jump False.  k 32 bits, generic data, used for various purposes.
  • 10. DescriptionInstruction Load word into Ald Load half-word into Aldh Load byte into Aldb Load word into Xldx Load byte into Xldxb Store A into M[]st Store X into M[]stx Jump to labeljmp Jump on k == Ajeq Jump on k > Ajgt Jump on k >= Ajge Jump on k & Ajset DescriptionInstruction A + <x>add A - <x>sub A * <x>mul A / <x>div A & <x>and A | <x>or A ^ <x>xor A << <x>lsh A >> <x>rsh Returnret Copy A into Xtax Copy X into Atxa
  • 11. DescriptionMode The literal value stored in k.#k The length of the packet.#len The word at offset k in the scratch memory store. M [ k ] The byte, half-word or word at byte offset k in the packet. [ k ] The byte, half-word or word at byte offset x+k in the packet. [ x + k ] Jump to label LL Jump to lt if true, otherwise jump to l f#k, lt. lf The index registerx Four times the value of the low four bits of the byte at the offset k in the packet 4 * ([k] & 0xf)
  • 12.  MAC header  IP header TypeSource MACDestination MAC 2 bytes6 bytes6 bytes
  • 13.  To catch all the IP packets over MAC we need to check the type field in the MAC header. ldh [12] jeq #ETHERTYPE_IP, L1, L2 L1: ret #-1 L2: ret #0  Load half-word (2 bytes) from offset 12 of the packet (the type field) into A register.  Check if A register is equal to #ETHERTYPE_IP  If true -> return #-1  If false -> return 0
  • 14. ldh [12] ; A <= ether.type jeq #ETHERTYPE_IP, L1, L3 ; A == #ETHERTYPE_IP ? L1: ld [26] ; A = ip.src and #0xffffff00 ; A = A & 0xffffff00 jeq #0x80037000, L3, L2 ; A == 128.3.112.0 L2: ret #-1 L3: ret #0  Check if the packet type is IP  “Remove” the lower byte of the src IP  Check if the src IP matches 128.3.112.X
  • 15.
  • 16.  A utility used to create bpf binary code (bpf “assembly” compiler”).  Part of the mainline kernel. tools/net/bpf_asm.c  Supports two output formats  c style output { 0x28, 0, 0, 0x0000000c }, { 0x15, 0, 1, 0x00000800 }, { 0x06, 0, 0, 0xffffffff }, { 0x06, 0, 0, 0000000000 },  raw output 4,40 0 0 12,21 0 1 2048,6 0 0 4294967295,6 0 0 0,
  • 17.  A utility used to debug bpf filters  Part of the mainline kernel. tools/net/bpf_dbg.c  Main features  pcap files as input for filters.  bpf-asm raw output format for bpf filter definition.  single-stepping through filters  breakpoints  internal status (A,X,M, PC)  disassemble raw bpf to bpf-asm
  • 18.
  • 19.
  • 20. struct sock_filter { /* Filter block */ __u16 code; /* Actual filter code */ __u8 jt; /* Jump true */ __u8 jf; /* Jump false */ __u32 k; /* Generic multiuse field */ };  Filter block is in fact a single BPF instruction.  Used to pass a filter specifications to the kernel. struct sock_filter code[] = { { 0x28, 0, 0, 0x0000000c }, { 0x15, 0, 8, 0x000086dd }, { 0x30, 0, 0, 0x00000014 }, … };
  • 21. struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ unsigned short len; /* Number of filter blocks */ struct sock_filter *filter; /* Filter blocks list */ };  A parameter for setsockopt that allows to attach a filter to a socket. struct sock_fprog bpf = {…}; setsockopt(int fd, SOL_SOCKET, SO_ATTACH_FILTER, &bpf. sizeof(bpf));
  • 22.  SO_ATTACH_FILTER Attach a BPF filter to a socket. Note: only a single filter can be attached at a given time.  SO_DETACH_FILTER Remove the currently attached filter from the socket.  SO_LOCK_FILTER Lock a filter on a socket. This is useful for setting a filter and then dropping privileges.  Example: create a raw socket, apply a filter, lock it, drop CAP_NET_RAW
  • 23.  Choosing the correct flags to the socket is critical for making the filter work properly.  The filtered buffer will start in the wanted location in the net stack.  Selecting the socket domain  AF_PACKET – filtering at L2 (e.g ethernet)  AF_INET – IPv4 filtering  AF_INET6 – IPv6 filtering  Selecting the socket type  SOCK_RAW – raw filtering, no headers are handled by the kernel  SOCK_STREAM/SOCK_DGRAM etc – headers are handled by the kernel
  • 24.  libpcap – Packet CAPture library  Provides an easy to use api for packet filtering.  Supports a high level filtering format which is converted to BPF.  pcap_compile – create a bpf filter  pcap_setfilter – attach a filter  Look at man 7 pcap-filter for a detailed description. ether dst [mac address] dst net [ip address] dst portrange [port1]-[port2]
  • 25.  user-space packet sniffing program.  Uses bpf for kernel level filtering (based on pcap)  Dump the generated bpf filter  -d bpf asm format  -dd c format  -ddd bpf raw format $ sudo tcpdump -d "ip and udp“ (000) ldh [12] (001) jeq #0x800 jt 2 jf 5 (002) ldb [23] (003) jeq #0x11 jt 4 jf 5 (004) ret #65535 (005) ret #0
  • 26.
  • 27.  A just-in-time BPF instruction translation.  Note that the translation is performed when attaching the filter via bpf_jit_compile(..).  BPF instructions are mapped directly to architecture depended instructions.  BPF registers are mapped to machine physical registers  Provides a performance gain  About 50ns per packet for simple filters (E5540 @ 2.53GHz). The more complex the filter the better performance gains from JIT.  Supported on x86,x86-64, powerpc. arm and more.
  • 28.  Each BPF filter is verified before attaching it to a socket.  This is critical, the filters come from userspace!  The following rules are enforced  The filter must not contain references or jumps that are out of range.  The filter must contain only valid BPF opcodes.  The filter must end with RET opcode.  All jumps are forward – loops are not allowed!  The verification is implemented at sk_chk_filter function in net/core/filter.c (until kernel 3.16), modified after adding seccomp.
  • 29.  The Linux kernel also has a couple of BPF extensions that are used along with the class of load instructions.  The extensions are "overloading" the k argument with a negative offset + a particular extension offset.  The result of such BPF extensions are loaded into A. DescriptionInstruction skb->lenlen skb->protocolproto skb->pkt_typetype Payload start offsetpoff skb->dev->ifindexifidx skb->markmark DescriptionInstruction skb->queue_mappingqueue skb->hashrxhash Prandom_u32()rand Executing cpu idcpu Netlink attributesnla skb->dev->typehatype
  • 30.  extended BPF is an internal mechanism that can be used only in kernel context (not from userspace!)  eBPF adds a set of new features:  Increased number of registers 10 instead of 2.  Register width increased to 64 bit  Conditional jf/jt targets replaced with jt/fall-through  bpf_call instruction and register passing convention for zero overhead calls from/to other kernel functions.  Originally designed to be a “restricted C” language that will be architecture independent and JITed in kernel context.  Eventually used mostly for kernel tracing.
  • 31.
  • 32.  SECure COMPuting, or seccomp, is a security mechanism available in the linux kernel.  It applies BPF filtering to syscalls  filter the syscall number & its parameters  It can be used to limit the available syscalls  For example strict mode allows only read, write, _exit and sigreturn  Uses the BPF filters, the filtered buffers are different.  Look at man 2 seccomp for more details.

Hinweis der Redaktion

  1. I used the following LWN articles for this slide: BPF: the universal in-kernel virtual machine: https://lwn.net/Articles/599755/ A JIT for packet filters: http://lwn.net/Articles/437981/
  2. Image taken from: http://www.tiger1997.jp/report/activity/securityreport_20131111.html
  3. Information taken from: https://www.kernel.org/doc/Documentation/networking/filter.txt
  4. Taken from: The BSD Packet Filter: A New Architecture for User-level Packet Capture http://www.tcpdump.org/papers/bpf-usenix93.pdf
  5. Note: this is only a subset of the BPF instruction set. Taken from https://www.kernel.org/doc/Documentation/networking/filter.txt
  6. https://www.kernel.org/doc/Documentation/networking/filter.txt
  7. https://www.kernel.org/doc/Documentation/networking/filter.txt
  8. taken from https://blog.cloudflare.com/bpf-the-forgotten-bytecode/
  9. JIT benchmark can be found here https://lwn.net/Articles/437986/
  10. http://lxr.free-electrons.com/source/net/core/filter.c?v=3.16#L1230
  11. http://www.brendangregg.com/blog/2015-05-15/ebpf-one-small-step.html