VLANs in Linux kernel
how simple things might get quite complicated
Jiří Pírko <jiri@mellanox.com>
Who am I?
● A Linux kernel developer/network developer
● First patch accepted to Linux kernel in October 2008 - book name fix in documentation :-)
● Author of a bonding driver replacement - team driver and libteam (http://libteam.org)
● Founder of automated and portable network testing framework called LNST (http://lnst-project.org)
● Started a “true open switch” initiative called switchdev
● Co-author of rocker qemu switch implementation and rocker driver
● Co-author of mlxsw - driver for Mellanox SwitchX-2 and Spectrum ASICs
VLAN use-case - problem
[Figure: two switches, each with Port 1 to Port 4, with both Coca-Cola and Pepsi attached]
VLAN use-case - solution
[Figure: the same two switches, each with Port 1 to Port 4, with Coca-Cola and Pepsi separated by VLAN]
VLAN ID 100 - Coca-Cola
VLAN ID 200 - Pepsi
802.1Q VLAN packets
Packet format:
Untagged: Destination MAC | Source MAC | EtherType/Size | Payload
Tagged: Destination MAC | Source MAC | 802.1Q header | EtherType/Size | Payload
802.1Q header format:
TPID (16 bits) | TCI: PCP (3 bits), DEI (1 bit), VID (12 bits)
● TPID (Tag protocol identifier): In the same position as EtherType/Size. It is set to 0x8100, which identifies an 802.1Q tagged packet and distinguishes it from untagged packets
● TCI (Tag control information)
○ PCP (Priority code point): Priority according to 802.1p, 7 is highest. Used for QoS
○ DEI (Drop eligible indicator): Formerly CFI. Indicates whether the packet is suitable for being dropped in case of congestion
○ VID (VLAN identifier): Specifies the VLAN to which the packet belongs. Values are in range 0-4094. Value 0 has a special meaning: it indicates that the packet does not belong to any VLAN. The purpose of that is to allow PCP to be used for non-VLAN packets
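The header layout above can be illustrated with a small userspace sketch. This is not kernel code: the helper names parse_vlan_tag and build_tci are made up for this example, and the frame is assumed to be a raw Ethernet frame starting at the destination MAC.

```python
import struct

TPID_8021Q = 0x8100  # value of TPID that marks an 802.1Q tagged frame


def build_tci(pcp, dei, vid):
    """Pack PCP (3 bits), DEI (1 bit) and VID (12 bits) into a 16-bit TCI."""
    return (pcp << 13) | (dei << 12) | vid


def parse_vlan_tag(frame):
    """Return (pcp, dei, vid) for a tagged frame, or None if it is untagged."""
    if len(frame) < 16:  # 12 bytes of MAC addresses + 4 bytes of 802.1Q header
        return None
    # TPID sits where EtherType/Size normally is, right after the two MACs.
    tpid, tci = struct.unpack_from("!HH", frame, 12)
    if tpid != TPID_8021Q:
        return None
    return tci >> 13, (tci >> 12) & 0x1, tci & 0x0FFF


# Hypothetical frame: zeroed MACs, VID 100 with priority 5, IPv4 inside.
frame = bytes(12) + struct.pack("!HH", TPID_8021Q, build_tci(5, 0, 100)) \
    + struct.pack("!H", 0x0800) + b"payload"
print(parse_vlan_tag(frame))
```

The 12-bit mask 0x0FFF is what limits the VID space, and the PCP bits stay usable even when the VID is 0.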
Used terms and colors
● struct net_device *dev
○ Referred to as dev, skb->dev
○ One instance for each network device
● struct sk_buff *skb
○ Referred to as skb
○ One instance for every incoming and outgoing packet
● struct net_device_ops *ops
○ Referred to as ops, dev->ops, ndos (net_device ops)
○ Set of callbacks that each driver defines for core to call
● Vlan data path - red
● Vlan accelerated data path - pink
VLAN userspace interfaces in Linux kernel
● Ioctl-based
○ Introduced along with the initial VLAN implementation in 2002
○ Userspace tool is called vconfig:
# vconfig add eth0 100
Added VLAN with VID == 100 to IF -:eth0:-
# ip address add 192.168.0.1/24 dev eth0.100
● Netlink-based
○ Introduced by following commit:
commit 07b5b17e157b7018d0ca40ca0d1581a23096fb45
Author: Patrick McHardy <kaber@trash.net>
Date: Wed Jun 13 12:07:54 2007 -0700
[VLAN]: Use rtnl_link API
○ Extends use of ip tool (a part of iproute2 package):
# ip link add link eth0 name eth0.100 type vlan id 100
# ip address add 192.168.0.1/24 dev eth0.100
Simplified RX path of packet in Linux kernel
[Figure: RX path. NIC driver (eth0): RX ring buffer desc, create skb, RX queue enqueue. Network core: RX queue dequeue, packet type “all” taps, hooks (rx_handler), packet type handlers. Receivers shown: Packet socket (e.g. tcpdump); bridge, bonding, team, macvlan, openvswitch, ...; ARP, IPv4, IPv6, ...]
Simplified TX path of packet in Linux kernel
[Figure: TX path. Senders shown: ARP, IPv4, IPv6, ...; Packet socket (e.g. tcpdump). Steps shown: create skb, dev_queue_xmit(), dev_queue_xmit_nit(), ndo_start_xmit, enqueue/schedule. NIC driver (eth0): TX ring buffer desc.]
Initial VLAN implementation
● Merged in February 2002
● Author: Ben Greear <greearb@candelatech.com>
● One net_device per VID
○ eth0 - real device
○ eth0.100 - vlan device for VID 100
○ eth0.200 - vlan device for VID 200
● On RX:
○ Hook on ETH_P_8021Q (0x8100) packet type with dev_add_pack()
○ Lookup the vlan net_device and adjust skb->dev accordingly
○ Reinject to RX path
● On TX:
○ Implement ops->ndo_start_xmit (was dev->hard_start_xmit at that time)
○ Get real device and set it to skb->dev
○ Reinject to TX path
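The RX and TX steps listed above can be mimicked with a toy userspace model. It is only a sketch: the function names are borrowed loosely from the kernel for readability, while the Skb class, the vlan_devs table and the delivered list are hypothetical stand-ins for kernel structures.

```python
import struct
from dataclasses import dataclass

ETH_P_8021Q = 0x8100
ETH_P_IP = 0x0800


@dataclass
class Skb:
    dev: str     # name of the device the packet currently belongs to
    data: bytes  # full Ethernet frame


# (real device, VID) -> vlan device; one vlan device per VID
vlan_devs = {("eth0", 100): "eth0.100", ("eth0", 200): "eth0.200"}
delivered = []  # (device, ethertype) pairs that reached a protocol handler


def vlan_rcv(skb):
    """RX: packet type handler hooked on ETH_P_8021Q."""
    (tci,) = struct.unpack_from("!H", skb.data, 14)
    vlan_dev = vlan_devs.get((skb.dev, tci & 0x0FFF))
    if vlan_dev is None:
        return  # no vlan device for this VID, drop
    skb.data = skb.data[:12] + skb.data[16:]  # pop the 4-byte vlan header
    skb.dev = vlan_dev                        # adjust skb->dev
    netif_rx(skb)                             # reinject to the RX path


def ip_rcv(skb):
    delivered.append((skb.dev, ETH_P_IP))


ptype_handlers = {ETH_P_8021Q: vlan_rcv, ETH_P_IP: ip_rcv}


def netif_rx(skb):
    """Dispatch by EtherType, like the packet type handlers in the core."""
    (ethertype,) = struct.unpack_from("!H", skb.data, 12)
    handler = ptype_handlers.get(ethertype)
    if handler is not None:
        handler(skb)


def vlan_dev_xmit(skb, real_dev, vid):
    """TX: push the vlan header and hand the skb over to the real device."""
    tag = struct.pack("!HH", ETH_P_8021Q, vid)
    skb.data = skb.data[:12] + tag + skb.data[12:]
    skb.dev = real_dev
    return skb  # the kernel would reinject this to the TX path
```

Sending a frame through vlan_dev_xmit() and feeding the result back into netif_rx() delivers the original untagged frame on eth0.100, which is the round trip the two lists describe.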
Initial VLAN implementation - RX path
[Figure: the simplified RX path with Vlan code (eth0.100) added as a packet type handler for type 0x8100 (802.1Q): Pop vlan header, Change skb->dev to vlan dev, Reinject.]
Initial VLAN implementation - TX path
[Figure: the simplified TX path with Vlan code (eth0.100) added. Its ndo_start_xmit: Push vlan header, Change skb->dev to real dev, Reinject.]
VLAN tagging/stripping HW acceleration
● Merged in March 2002
● Author: David S. Miller <davem@nuts.ninka.net>
● Went in together with significant code change
● NIC does vlan header pop and push in HW
● On RX:
○ Driver gets the info about vlan tagging from HW
○ Injects the packet into the RX path differently. It uses the vlan_hwaccel_rx() function
● On TX:
○ During vlan device creation, the accelerated path is selected if the real device has the NETIF_F_HW_VLAN_TX feature on
○ Vlan code puts TCI info including VID into the skb->cb cookie and sets skb->dev to the real device. Later this was moved from skb->cb to the dedicated skb->vlan_tci.
○ Reinject to TX path
○ Driver gets the info by vlan_tx_tag_get() and passes this info to HW along with the packet
VLAN tagging/stripping HW acceleration - RX path
[Figure: the RX path of the initial implementation (type 0x8100 (802.1Q) handler: Pop vlan header, Change skb->dev to vlan dev, Reinject) with an additional path into the Vlan code (eth0.100) labeled “vlan hwaccel RX”.]
VLAN tagging/stripping HW acceleration - TX path
[Figure: the TX path of the initial implementation (Vlan code (eth0.100) ndo_start_xmit: Push vlan header, Change skb->dev to real dev, Reinject) with an additional ndo_start_hwaccel_xmit: Set vlan skb->cb cookie, Change skb->dev to real dev, Reinject.]
VLAN filtering offload
● Merged in March 2002
● Author: David S. Miller <davem@nuts.ninka.net>
● Unknown vlan packets are filtered-out in HW
● Driver advertises filtering abilities with NETIF_F_HW_VLAN_FILTER feature bit
● Driver implements vlan_rx_register, vlan_rx_add_vid and vlan_rx_kill_vid ops
○ vlan_rx_register pushes down struct vlan_group, which is internal to vlan code. This turned out to be quite pointless but spread to a lot of drivers.
VLAN story is starting to get a bit sad
● In the meantime, GRO support was added
● Lots of functions that drivers may call under various circumstances to get a vlan packet down to the networking core
○ __vlan_hwaccel_rx
○ vlan_gro_receive
○ vlan_gro_frags
● vlan_hwaccel_do_receive(), which sets skb->dev, is split out from __vlan_hwaccel_rx():
commit 9b22ea560957de1484e6b3e8538f7eef202e3596
Author: Patrick McHardy <kaber@trash.net>
Date: Tue Nov 4 14:49:57 2008 -0800
net: fix packet socket delivery in rx irq handler
The changes to deliver hardware accelerated VLAN packets to packet
sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
The __vlan_hwaccel_rx() function is called directly from the drivers
RX function, for non-NAPI drivers that means its still in RX IRQ
Context.
....
● Bonding gets in the way. More later on.
VLAN model centralization
● Let the driver set skb->vlan_tci using __vlan_hwaccel_put_tag() and push the packet down to the networking core in the same way as non-vlan packets
● The vlan handling code is called from the middle of RX processing (after the packet type “all” taps)
● The patchset finishes with this patch:
commit 3701e51382a026cba10c60b03efabe534fba4ca4
Author: Jesse Gross <jesse@nicira.com>
Date: Wed Oct 20 13:56:06 2010 +0000
vlan: Centralize handling of hardware acceleration.
Currently each driver that is capable of vlan hardware acceleration
must be aware of the vlan groups that are configured and then pass
the stripped tag to a specialized receive function. This is
different from other types of hardware offload in that it places a
significant amount of knowledge in the driver itself rather keeping
it in the networking core.
....
VLAN model centralization - RX path
[Figure: RX path. Non-accelerated: type 0x8100 (802.1Q) packet type handler in Vlan code (eth0.100): Pop vlan header, Change skb->dev to vlan dev, Reinject. Accelerated: fill-up skb->vlan_tci, then Process skb->vlan_tci, Reinject.]
VLAN model centralization - TX path
[Figure: TX path. Vlan code (eth0.100) ndo_start_xmit: Set skb->vlan_tci, Change skb->dev to real dev, Reinject. Afterwards: Check if dev supports vlan accel, if not, push header.]
Accel and non-accel unification
● For RX path only, as TX part was taken care of in “centralization” patchset
● The idea is to “emulate” VLAN HW acceleration
● Strip the VLAN header early in the network core for the non-accelerated path and set skb->vlan_tci. Let the rest of the processing be the same as for the accelerated path.
commit bcc6d47903612c3861201cc3a866fb604f26b8b2
Author: Jiri Pirko <jpirko@redhat.com>
Date: Thu Apr 7 19:48:33 2011 +0000
net: vlan: make non-hw-accel rx path similar to hw-accel
Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.
For non-rx-vlan-hw-accel however, tagged skb goes thru whole
__netif_receive_skb, it's untagged in ptype_base hander and reinjected
This incosistency is fixed by this patch. Vlan untagging happens early in
__netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
see the skb like it was untagged by hw.
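The “emulation” idea can be sketched in userspace as follows. This is an illustrative model, not the kernel function: the Skb class and vlan_untag() are hypothetical, and a missing tag is modeled as vlan_tci being None, which is an assumption of this sketch.

```python
import struct
from dataclasses import dataclass
from typing import Optional

ETH_P_8021Q = 0x8100


@dataclass
class Skb:
    data: bytes                     # Ethernet frame as seen by the core
    vlan_tci: Optional[int] = None  # None: no tag recorded (sketch convention)


def vlan_untag(skb):
    """Make a software-received tagged skb look as if HW had stripped the tag."""
    if skb.vlan_tci is not None:
        return skb  # accelerated path: the NIC already filled vlan_tci
    if len(skb.data) >= 16:
        tpid, tci = struct.unpack_from("!HH", skb.data, 12)
        if tpid == ETH_P_8021Q:
            skb.vlan_tci = tci                        # set skb->vlan_tci
            skb.data = skb.data[:12] + skb.data[16:]  # pop the vlan header
    return skb


# Hypothetical frames: the same IPv4 packet with and without the tag in-line.
inner = bytes(12) + struct.pack("!H", 0x0800) + b"payload"
tagged = inner[:12] + struct.pack("!HH", ETH_P_8021Q, 100) + inner[12:]

sw = vlan_untag(Skb(tagged))              # non-accelerated NIC
hw = vlan_untag(Skb(inner, vlan_tci=100))  # accelerated NIC
print(sw == hw)
```

After this early step, the taps and rx_handlers in the model cannot tell which kind of NIC received the packet, which is the point of the unification.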
Accel and non-accel unification - RX path
[Figure: RX path. Non-accelerated: Pop vlan header, Set skb->vlan_tci early in the network core. Accelerated: fill-up skb->vlan_tci. Both then reach the Vlan code (eth0.100): Process skb->vlan_tci, Reinject.]
Stacked network devices
● Also called master-slave devices or upper-lower devices
● Bonding, Bridge, Team, Macvlan, Open vSwitch, …
● Master device is attached to slave device
○ On RX, master attaches rx_handler on slave and steals incoming packets
○ On TX, master calls dev_queue_xmit() of slave
● Forms a hierarchy, an example:
eth0 eth1 eth2
bond0
br0 192.168.0.1/24
VLAN issues in combination with stacked devices
● Ordering for RX
○ Does the vlan device take priority over the master device?
○ Or does the master device take priority over the vlan device?
○ More on next slide
● Vlan filter
○ Master has to propagate down ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid
○ Master has to replay filter setup if add_vid was called before enslavement
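The vlan filter requirement can be pictured with a toy model. The class and method names below are hypothetical and only echo the ndo names from the slide; a real master device has considerably more to handle.

```python
class Dev:
    """A lower (slave) device with a HW vlan filter."""

    def __init__(self, name):
        self.name = name
        self.hw_filter = set()  # VIDs the HW lets through

    def vlan_rx_add_vid(self, vid):
        self.hw_filter.add(vid)

    def vlan_rx_kill_vid(self, vid):
        self.hw_filter.discard(vid)


class Master:
    """An upper (master) device such as a bond."""

    def __init__(self, name):
        self.name = name
        self.vids = set()  # VIDs configured on top of the master
        self.slaves = []

    def vlan_rx_add_vid(self, vid):
        self.vids.add(vid)
        for slave in self.slaves:  # propagate down
            slave.vlan_rx_add_vid(vid)

    def vlan_rx_kill_vid(self, vid):
        self.vids.discard(vid)
        for slave in self.slaves:  # propagate down
            slave.vlan_rx_kill_vid(vid)

    def enslave(self, slave):
        self.slaves.append(slave)
        for vid in sorted(self.vids):  # replay filter setup done earlier
            slave.vlan_rx_add_vid(vid)


bond0, eth0, eth1 = Master("bond0"), Dev("eth0"), Dev("eth1")
bond0.enslave(eth0)
bond0.vlan_rx_add_vid(100)  # reaches eth0 by propagation
bond0.enslave(eth1)         # eth1 gets VID 100 by replay
print(eth0.hw_filter, eth1.hw_filter)
```

Without the replay in enslave(), eth1 in this model would filter out VID 100 in HW even though a vlan device exists on top of bond0.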
Stacked device with VLAN ordering fix
● For RX path only
● Changes the order so the vlan hook is called before rx_handler
commit 2425717b27eb92b175335ca4ff0bb218cbe0cb64
Author: John Fastabend <john.r.fastabend@intel.com>
Date: Mon Oct 10 09:16:41 2011 +0000
net: allow vlan traffic to be received under bond
The following configuration used to work as I expected. At least
we could use the fcoe interfaces to do MPIO and the bond0 iface
to do load balancing or failover.
....
This worked because of a change we added to allow inactive slaves
to rx 'exact' matches. This functionality was kept intact with the
rx_handler mechanism. However now the vlan interface attached to the
active slave never receives traffic because the bonding rx_handler
updates the skb->dev and goto's another_round. Previously, the
vlan_do_receive() logic was called before the bonding rx_handler.
....
Stacked device with VLAN ordering fix - RX path
[Figure: RX path with the same steps as after the accel and non-accel unification (Pop vlan header, Set skb->vlan_tci, fill-up skb->vlan_tci, Process skb->vlan_tci, Reinject), with the vlan hook now placed before hooks (rx_handler).]
VLAN Linux kernel implementation summary
● 14 years of development
● Over 500 commits
● Over 3500 lines of code (net/8021q/, include/linux/if_vlan.h)
● Lots of upset end-users and developers
Alternative VLAN implementation - in Linux bridge
● Merged in February 2013
● Author: Vlad Yasevich <vyasevic@redhat.com>
● Implements vlan filtering in bridge
● Simple example that allows packets with VID 100 to be forwarded between eth0 and eth1:
# ip link add name br0 type bridge
# ip link set dev br0 type bridge vlan_filtering 1
# ip link set eth0 master br0
# ip link set eth1 master br0
# bridge vlan add vid 100 dev eth0
# bridge vlan add vid 100 dev eth1
# bridge vlan show dev eth0
port vlan ids
eth0 1 PVID Egress Untagged
100
● To set PVID and Egress Untagged:
# bridge vlan add vid 100 dev eth0 untagged
# bridge vlan add vid 100 dev eth0 pvid
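What the PVID and Egress Untagged flags mean for a packet can be sketched with a small model. The Port class and the ingress()/egress() helpers are hypothetical and simplify the bridge behavior: untagged ingress traffic is classified into the PVID, and a VID marked untagged leaves the port without a tag.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Port:
    name: str
    vids: set = field(default_factory=set)      # VLANs allowed on the port
    pvid: Optional[int] = None                  # VLAN for untagged ingress
    untagged: set = field(default_factory=set)  # VLANs sent without a tag


def ingress(port, vid):
    """Return the VLAN the packet is classified into, or None if dropped."""
    if vid is None:  # untagged packet
        return port.pvid
    return vid if vid in port.vids else None


def egress(port, vid):
    """Return (allowed, tag on the wire); the tag is None when stripped."""
    if vid not in port.vids:
        return (False, None)
    return (True, None if vid in port.untagged else vid)


# Hypothetical setup: eth0 is an access-like port for VID 100,
# eth1 carries VID 100 tagged and keeps the default VID 1.
eth0 = Port("eth0", vids={1, 100}, pvid=100, untagged={100})
eth1 = Port("eth1", vids={1, 100}, pvid=1, untagged={1})

vlan = ingress(eth0, None)  # untagged packet arriving on eth0
print(vlan, egress(eth1, vlan))
```

In this setup an untagged packet entering eth0 leaves eth1 tagged with VID 100, and a VID 100 packet entering eth1 leaves eth0 untagged.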
Alternative VLAN implementation - in Open vSwitch
● OVS is an OpenFlow-motivated switch implementation
● Vlan support merged in October 2011 as a part of Open vSwitch kernel datapath introduction:
commit ccb1352e76cff0524e7ccb2074826a092dd13016
Author: Jesse Gross <jesse@nicira.com>
Date: Tue Oct 25 19:26:31 2011 -0700
net: Add Open vSwitch kernel components.
● It is possible to add flows that match packets based on the VID - “vlan flow key”
● There are vlan POP and vlan PUSH actions that can be chained to the flow match
recirc_id(0),in_port(2),eth(src=e4:1d:2d:a5:f3:9d,dst=e4:11:22:33:44:52),eth_type(0x8100),vlan(vid=53,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:34, bytes:3468, used:0.260s, actions:pop_vlan,5
recirc_id(0),in_port(5),eth(src=e4:11:22:33:44:52,dst=e4:1d:2d:a5:f3:9d),eth_type(0x0800),ipv4(frag=no), packets:35, bytes:3438, used:0.260s, actions:push_vlan(vid=53,pcp=0),2
● Some of the code is reused from the vlan code, some of it is implemented on top
● Fixed by:
commit 93515d53b133d66f01aec7b231fa3e40e3d2fd9a
Author: Jiri Pirko <jiri@resnulli.us>
Date: Wed Nov 19 14:05:02 2014 +0100
net: move vlan pop/push functions into common code
Alternative VLAN implementation - in TC
● Implemented as a part of Classifier-Action subsystem of TC (traffic control)
○ Classifiers are used to match on packets: cls_u32, cls_flower, cls_bpf, many others
○ Actions are executed on a successfully matched packet: act_gact, act_mirred, act_skbedit, act_bpf
○ Nice presentation about TC CA from Netdev 0.1: https://www.netdev01.org/sessions/21
● act_vlan was added to allow push and pop vlan header:
commit c7e2b9689ef81362a8091592da6cb6a7723f377a
Author: Jiri Pirko <jiri@resnulli.us>
Date: Wed Nov 19 14:05:03 2014 +0100
sched: introduce vlan action
● Simple example:
# tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 \
    action vlan push id 100 \
    action mirred egress redirect dev eth1
# tc filter add dev eth1 parent ffff: protocol all u32 match u32 0 0 \
    action vlan pop \
    action mirred egress redirect dev eth0
● There is a plan to extend cls_flower to allow matching on vlan headers
Alternative VLAN implementation - in BPF
● BPF - Berkeley Packet Filter
○ Implemented as a VM with specific instruction set and set of registers
○ The kernel interprets the BPF program inserted by the user
○ Originally served as a filter program attached to a socket, now used as a “universal in-kernel VM”
○ JIT support for many CPU architectures
○ Extension is called eBPF - more registers, added maps, etc.
● Vlan header info getter and header push and pop support introduced by:
commit c24973957975403521ca76a776c2dfd12fbe9add
Author: Alexei Starovoitov <ast@plumgrid.com>
Date: Mon Mar 16 18:06:02 2015 -0700
bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields
commit 4e10df9a60d96ced321dd2af71da558c6b750078
Author: Alexei Starovoitov <ast@plumgrid.com>
Date: Mon Jul 20 20:34:18 2015 -0700
bpf: introduce bpf_skb_vlan_push/pop() helpers
BPF usage for networking purposes
● TC clsact support added to iproute2 by:
commit 8f9afdd531560c1534be44424669add2e19deeec
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue Jan 12 01:42:20 2016 +0100
tc, clsact: add clsact frontend
Add the tc part for the kernel commit 1f211a1b929c ("net, sched: add
clsact qdisc"). Quoting example usage from that commit description:
Example, adding qdisc:
# tc qdisc add dev foo clsact
# tc qdisc show dev foo
qdisc mq 0: root
qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc clsact ffff: parent ffff:fff1
Adding filters (deleting, etc works analogous by specifying ingress/egress):
# tc filter add dev foo ingress bpf da obj bar.o sec ingress
# tc filter add dev foo egress bpf da obj bar.o sec egress
# tc filter show dev foo ingress
filter protocol all pref 49152 bpf
filter protocol all pref 49152 bpf handle 0x1 bar.o:[ingress] direct-action
# tc filter show dev foo egress
filter protocol all pref 49152 bpf
filter protocol all pref 49152 bpf handle 0x1 bar.o:[egress] direct-action
The ingress parent alias can also be used with ingress qdisc.
Questions?
Weitere ähnliche Inhalte

Was ist angesagt?

Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsHisaki Ohara
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Andriy Berestovskyy
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughThomas Graf
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingMichelle Holley
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack monad bobo
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking WalkthroughThomas Graf
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?Michelle Holley
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in GoCapturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in GoScyllaDB
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDPDaniel T. Lee
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabMichelle Holley
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingMichelle Holley
 

Was ist angesagt? (20)

Intel DPDK Step by Step instructions
Intel DPDK Step by Step instructionsIntel DPDK Step by Step instructions
Intel DPDK Step by Step instructions
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
DPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet ProcessingDPDK: Multi Architecture High Performance Packet Processing
DPDK: Multi Architecture High Performance Packet Processing
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack
 
eBPF/XDP
eBPF/XDP eBPF/XDP
eBPF/XDP
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
 
Hands-on ethernet driver
Hands-on ethernet driverHands-on ethernet driver
Hands-on ethernet driver
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in GoCapturing NIC and Kernel TX and RX Timestamps for Packets in Go
Capturing NIC and Kernel TX and RX Timestamps for Packets in Go
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
Faster packet processing in Linux: XDP
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
 
Userspace networking
Userspace networkingUserspace networking
Userspace networking
 
DPDK in Containers Hands-on Lab
DPDK in Containers Hands-on LabDPDK in Containers Hands-on Lab
DPDK in Containers Hands-on Lab
 
Ixgbe internals
Ixgbe internalsIxgbe internals
Ixgbe internals
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet Processing
 

Andere mochten auch

Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersKernel TLV
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackKernel TLV
 
Hardware Probing in the Linux Kernel
Hardware Probing in the Linux KernelHardware Probing in the Linux Kernel
Hardware Probing in the Linux KernelKernel TLV
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingKernel TLV
 
WiFi and the Beast
WiFi and the BeastWiFi and the Beast
WiFi and the BeastKernel TLV
 
Userfaultfd and Post-Copy Migration
Userfaultfd and Post-Copy MigrationUserfaultfd and Post-Copy Migration
Userfaultfd and Post-Copy MigrationKernel TLV
 
Switchdev - No More SDK
Switchdev - No More SDKSwitchdev - No More SDK
Switchdev - No More SDKKernel TLV
 
DMA Survival Guide
DMA Survival GuideDMA Survival Guide
DMA Survival GuideKernel TLV
 
Linux Security Overview
Linux Security OverviewLinux Security Overview
Linux Security OverviewKernel TLV
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesKernel TLV
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing LandscapeKernel TLV
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and DriversKernel TLV
 
grsecurity and PaX
grsecurity and PaXgrsecurity and PaX
grsecurity and PaXKernel TLV
 
Linux Locking Mechanisms
Linux Locking MechanismsLinux Locking Mechanisms
Linux Locking MechanismsKernel TLV
 
Linux Kernel Init Process
Linux Kernel Init ProcessLinux Kernel Init Process
Linux Kernel Init ProcessKernel TLV
 
High Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux KernelHigh Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux KernelKernel TLV
 
Linux Interrupts
Linux InterruptsLinux Interrupts
Linux InterruptsKernel TLV
 
Chapter16 designing distributed and internet systems
Chapter16 designing distributed and internet systemsChapter16 designing distributed and internet systems
Chapter16 designing distributed and internet systemsDhani Ahmad
 

Andere mochten auch (20)

Windows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel DevelopersWindows Internals for Linux Kernel Developers
Windows Internals for Linux Kernel Developers
 
Specializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network StackSpecializing the Data Path - Hooking into the Linux Network Stack
Specializing the Data Path - Hooking into the Linux Network Stack
 
Hardware Probing in the Linux Kernel
Hardware Probing in the Linux KernelHardware Probing in the Linux Kernel
Hardware Probing in the Linux Kernel
 
FD.IO Vector Packet Processing
FD.IO Vector Packet ProcessingFD.IO Vector Packet Processing
FD.IO Vector Packet Processing
 
WiFi and the Beast
WiFi and the BeastWiFi and the Beast
WiFi and the Beast
 
Userfaultfd and Post-Copy Migration
Userfaultfd and Post-Copy MigrationUserfaultfd and Post-Copy Migration
Userfaultfd and Post-Copy Migration
 
Linux IO
Linux IOLinux IO
Linux IO
 
Switchdev - No More SDK
Switchdev - No More SDKSwitchdev - No More SDK
Switchdev - No More SDK
 
DMA Survival Guide
DMA Survival GuideDMA Survival Guide
DMA Survival Guide
 
Linux Security Overview
Linux Security OverviewLinux Security Overview
Linux Security Overview
 
Linux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use CasesLinux Kernel Cryptographic API and Use Cases
Linux Kernel Cryptographic API and Use Cases
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
 
FreeBSD and Drivers
FreeBSD and DriversFreeBSD and Drivers
FreeBSD and Drivers
 
grsecurity and PaX
grsecurity and PaXgrsecurity and PaX
grsecurity and PaX
 
Linux Locking Mechanisms
Linux Locking MechanismsLinux Locking Mechanisms
Linux Locking Mechanisms
 
Linux Kernel Init Process
Linux Kernel Init ProcessLinux Kernel Init Process
Linux Kernel Init Process
 
High Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux KernelHigh Performance Storage Devices in the Linux Kernel
High Performance Storage Devices in the Linux Kernel
 
Linux Interrupts
Linux InterruptsLinux Interrupts
Linux Interrupts
 
My video is a file, now what?
My video is a file, now what?My video is a file, now what?
My video is a file, now what?
 
Chapter16 designing distributed and internet systems
Chapter16 designing distributed and internet systemsChapter16 designing distributed and internet systems
Chapter16 designing distributed and internet systems
 

Ähnlich wie VLANs in the Linux Kernel

SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stable
SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/StableSR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stable
SR-IOV, KVM and Emulex OneConnect 10Gbps cards on Debian/Stablejuet-y
 
Docker Networking with New Ipvlan and Macvlan Drivers
Docker Networking with New Ipvlan and Macvlan DriversDocker Networking with New Ipvlan and Macvlan Drivers
Docker Networking with New Ipvlan and Macvlan DriversBrent Salisbury
 

VLANs in the Linux Kernel

  • 1. VLANs in Linux kernel: how simple things might get quite complicated. Jiří Pírko <jiri@mellanox.com>
  • 2. Who am I?
      ● A Linux kernel developer/network developer
      ● First patch accepted to Linux kernel in October 2008 - book name fix in documentation :-)
      ● Author of a bonding driver replacement - team driver and libteam (http://libteam.org)
      ● Founder of an automated and portable network testing framework called LNST (http://lnst-project.org)
      ● Started a “true open switch” initiative called switchdev
      ● Co-author of rocker qemu switch implementation and rocker driver
      ● Co-author of mlxsw - driver for Mellanox SwitchX-2 and Spectrum ASICs
  • 3. VLAN use-case - problem [Diagram: two switches, each with Port 1 to Port 4, with Coca-Cola and Pepsi hosts attached]
  • 4. VLAN use-case - solution [Diagram: the same two switches and hosts, with VLAN ID 100 assigned to Coca-Cola and VLAN ID 200 assigned to Pepsi]
  • 5. 802.1Q VLAN packets
      Packet format (untagged): Destination MAC | Source MAC | EtherType/Size | Payload
      Packet format (tagged): Destination MAC | Source MAC | 802.1Q header | EtherType/Size | Payload
      802.1Q header format: TPID (16 bits) | TCI = PCP (3 bits) + DEI (1 bit) + VID (12 bits)
      ● TPID (Tag protocol identifier): In the same position as EtherType/Size in an untagged packet. It is set to 0x8100, which identifies an 802.1Q tagged packet and distinguishes it from untagged packets
      ● TCI (Tag control information)
        ○ PCP (Priority code point): Priority according to 802.1p, 7 is highest. Used for QoS
        ○ DEI (Drop eligible indicator): Formerly CFI. Indicates whether the packet is suitable for being dropped in case of congestion
        ○ VID (VLAN identifier): Specifies the VLAN to which the packet belongs. Values are in range 0-4094. Value 0 has a special meaning: it indicates that the packet does not belong to any VLAN. The purpose is to allow PCP to be used for non-VLAN packets
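The header layout on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `pack_tag` and `unpack_tag` are made up for this example and do not come from the slides or from the kernel.

```python
import struct

TPID_8021Q = 0x8100  # TPID value given on the slide


def pack_tag(pcp, dei, vid):
    """Build the 4-byte 802.1Q header: TPID (16 bits) + TCI (PCP 3, DEI 1, VID 12)."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("field out of range")
    tci = (pcp << 13) | (dei << 12) | vid
    # "!HH" = two unsigned 16-bit fields in network byte order
    return struct.pack("!HH", TPID_8021Q, tci)


def unpack_tag(tag):
    """Split a 4-byte 802.1Q header back into its TCI fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 1, "vid": tci & 0x0FFF}


tag = pack_tag(pcp=5, dei=0, vid=100)
print(tag.hex())        # 8100a064
print(unpack_tag(tag))  # {'pcp': 5, 'dei': 0, 'vid': 100}
```

The shifts mirror the bit widths above: PCP occupies the top 3 bits of the TCI, DEI the next bit, and VID the low 12 bits.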
  • 6. Used terms and colors
      ● struct net_device *dev
        ○ Referred to as dev, skb->dev
        ○ One instance for each network device
      ● struct sk_buff *skb
        ○ Referred to as skb
        ○ One instance for every incoming and outgoing packet
      ● struct net_device_ops *ops
        ○ Referred to as ops, dev->ops, ndos (net_device ops)
        ○ Set of callbacks that each driver defines for core to call
      ● Vlan data path - red
      ● Vlan accelerated data path - pink
  • 7. VLAN userspace interfaces in Linux kernel
      ● Ioctl-based
        ○ Introduced along with the initial VLAN implementation in 2002
        ○ Userspace tool is called vconfig:
            # vconfig add eth0 100
            Added VLAN with VID == 100 to IF -:eth0:-
            # ip address add 192.168.0.1/24 dev eth0.100
      ● Netlink-based
        ○ Introduced by the following commit:
            commit 07b5b17e157b7018d0ca40ca0d1581a23096fb45
            Author: Patrick McHardy <kaber@trash.net>
            Date: Wed Jun 13 12:07:54 2007 -0700
            [VLAN]: Use rtnl_link API
        ○ Extends use of the ip tool (a part of the iproute2 package):
            # ip link add link eth0 name eth0.100 type vlan id 100
            # ip address add 192.168.0.1/24 dev eth0.100
  • 8. Simplified RX path of packet in Linux kernel [Diagram. NIC driver (eth0): RX ring buffer desc → create skb → RX queue enqueue. Network core: RX queue dequeue → packet type “all” taps (packet socket, e.g. tcpdump) → hooks (rx_handler: bridge, bonding, team, macvlan, openvswitch, ...) → packet type handlers (ARP, IPv4, IPv6, ...)]
  • 9. Simplified TX path of packet in Linux kernel [Diagram. Protocols (ARP, IPv4, IPv6, ...): create skb. Network core: dev_queue_xmit(), enqueue/schedule, dev_queue_xmit_nit() (packet socket, e.g. tcpdump), ndo_start_xmit. NIC driver (eth0): TX ring buffer desc]
  • 10. Initial VLAN implementation
      ● Merged in February 2002
      ● Author: Ben Greear <greearb@candelatech.com>
      ● One net_device per VID
        ○ eth0 - real device
        ○ eth0.100 - vlan device for VID 100
        ○ eth0.200 - vlan device for VID 200
      ● On RX:
        ○ Hook on ETH_P_8021Q (0x8100) packet type with dev_add_pack()
        ○ Look up the vlan net_device and adjust skb->dev accordingly
        ○ Reinject to RX path
      ● On TX:
        ○ Implement ops->ndo_start_xmit (was dev->hard_start_xmit at that time)
        ○ Get real device and set it to skb->dev
        ○ Reinject to TX path
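The software push and pop described on this slide can be modelled in userspace. The following is a minimal sketch, not kernel code: `push_vlan`, `pop_vlan` and the `vlan_devs` table are hypothetical names, and a frame is just a `bytes` object starting with the two 6-byte MAC addresses.

```python
import struct

ETH_P_8021Q = 0x8100


def push_vlan(frame, vid, pcp=0):
    """TX side: insert an 802.1Q header after the two MAC addresses (12 bytes)."""
    tci = (pcp << 13) | vid
    return frame[:12] + struct.pack("!HH", ETH_P_8021Q, tci) + frame[12:]


def pop_vlan(frame):
    """RX side: return (vid, untagged_frame), or (None, frame) if the frame is untagged."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETH_P_8021Q:
        return None, frame
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF, frame[:12] + frame[16:]


# Hypothetical vlan devices stacked on eth0, keyed by VID (one device per VID).
vlan_devs = {100: "eth0.100", 200: "eth0.200"}

untagged = bytes(6) + bytes(6) + struct.pack("!H", 0x0800) + b"payload"
tagged = push_vlan(untagged, vid=100)
vid, inner = pop_vlan(tagged)
print(vlan_devs.get(vid), inner == untagged)  # eth0.100 True
```

The lookup in `vlan_devs` stands in for "look up the vlan net_device and adjust skb->dev"; the reinjection steps are left out of the model.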
  • 11. Initial VLAN implementation - RX path [Diagram: RX path of slide 8, plus Vlan code (eth0.100) registered as packet type handler for type 0x8100 (802.1Q): pop vlan header, change skb->dev to vlan dev, reinject]
  • 12. Initial VLAN implementation - TX path [Diagram: TX path of slide 9, plus Vlan code (eth0.100) ndo_start_xmit: push vlan header, change skb->dev to real dev, reinject]
  • 13. VLAN tagging/stripping HW acceleration
      ● Merged in March 2002
      ● Author: David S. Miller <davem@nuts.ninka.net>
      ● Went in together with a significant code change
      ● NIC does vlan header pop and push in HW
      ● On RX:
        ○ Driver gets the info about vlan tagging from HW
        ○ Injects the packet into the RX path differently. It uses the vlan_hwaccel_rx function
      ● On TX:
        ○ During vlan device creation, the accelerated path is selected if the real device has the NETIF_F_HW_VLAN_TX feature on
        ○ Vlan code puts TCI info including VID into the skb->cb cookie and sets skb->dev to the real device. Later this was moved from skb->cb to a dedicated skb->vlan_tci field.
        ○ Reinject to TX path
        ○ Driver gets the info with vlan_tx_tag_get() and passes it to HW along with the packet
  • 14. VLAN tagging/stripping HW acceleration - RX path [Diagram: RX path of slide 11, plus a path labeled “vlan hwaccel RX” into Vlan code (eth0.100)]
  • 15. VLAN tagging/stripping HW acceleration - TX path [Diagram: TX path of slide 12, plus Vlan code (eth0.100) ndo_start_hwaccel_xmit: set vlan skb->cb cookie, change skb->dev to real dev, reinject]
  • 16. VLAN filtering offload
      ● Merged in March 2002
      ● Author: David S. Miller <davem@nuts.ninka.net>
      ● Unknown vlan packets are filtered out in HW
      ● Driver advertises filtering abilities with the NETIF_F_HW_VLAN_FILTER feature bit
      ● Driver implements vlan_rx_register, vlan_rx_add_vid and vlan_rx_kill_vid ops
        ○ vlan_rx_register pushes down struct vlan_group, which is internal to vlan code. This turned out to be quite pointless but spread to a lot of drivers.
  • 17. VLAN story is starting to get a bit sad
      ● In the meantime, GRO support was added
      ● A lot of functions that drivers may call under various circumstances to get a vlan packet down to the networking core
        ○ __vlan_hwaccel_rx
        ○ vlan_gro_receive
        ○ vlan_gro_frags
      ● vlan_hwaccel_do_receive(), which sets skb->dev, is split out from __vlan_hwaccel_rx():
          commit 9b22ea560957de1484e6b3e8538f7eef202e3596
          Author: Patrick McHardy <kaber@trash.net>
          Date: Tue Nov 4 14:49:57 2008 -0800
          net: fix packet socket delivery in rx irq handler
          The changes to deliver hardware accelerated VLAN packets to packet sockets (commit bc1d0411) caused a warning for non-NAPI drivers. The __vlan_hwaccel_rx() function is called directly from the drivers RX function, for non-NAPI drivers that means its still in RX IRQ Context. ....
      ● Bonding gets in the way. More later on.
  • 18. VLAN model centralization
      ● Let the driver set skb->vlan_tci using __vlan_hwaccel_put_tag() and push the packet down to the networking core in the same way as non-vlan packets
      ● The vlan handling code is called from the middle of RX processing (after packet type “all” taps)
      ● Patchset finishes with patch:
          commit 3701e51382a026cba10c60b03efabe534fba4ca4
          Author: Jesse Gross <jesse@nicira.com>
          Date: Wed Oct 20 13:56:06 2010 +0000
          vlan: Centralize handling of hardware acceleration.
          Currently each driver that is capable of vlan hardware acceleration must be aware of the vlan groups that are configured and then pass the stripped tag to a specialized receive function. This is different from other types of hardware offload in that it places a significant amount of knowledge in the driver itself rather keeping it in the networking core. ....
  • 19. VLAN model centralization - RX path [Diagram: RX path of slide 8, with the NIC driver step “fill-up skb->vlan_tci”, the network core step “Process skb->vlan_tci, Reinject” into Vlan code (eth0.100), and the non-accelerated handler for type 0x8100 (802.1Q): pop vlan header, change skb->dev to vlan dev, reinject]
  • 20. VLAN model centralization - TX path [Diagram: TX path of slide 9, plus Vlan code (eth0.100) ndo_start_xmit: set skb->vlan_tci, change skb->dev to real dev, reinject. Network core: check if dev supports vlan accel, if not, push header]
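The centralized TX decision ("check if dev supports vlan accel, if not, push header") might be pictured roughly like this. It is a userspace model under stated assumptions: `Dev`, `Skb`, `vlan_dev_xmit` and `core_xmit` are names invented for the sketch, and `hw_vlan_tx` merely stands in for the NETIF_F_HW_VLAN_TX feature bit.

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dev:
    name: str
    hw_vlan_tx: bool  # models the NETIF_F_HW_VLAN_TX feature bit


@dataclass
class Skb:
    data: bytes
    dev: Dev
    vlan_tci: Optional[int] = None  # out-of-band tag, like skb->vlan_tci


def core_xmit(skb):
    """Core: push the header in software only if the device cannot insert it in HW."""
    if skb.vlan_tci is not None and not skb.dev.hw_vlan_tx:
        tag = struct.pack("!HH", 0x8100, skb.vlan_tci)
        skb.data = skb.data[:12] + tag + skb.data[12:]
        skb.vlan_tci = None
    return skb


def vlan_dev_xmit(skb, real_dev, vid):
    """Vlan device xmit: record the tag, switch to the real device, reinject."""
    skb.vlan_tci = vid
    skb.dev = real_dev
    return core_xmit(skb)


frame = bytes(12) + b"\x08\x00" + b"payload"
accel = vlan_dev_xmit(Skb(frame, Dev("eth0.100", False)), Dev("eth0", True), 100)
plain = vlan_dev_xmit(Skb(frame, Dev("eth1.100", False)), Dev("eth1", False), 100)
print(accel.vlan_tci, len(accel.data) - len(frame))  # 100 0
print(plain.vlan_tci, len(plain.data) - len(frame))  # None 4
```

With this split, the vlan device no longer needs separate accelerated and non-accelerated xmit callbacks; only the core looks at the device capability.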
  • 21. Accel and non-accel unification
      ● For RX path only, as the TX part was taken care of in the “centralization” patchset
      ● The idea is to “emulate” VLAN HW acceleration
      ● Untag the VLAN header for the non-accelerated path early in the network core and set skb->vlan_tci. Let the rest of the processing be the same as for the accelerated path.
          commit bcc6d47903612c3861201cc3a866fb604f26b8b2
          Author: Jiri Pirko <jpirko@redhat.com>
          Date: Thu Apr 7 19:48:33 2011 +0000
          net: vlan: make non-hw-accel rx path similar to hw-accel
          Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into vlan code in __netif_receive_skb - vlan_hwaccel_do_receive. For non-rx-vlan-hw-accel however, tagged skb goes thru whole __netif_receive_skb, it's untagged in ptype_base hander and reinjected This incosistency is fixed by this patch. Vlan untagging happens early in __netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers) see the skb like it was untagged by hw.
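The "emulate HW acceleration" idea can be illustrated with a small model in which an skb is a plain dict. The function name `early_untag` and the dict layout are assumptions made for this sketch, not kernel interfaces.

```python
import struct

ETH_P_8021Q = 0x8100


def early_untag(skb):
    """Emulate HW stripping: move an in-band 802.1Q tag into skb['vlan_tci']."""
    data = skb["data"]
    (ethertype,) = struct.unpack("!H", data[12:14])
    if skb["vlan_tci"] is None and ethertype == ETH_P_8021Q:
        (tci,) = struct.unpack("!H", data[14:16])
        skb = {"data": data[:12] + data[16:], "vlan_tci": tci}
    return skb


macs = bytes(12)
inner = struct.pack("!H", 0x0800) + b"payload"

# Accelerated NIC: tag already stripped, TCI delivered out of band.
hw_skb = {"data": macs + inner, "vlan_tci": 100}
# Non-accelerated NIC: tag still sits inside the frame.
sw_skb = {"data": macs + struct.pack("!HH", ETH_P_8021Q, 100) + inner, "vlan_tci": None}

print(early_untag(hw_skb) == early_untag(sw_skb))  # True
```

After the early step both skbs look identical, so taps, rx_handlers and the vlan code only need to handle one representation.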
  • 22. Accel and non-accel unification - RX path [Diagram: RX path of slide 8, with “Pop vlan header, Set skb->vlan_tci” for the non-accelerated path, “fill-up skb->vlan_tci” for the accelerated path, and “Process skb->vlan_tci, Reinject” into Vlan code (eth0.100)]
  • 23. Stacked network devices
      ● Also called master-slave devices or upper-lower devices
      ● Bonding, Bridge, Team, Macvlan, Open vSwitch, …
      ● Master device is attached to slave device
        ○ On RX, master attaches rx_handler on slave and steals incoming packets
        ○ On TX, master calls dev_queue_xmit() of slave
      ● Forms a hierarchy, an example: [Diagram: eth0, eth1, eth2, bond0 and br0, with address 192.168.0.1/24]
  • 24. VLAN issues in combination with stacked devices
      ● Ordering for RX
        ○ Does the vlan device get higher priority than the master device?
        ○ Or does the master device get higher priority than the vlan device?
        ○ More on next slide
      ● Vlan filter
        ○ Master has to propagate ndo_vlan_rx_add_vid/ndo_vlan_rx_kill_vid down to its slaves
        ○ Master has to replay filter setup if add_vid was called before enslavement
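The two vlan filter obligations of a master device (propagate down, replay on enslavement) can be sketched as follows. The `Dev` class and its methods are a hypothetical model, with method names loosely echoing the ndo names above; it ignores the case where a slave has its own vlan with the same VID.

```python
class Dev:
    """Toy device that tracks which VIDs its (modelled) HW filter lets through."""

    def __init__(self, name):
        self.name = name
        self.vids = set()
        self.slaves = []

    def vlan_rx_add_vid(self, vid):
        self.vids.add(vid)
        for slave in self.slaves:  # propagate down the stack
            slave.vlan_rx_add_vid(vid)

    def vlan_rx_kill_vid(self, vid):
        self.vids.discard(vid)
        for slave in self.slaves:
            slave.vlan_rx_kill_vid(vid)

    def enslave(self, slave):
        self.slaves.append(slave)
        for vid in sorted(self.vids):  # replay filter setup done before enslavement
            slave.vlan_rx_add_vid(vid)


bond0, eth0 = Dev("bond0"), Dev("eth0")
bond0.vlan_rx_add_vid(100)   # vlan created on bond0 before eth0 joins
bond0.enslave(eth0)          # replay: eth0 learns VID 100
bond0.vlan_rx_add_vid(200)   # propagated to eth0 immediately
print(sorted(eth0.vids))     # [100, 200]
```

Without the replay in `enslave`, eth0's filter would drop VID 100 traffic even though bond0.100 exists.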
  • 25. Stacked device with VLAN ordering fix
    ● For RX path only
    ● Changes the order so the vlan hook is called before rx_handler

    commit 2425717b27eb92b175335ca4ff0bb218cbe0cb64
    Author: John Fastabend <john.r.fastabend@intel.com>
    Date: Mon Oct 10 09:16:41 2011 +0000

        net: allow vlan traffic to be received under bond

        The following configuration used to work as I expected. At least
        we could use the fcoe interfaces to do MPIO and the bond0 iface
        to do load balancing or failover.
        ....
        This worked because of a change we added to allow inactive slaves
        to rx 'exact' matches. This functionality was kept intact with the
        rx_handler mechanism. However now the vlan interface attached to the
        active slave never receives traffic because the bonding rx_handler
        updates the skb->dev and goto's another_round. Previously, the
        vlan_do_receive() logic was called before the bonding rx_handler.
        ....
  • 26. Stacked device with VLAN ordering fix - RX path
    [Diagram: RX path of a packet from the NIC driver through the network core, with the vlan hook called before rx_handler. Labels:]
    ● NIC driver (eth0): RX ring buffer desc, create skb, fill-up skb->vlan_tci, RX queue enqueue
    ● Network core: RX queue dequeue, Pop vlan header, Set skb->vlan_tci, packet type “all” taps, hooks (rx_handler), packet type handlers
    ● Vlan code (eth0.100): Process skb->vlan_tci, Reinject
    ● Packet socket (e.g. tcpdump)
    ● bridge, bonding, team, macvlan, openvswitch, ...
    ● ARP, IPv4, IPv6, ...
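Why the order of the vlan hook and rx_handler matters can be shown with a small simulation. This is a toy model under stated assumptions: devices and skbs are plain dicts, the `master` entry stands in for a registered rx_handler, and the function names are invented for this sketch.

```python
VLAN_VID_MASK = 0x0FFF  # lower 12 bits of the TCI hold the VID


def _vlan_step(skb, dev):
    """Hand the skb to a vlan device configured on dev, if there is one."""
    tci = skb["vlan_tci"]
    if tci is None:
        return False
    vlan_dev = dev["vlans"].get(tci & VLAN_VID_MASK)
    if vlan_dev is None:
        return False
    skb["dev"] = vlan_dev
    skb["vlan_tci"] = None
    return True


def _rx_handler_step(skb, dev):
    """Let the master steal the skb, as a bonding rx_handler would."""
    if dev["master"] is None:
        return False
    skb["dev"] = dev["master"]
    return True


def receive(skb, vlan_first):
    """Return the name of the device that finally receives the packet."""
    while True:  # each iteration models one "another_round"
        dev = skb["dev"]
        steps = [_vlan_step, _rx_handler_step]
        if not vlan_first:
            steps.reverse()
        # The first step that changes skb["dev"] restarts processing.
        if not any(step(skb, dev) for step in steps):
            return dev["name"]


bond0 = {"name": "bond0", "vlans": {}, "master": None}
eth0_100 = {"name": "eth0.100", "vlans": {}, "master": None}
eth0 = {"name": "eth0", "vlans": {100: eth0_100}, "master": bond0}
```

With `vlan_first=False` a frame tagged with VID 100 arriving on eth0 ends up on bond0, so eth0.100 never sees it, which is the situation the quoted commit message describes. With `vlan_first=True` the same frame reaches eth0.100, while untagged traffic still goes to bond0.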
  • 27. VLAN Linux kernel implementation summary
    ● 14 years of development
    ● Over 500 commits
    ● Over 3500 lines of code (net/8021q/, include/linux/if_vlan.h)
    ● Lots of upset end-users and developers
  • 28. Alternative VLAN implementation - in Linux bridge
    ● Merged in February 2013
    ● Author: Vlad Yasevich <vyasevic@redhat.com>
    ● Implements vlan filtering in bridge
    ● Simple example that allows packets with VID 100 to be forwarded between eth0 and eth1:

        # ip link add name br0 type bridge
        # ip link set dev br0 type bridge vlan_filtering 1
        # ip link set eth0 master br0
        # ip link set eth1 master br0
        # bridge vlan add vid 100 dev eth0
        # bridge vlan add vid 100 dev eth1
        # bridge vlan show dev eth0
        port    vlan ids
        eth0     1 PVID Egress Untagged
                 100

    ● To set PVID and Egress Untagged:

        # bridge vlan add vid 100 dev eth0 untagged
        # bridge vlan add vid 100 dev eth0 pvid
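The meaning of the per-port VID list, PVID and Egress Untagged can be sketched with a simplified forwarding decision. This is a toy model, not the bridge code: `Port`, `vlan_add` and `forward` are names made up for this example, and details such as how repeated `bridge vlan add` calls update flags are ignored.

```python
class Port:
    """Toy bridge port with a VLAN table."""

    def __init__(self, name):
        self.name = name
        self.vids = {}    # VID -> True if marked "Egress Untagged"
        self.pvid = None  # VID assigned to untagged ingress frames

    def vlan_add(self, vid, pvid=False, untagged=False):
        self.vids[vid] = untagged
        if pvid:
            self.pvid = vid


def forward(in_port, out_port, vid):
    """Decide what happens to a frame going from in_port to out_port.

    vid is the VID carried by the incoming frame, or None if untagged.
    Returns ("drop", None) or ("forward", vid_on_wire), where
    vid_on_wire is None when the frame leaves untagged.
    """
    if vid is None:
        vid = in_port.pvid          # untagged frames get the port's PVID
    if vid is None or vid not in in_port.vids:
        return ("drop", None)       # ingress filter
    if vid not in out_port.vids:
        return ("drop", None)       # egress filter
    return ("forward", None if out_port.vids[vid] else vid)


# Mirror the example above: VID 1 is PVID and Egress Untagged, VID 100 is tagged.
eth0, eth1 = Port("eth0"), Port("eth1")
for port in (eth0, eth1):
    port.vlan_add(1, pvid=True, untagged=True)
    port.vlan_add(100)
```

In this model a frame tagged with VID 100 is forwarded with its tag, a frame tagged with VID 200 is dropped, and an untagged frame is classified into VID 1 and leaves untagged.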
  • 29. Alternative VLAN implementation - in Open vSwitch
    ● OVS is an OpenFlow motivated switch implementation
    ● Vlan support merged in October 2011 as a part of Open vSwitch kernel datapath introduction:

    commit ccb1352e76cff0524e7ccb2074826a092dd13016
    Author: Jesse Gross <jesse@nicira.com>
    Date: Tue Oct 25 19:26:31 2011 -0700

        net: Add Open vSwitch kernel components.

    ● It is possible to add flows that match packets based on the VID - “vlan flow key”
    ● There are vlan POP and vlan PUSH actions that can be chained to the flow match:

        recirc_id(0),in_port(2),eth(src=e4:1d:2d:a5:f3:9d,dst=e4:11:22:33:44:52),eth_type(0x8100),
        vlan(vid=53,pcp=0),encap(eth_type(0x0800),ipv4(frag=no)), packets:34, bytes:3468, used:0.260s,
        actions:pop_vlan,5
        recirc_id(0),in_port(5),eth(src=e4:11:22:33:44:52,dst=e4:1d:2d:a5:f3:9d),eth_type(0x0800),
        ipv4(frag=no), packets:35, bytes:3438, used:0.260s, actions:push_vlan(vid=53,pcp=0),2

    ● Some of the code is reused from the vlan code, some of it is implemented on top
    ● Fixed by:

    commit 93515d53b133d66f01aec7b231fa3e40e3d2fd9a
    Author: Jiri Pirko <jiri@resnulli.us>
    Date: Wed Nov 19 14:05:02 2014 +0100

        net: move vlan pop/push functions into common code
  • 30. Alternative VLAN implementation - in TC
    ● Implemented as a part of Classifier-Action subsystem of TC (traffic control)
      ○ Classifiers are used to match on packets: cls_u32, cls_flower, cls_bpf, many others
      ○ Actions are executed on a successfully matched packet: act_gact, act_mirred, act_skbedit, act_bpf
      ○ Nice presentation about TC CA from Netdev 0.1: https://www.netdev01.org/sessions/21
    ● act_vlan was added to allow pushing and popping the vlan header:

    commit c7e2b9689ef81362a8091592da6cb6a7723f377a
    Author: Jiri Pirko <jiri@resnulli.us>
    Date: Wed Nov 19 14:05:03 2014 +0100

        sched: introduce vlan action

    ● Simple example:

        # tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 action vlan push id 100 action mirred egress redirect dev eth1
        # tc filter add dev eth1 parent ffff: protocol all u32 match u32 0 0 action vlan pop action mirred egress redirect dev eth0

    ● There is a plan to extend cls_flower to allow matching on vlan headers
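What the push and pop actions do to the frame bytes can be sketched as follows. This is a user-space illustration, not the act_vlan implementation; the function names `vlan_push` and `vlan_pop` are chosen for this example.

```python
import struct

ETH_P_8021Q = 0x8100  # TPID of an 802.1Q tag
MAC_HDR_LEN = 12      # destination MAC + source MAC
VLAN_HLEN = 4         # TPID + TCI


def vlan_push(frame: bytes, vid: int, pcp: int = 0) -> bytes:
    """Insert an 802.1Q header right after the MAC addresses."""
    if not 0 <= vid <= 0x0FFF:
        raise ValueError("VID must fit in 12 bits")
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits")
    tci = (pcp << 13) | vid  # DEI bit left at 0
    tag = struct.pack("!HH", ETH_P_8021Q, tci)
    return frame[:MAC_HDR_LEN] + tag + frame[MAC_HDR_LEN:]


def vlan_pop(frame: bytes) -> bytes:
    """Remove the outermost 802.1Q header; leave untagged frames alone."""
    if len(frame) < MAC_HDR_LEN + VLAN_HLEN:
        return frame
    (tpid,) = struct.unpack_from("!H", frame, MAC_HDR_LEN)
    if tpid != ETH_P_8021Q:
        return frame
    return frame[:MAC_HDR_LEN] + frame[MAC_HDR_LEN + VLAN_HLEN:]
```

In terms of the tc example above, frames redirected from eth0 to eth1 would go through the equivalent of `vlan_push(frame, 100)`, and frames in the opposite direction through `vlan_pop(frame)`.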
  • 31. Alternative VLAN implementation - in BPF
    ● BPF - Berkeley Packet Filter
      ○ Implemented as a VM with a specific instruction set and set of registers
      ○ Kernel interprets the BPF program inserted by the user
      ○ Originally used for filter programs attached to a socket, now used as a “universal in-kernel VM”
      ○ JIT support for many CPU architectures
      ○ Extension is called eBPF - more registers, added maps, etc.
    ● Vlan header info getter and header push and pop support introduced by:

    commit c24973957975403521ca76a776c2dfd12fbe9add
    Author: Alexei Starovoitov <ast@plumgrid.com>
    Date: Mon Mar 16 18:06:02 2015 -0700

        bpf: allow BPF programs access 'protocol' and 'vlan_tci' fields

    commit 4e10df9a60d96ced321dd2af71da558c6b750078
    Author: Alexei Starovoitov <ast@plumgrid.com>
    Date: Mon Jul 20 20:34:18 2015 -0700

        bpf: introduce bpf_skb_vlan_push/pop() helpers
  • 32. BPF usage for networking purposes
    ● TC clsact support added to iproute2 by:

    commit 8f9afdd531560c1534be44424669add2e19deeec
    Author: Daniel Borkmann <daniel@iogearbox.net>
    Date: Tue Jan 12 01:42:20 2016 +0100

        tc, clsact: add clsact frontend

        Add the tc part for the kernel commit 1f211a1b929c ("net, sched: add
        clsact qdisc"). Quoting example usage from that commit description:

        Example, adding qdisc:

          # tc qdisc add dev foo clsact
          # tc qdisc show dev foo
          qdisc mq 0: root
          qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
          qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
          qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
          qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
          qdisc clsact ffff: parent ffff:fff1

        Adding filters (deleting, etc works analogous by specifying ingress/egress):

          # tc filter add dev foo ingress bpf da obj bar.o sec ingress
          # tc filter add dev foo egress bpf da obj bar.o sec egress
          # tc filter show dev foo ingress
          filter protocol all pref 49152 bpf
          filter protocol all pref 49152 bpf handle 0x1 bar.o:[ingress] direct-action
          # tc filter show dev foo egress
          filter protocol all pref 49152 bpf
          filter protocol all pref 49152 bpf handle 0x1 bar.o:[egress] direct-action

        The ingress parent alias can also be used with ingress qdisc.