Networking in Userspace
   Living on the edge




  Stephen Hemminger
  stephen@networkplumber.org
Problem Statement
[Chart: packets per second (bidirectional), 0 to 20,000,000, plotted against packet size from 64 to 1504 bytes. Source: Intel: DPDK Overview]
Server vs Infrastructure
                     Server Packets   Network Infrastructure
Packet Size          1024 bytes       64 bytes
Packets/second       1.2 Million      14.88 Million
Inter-arrival time   835 ns           67.2 ns
2 GHz Clock          1670 cycles      135 cycles
3 GHz Clock          2505 cycles      201 cycles

L3 hit on Intel® Xeon®: ~40 cycles
L3 miss, memory read: 201 cycles at 3 GHz, which is the whole per-packet budget for 64-byte packets
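
The figures in this table can be reproduced with a little arithmetic. The sketch below assumes a 10 Gbit/s Ethernet link and 20 bytes of per-frame wire overhead (preamble, start delimiter, inter-frame gap); neither is stated on the slide, and the function names are illustrative. The slide rounds the results (134.4 is shown as 135, 201.6 as 201).

```python
# Per-packet time and cycle budget at line rate.
# Assumptions (not from the slide): 10 Gbit/s link, 20 bytes of wire
# overhead per frame (7 preamble + 1 start delimiter + 12 inter-frame gap).
LINK_BITS_PER_SECOND = 10_000_000_000
WIRE_OVERHEAD_BYTES = 20


def arrival_interval_ns(frame_bytes: int) -> float:
    """Time one frame occupies the wire, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return bits_on_wire * 1e9 / LINK_BITS_PER_SECOND


def packets_per_second(frame_bytes: int) -> float:
    """Maximum frame rate in one direction at line rate."""
    return 1e9 / arrival_interval_ns(frame_bytes)


def cycle_budget(frame_bytes: int, clock_ghz: float) -> float:
    """CPU cycles one core has per packet at the given clock speed."""
    return arrival_interval_ns(frame_bytes) * clock_ghz


if __name__ == "__main__":
    for size in (1024, 64):
        print(
            size,
            round(arrival_interval_ns(size), 1),
            round(packets_per_second(size)),
            round(cycle_budget(size, 2.0), 1),
            round(cycle_budget(size, 3.0), 1),
        )
```

With a budget of roughly 200 cycles per 64-byte packet, a single memory read that misses L3 uses up the whole budget.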
Traditional Linux networking
TCP Offload Engine
Good old sockets




Flexible, portable but slow
Memory mapped buffers




Efficient, but still constrained by architecture
Run in kernel
The OpenOnload architecture

      Network hardware provides a user-safe interface which
      can route Ethernet packets to an application context
      based on flow information contained within headers
      [Diagram: one kernel context and two application contexts above a
      shared Network Adaptor, with Application, Protocol, Driver and DMA layers]

           – No new protocols

The OpenOnload architecture

      Protocol processing can take place both in the
      application and kernel context for a given flow

      [Diagram: one kernel context and two application contexts above a
      shared Network Adaptor, with Application, Protocol, Driver and DMA layers]

           – Enables persistent / asynchronous processing
           – Maintains existing network control-plane

The OpenOnload architecture

      Protocol state is shared between the kernel and
      application contexts through a protected shared
      memory communications channel
      [Diagram: one kernel context and two application contexts above a
      shared Network Adaptor, with Application, Protocol, Driver and DMA layers]

           – Enables correct handling of protocol state with high performance

Performance metrics

      Overhead
           – Networking overheads take CPU time away from your application

      Latency
           – Holds your application up when it has nothing else to do
           – H/W + flight time + overhead

      Bandwidth
           – Dominates latency when messages are large
           – Limited by: algorithms, buffering and overhead

      Scalability
           – Determines how overhead grows as you add cores, memory, threads, sockets
             etc.


Anatomy of kernel-based networking

A user-level architecture?

Direct & safe hardware access

Some performance results


      Test platform: typical commodity server
           – Intel Clovertown 2.3 GHz quad-core Xeon (x1),
             1.3 GHz FSB, 2 GB RAM
           – Intel 5000X chipset
           – Solarflare Solarstorm SFC4000 (B) controller, CX4
           – Back-to-back
           – Red Hat Enterprise Linux 5 (2.6.18-8.el5)




Performance: Latency and overhead

      TCP ping-pong with 4 byte payload
      70 byte frame: 14+20+20+12+4

                       ½ round-trip latency    CPU overhead
                         (microseconds)       (microseconds)
 Hardware                      4.2                  --

 Kernel                       11.2                 7.0

 Onload                        5.3                 1.1
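
The CPU overhead column appears to be the half round-trip latency minus the hardware latency, and the frame size is the sum of the five numbers on the slide. A quick check follows; the header names attached to 14+20+20+12+4 are an assumption, since the slide gives only the numbers.

```python
# Figures copied from the latency table (microseconds).
HARDWARE_HALF_RTT_US = 4.2
HALF_RTT_US = {"Kernel": 11.2, "Onload": 5.3}

# Assumed meaning of 14+20+20+12+4; the slide does not label the terms.
FRAME_PARTS_BYTES = {
    "Ethernet header": 14,
    "IPv4 header": 20,
    "TCP header": 20,
    "TCP options": 12,
    "payload": 4,
}


def software_overhead_us(stack: str) -> float:
    """Latency added on top of the hardware path, rounded to 0.1 us."""
    return round(HALF_RTT_US[stack] - HARDWARE_HALF_RTT_US, 1)


def frame_bytes() -> int:
    """Total frame size from the per-layer breakdown."""
    return sum(FRAME_PARTS_BYTES.values())


if __name__ == "__main__":
    print("frame size:", frame_bytes(), "bytes")
    for stack in HALF_RTT_US:
        print(stack, "adds", software_overhead_us(stack), "us over hardware")
```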


Performance: Streaming bandwidth

Performance: UDP transmit

      Message rate:
           – 4 byte UDP payload (46 byte frame)

                               Kernel      Onload


 1 sender                      473,000     2,030,000


 2 senders                     532,000     3,880,000
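
Two ratios can be read off this table: how much faster Onload is than the kernel, and how each stack scales from one sender to two. A small sketch using only the numbers above (the function names are illustrative):

```python
# Messages per second, copied from the UDP transmit table.
RATES = {
    1: {"Kernel": 473_000, "Onload": 2_030_000},
    2: {"Kernel": 532_000, "Onload": 3_880_000},
}


def speedup(senders: int) -> float:
    """Onload message rate divided by kernel message rate."""
    return RATES[senders]["Onload"] / RATES[senders]["Kernel"]


def scaling(stack: str) -> float:
    """Rate with two senders divided by rate with one sender."""
    return RATES[2][stack] / RATES[1][stack]


if __name__ == "__main__":
    for senders in RATES:
        print(senders, "sender(s): Onload/kernel =", round(speedup(senders), 2))
    for stack in ("Kernel", "Onload"):
        print(stack, "scaling from 1 to 2 senders =", round(scaling(stack), 2))
```

On these figures the kernel path gains about 12% from a second sender, while Onload nearly doubles.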



Performance: UDP receive

OpenOnload Open Source

      OpenOnload available as Open Source (GPLv2)
            – Please contact us if you’re interested

      Compatible with x86 (ia32, amd64/EM64T)

      Currently supports SMC10GPCIe-XFP and SMC10GPCIe-10BT NICs
            – Could support other user-accessible network interfaces

      Very interested in user feedback
            – On the technology and project directions


Netmap
        http://info.iet.unipi.it/~luigi/netmap/
●   BSD (and Linux port)
●   Good scalability
●   Libpcap emulation
Netmap
Netmap API
●   Access
    –   open("/dev/netmap")
    –   ioctl(fd, NIOCREGIF, arg)
    –   mmap(..., fd, 0) maps buffers and rings
●   Transmit
    –   fill up to avail buffers, starting from slot cur.
    –   ioctl(fd, NIOCTXSYNC) queues the packets
●   Receive
    –   ioctl(fd, NIOCRXSYNC) reports newly received packets
    –   process up to avail buffers, starting from slot cur.


                       These ioctl()s are non-blocking.
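
The transmit rule ("fill up to avail buffers, starting from slot cur", then sync) can be illustrated with a toy model. This is not the netmap data structure or API: `ToyTxRing`, `put` and `txsync` are hypothetical names, the "NIC" is a Python list, and the model assumes every queued frame is sent instantly.

```python
class ToyTxRing:
    """Toy model of a transmit ring with `cur` and `avail` (not real netmap)."""

    def __init__(self, num_slots: int) -> None:
        self.slots = [b""] * num_slots
        self.cur = 0             # next slot the application may fill
        self.avail = num_slots   # how many slots the application may fill
        self._pending = 0        # filled slots not yet synced
        self.wire = []           # stands in for the NIC: frames "sent"

    def put(self, frames) -> int:
        """Fill up to `avail` slots starting from `cur`; return how many fit."""
        queued = 0
        for frame in frames:
            if self.avail == 0:
                break            # ring full: caller must sync and retry
            self.slots[self.cur] = frame
            self.cur = (self.cur + 1) % len(self.slots)
            self.avail -= 1
            self._pending += 1
            queued += 1
        return queued

    def txsync(self) -> None:
        """Stand-in for the TX sync step: hand filled slots over, reclaim them."""
        start = (self.cur - self._pending) % len(self.slots)
        for i in range(self._pending):
            self.wire.append(self.slots[(start + i) % len(self.slots)])
        self.avail += self._pending   # toy NIC finishes immediately
        self._pending = 0


if __name__ == "__main__":
    ring = ToyTxRing(num_slots=4)
    frames = [b"pkt%d" % i for i in range(6)]
    queued = ring.put(frames)            # only 4 of the 6 frames fit
    ring.txsync()
    queued += ring.put(frames[queued:])  # the remaining 2 fit after the sync
    ring.txsync()
    print(queued, ring.wire)
```

The point of the design is batching: the application touches only shared memory while filling slots, and pays for one system call per batch rather than one per packet.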
Netmap API: synchronization
●   poll() and select(), what else!
    –   POLLIN and POLLOUT decide which sets of rings to
        work on
    –   work as expected, returning when avail>0
    –   interrupt mitigation delays are propagated up to
        the userspace process
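
The blocking behaviour described here is ordinary poll() semantics. The sketch below shows it with a plain pipe standing in for the netmap file descriptor, because a real one needs the netmap kernel module; `wait_readable` is an illustrative helper, not part of netmap.

```python
import os
import select


def wait_readable(fds, timeout_ms):
    """Return the fds that poll() reports as readable within the timeout."""
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    return [fd for fd, events in poller.poll(timeout_ms) if events & select.POLLIN]


if __name__ == "__main__":
    read_end, write_end = os.pipe()      # stand-in for a netmap fd
    print(wait_readable([read_end], 0))  # nothing to read yet
    os.write(write_end, b"packet")
    print(wait_readable([read_end], 100) == [read_end])
    os.close(read_end)
    os.close(write_end)
```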
Netmap: multiqueue
●   Of course.
    –   one netmap ring per physical ring
    –   by default, the fd is bound to all rings
    –   ioctl(fd, NIOCREGIF, arg) can restrict the binding
        to a single ring pair
    –   multiple fd's can be bound to different rings on the same
        card
    –   the fd's can be managed by different threads
    –   threads mapped to cores with pthread_setaffinity_np()
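
The one-thread-per-ring layout with each thread pinned to a core can be sketched as follows. This is a Linux-only illustration that uses Python's `os.sched_setaffinity` in place of the pthread affinity call; `run_pinned_workers` is a hypothetical helper and the workers do no packet I/O.

```python
import os
import threading


def run_pinned_workers(ring_ids):
    """Run one worker thread per ring id, each pinned to a single CPU.

    Returns {ring_id: set of CPUs the worker ended up on}. Linux only.
    """
    allowed = sorted(os.sched_getaffinity(0))
    placement = {}
    lock = threading.Lock()

    def worker(ring_id, cpu):
        # pid 0 means "the calling thread" for the Linux affinity calls.
        os.sched_setaffinity(0, {cpu})
        # A real worker would now poll the fd bound to its ring pair.
        with lock:
            placement[ring_id] = os.sched_getaffinity(0)

    threads = [
        threading.Thread(target=worker, args=(ring_id, allowed[i % len(allowed)]))
        for i, ring_id in enumerate(ring_ids)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return placement


if __name__ == "__main__":
    print(run_pinned_workers([0, 1, 2, 3]))
```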
Netmap and the host stack
●   While in netmap mode, the control path remains unchanged:
    –   ifconfig, ioctl's, etc still work as usual
    –   the OS still believes the interface is there
●   The data path is detached from the host stack:
    –   packets from NIC end up in RX netmap rings
    –   packets from TX netmap rings are sent to the NIC
●   The host stack is attached to an extra pair of netmap rings:
    –   packets from the host go to a SW RX netmap ring
    –   packets from a SW TX netmap ring are sent to the host
    –   these rings are managed using the netmap API
Netmap: Tx performance
Netmap: Rx Performance
Netmap Summary
Packet Forwarding     Mpps
FreeBSD bridging      0.690
Netmap + libpcap      7.500
Netmap                14.88

Open vSwitch          Mpps
userspace             0.065
Linux                 0.600
FreeBSD               0.790
FreeBSD+netmap/pcap   3.050
Intel DPDK Architecture
The Intel® DPDK Philosophy


Intel® DPDK Fundamentals
 • Implements a run to completion model or pipeline model
 • No scheduler - all devices accessed by polling
 • Supports 32-bit and 64-bit with/without NUMA
 • Scales from Intel® Atom™ to Intel® Xeon® processors
 • Number of Cores and Processors not limited
 • Optimal packet allocation across DRAM channels

[Diagram: Control Plane and Data Plane]

 • Must run on any IA CPU
     ‒ From Intel® Atom™ processor to the latest Intel® Xeon® processor family
     ‒ Essential to the IA value proposition
 • Focus on the fast-path
     ‒ Sending large number of packets to the Linux Kernel /GPOS will bog the system down
 • Provide software examples that address common network performance deficits
     ‒ Best practices for software architecture
     ‒ Tips for data structure design and storage
     ‒ Help the compiler generate optimum code
     ‒ Address the challenges of achieving 80 Mpps per CPU Socket
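
The "no scheduler, all devices accessed by polling" and "run to completion" ideas can be sketched as a loop that never blocks and handles each packet from receive to transmit on one core. This is a toy model, not DPDK code: `poll_loop` and its parameters are illustrative, and the queues are plain deques rather than NIC rings.

```python
from collections import deque


def poll_loop(rx_queue, tx_queue, process, burst=32, max_idle_polls=3):
    """Poll rx_queue in bursts and run each packet to completion.

    A real poll-mode loop spins forever on a dedicated core; this toy
    version returns after `max_idle_polls` empty polls so it terminates.
    """
    handled = 0
    idle_polls = 0
    while idle_polls < max_idle_polls:
        # Take up to `burst` packets that are already waiting; never block.
        batch = [rx_queue.popleft() for _ in range(min(burst, len(rx_queue)))]
        if not batch:
            idle_polls += 1
            continue
        idle_polls = 0
        for packet in batch:
            # Run to completion: receive, process and transmit in one place.
            tx_queue.append(process(packet))
            handled += 1
    return handled


if __name__ == "__main__":
    rx = deque(range(100))
    tx = deque()
    print(poll_loop(rx, tx, lambda packet: packet * 2), len(tx))
```

Polling trades an always-busy core for the removal of interrupt and scheduling latency, which is why it suits the small cycle budgets of 64-byte traffic.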




20     Intel Restricted Secret
                                     TRANSFORMING COMMUNICATIONS
Intel® Data Plane Development Kit (Intel® DPDK)
Intel® DPDK embeds optimizations for the IA platform:
-   Data Plane Libraries and Optimized NIC Drivers in Linux User Space
-   Run-time Environment
-   Environment Abstraction Layer and Boot Code
-   BSD-licensed & source downloadable from Intel and leading ecopartners

[Diagram: in user space, customer applications sit on the Intel® DPDK
Libraries (Buffer Management, Queue/Ring Functions, Packet Flow
Classification, NIC Poll Mode Library, Environment Abstraction Layer);
in kernel space, the Environment Abstraction Layer and the Linux Kernel;
below both, the Platform Hardware]

Intel® DPDK Libraries and Drivers

     • Memory Manager: Responsible for allocating pools of objects in memory. A pool is
       created in huge page memory space and uses a ring to store free objects. It also
       provides an alignment helper to ensure that objects are padded to spread them
       equally on all DRAM channels.
     • Buffer Manager: Reduces by a significant amount the time the operating system
       spends allocating and de-allocating buffers. The Intel® DPDK pre-allocates fixed size
       buffers which are stored in memory pools.
     • Queue Manager: Implements safe lockless queues, instead of using spinlocks, that
       allow different software components to process packets, while avoiding unnecessary
       wait times.
     • Flow Classification: Provides an efficient mechanism which incorporates Intel®
       Streaming SIMD Extensions (Intel® SSE) to produce a hash based on tuple
       information so that packets may be placed into flows quickly for processing, thus
       greatly improving throughput.
     • Poll Mode Drivers: The Intel® DPDK includes Poll Mode Drivers for 1 GbE and 10 GbE
       Ethernet* controllers which are designed to work without asynchronous, interrupt-
       based signaling mechanisms, which greatly speeds up the packet pipeline.
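
The pool idea behind the Memory Manager and Buffer Manager (allocate fixed-size buffers once, keep the free ones on a ring) can be sketched in a few lines. This is an illustration only: `BufferPool` is a hypothetical class, the ring is a Python deque, and nothing here uses huge pages, lockless operations or DRAM channel alignment.

```python
from collections import deque


class BufferPool:
    """Fixed-size buffers allocated up front; a ring holds the free ones."""

    def __init__(self, count: int, size: int) -> None:
        # All allocation happens here, not on the packet path.
        self._buffers = [bytearray(size) for _ in range(count)]
        self._free = deque(range(count))

    def alloc(self):
        """Return (index, buffer), or None when the pool is exhausted."""
        if not self._free:
            return None
        index = self._free.popleft()
        return index, self._buffers[index]

    def free(self, index: int) -> None:
        """Put a buffer back on the free ring for reuse."""
        self._free.append(index)

    def free_count(self) -> int:
        return len(self._free)


if __name__ == "__main__":
    pool = BufferPool(count=2, size=2048)
    first = pool.alloc()
    second = pool.alloc()
    print(pool.alloc())        # None: the pool never grows
    pool.free(first[0])
    print(pool.free_count())   # 1
```

Taking and returning a buffer is then a constant-time ring operation with no call into a general-purpose allocator.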




Intel® DPDK Native and Virtualized
     Forwarding Performance




Comparison
             Netmap           DPDK           OpenOnload


License      BSD              BSD            GPL


API          Packet + pcap    Packet + lib   Sockets


Kernel       Yes              Yes            Yes


HW support   Intel, Realtek   Intel          Solarflare


OS           FreeBSD, Linux   Linux          Linux
Issues
●   Out of tree kernel code
    –   Non-standard drivers
●   Resource sharing
    –   CPU
    –   NIC
●   Security
    –   No firewall
    –   DMA isolation
What's needed?
●   Netmap
    –   Linux version (not port)
    –   Higher level protocols?
●   DPDK
    –   Wider device support
    –   Ask Intel
●   OpenOnload
    –   Ask Solarflare
●   OpenOnload
    –   A user-level network stack (Google tech talk)
        ●   Steve Pope
        ●   David Riddoch
●   Netmap - Luigi Rizzo
    –   http://info.iet.unipi.it/~luigi/netmap/talk-atc12.html
●   DPDK
    –   Intel DPDK Overview
    –   Disruptive network IP networking
        ●   Naoto MATSUMOTO
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?Michelle Holley
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelDivye Kapoor
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
 
The Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchThe Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchTe-Yen Liu
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network InterfacesKernel TLV
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux KernelKernel TLV
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecturehugo lu
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack monad bobo
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Kentaro Ebisawa
 
Linux Internals - Kernel/Core
Linux Internals - Kernel/CoreLinux Internals - Kernel/Core
Linux Internals - Kernel/CoreShay Cohen
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingMichelle Holley
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking WalkthroughThomas Graf
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughThomas Graf
 

Was ist angesagt? (20)

Understanding DPDK
Understanding DPDKUnderstanding DPDK
Understanding DPDK
 
What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?What are latest new features that DPDK brings into 2018?
What are latest new features that DPDK brings into 2018?
 
The TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux KernelThe TCP/IP Stack in the Linux Kernel
The TCP/IP Stack in the Linux Kernel
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
The Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitchThe Basic Introduction of Open vSwitch
The Basic Introduction of Open vSwitch
 
Intel dpdk Tutorial
Intel dpdk TutorialIntel dpdk Tutorial
Intel dpdk Tutorial
 
Fun with Network Interfaces
Fun with Network InterfacesFun with Network Interfaces
Fun with Network Interfaces
 
VLANs in the Linux Kernel
VLANs in the Linux KernelVLANs in the Linux Kernel
VLANs in the Linux Kernel
 
DPDK KNI interface
DPDK KNI interfaceDPDK KNI interface
DPDK KNI interface
 
The linux networking architecture
The linux networking architectureThe linux networking architecture
The linux networking architecture
 
Linux Network Stack
Linux Network StackLinux Network Stack
Linux Network Stack
 
introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack introduction to linux kernel tcp/ip ptocotol stack
introduction to linux kernel tcp/ip ptocotol stack
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
 
Dpdk pmd
Dpdk pmdDpdk pmd
Dpdk pmd
 
Linux Internals - Kernel/Core
Linux Internals - Kernel/CoreLinux Internals - Kernel/Core
Linux Internals - Kernel/Core
 
DPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet ProcessingDPDK & Layer 4 Packet Processing
DPDK & Layer 4 Packet Processing
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
DevConf 2014 Kernel Networking Walkthrough
DevConf 2014   Kernel Networking WalkthroughDevConf 2014   Kernel Networking Walkthrough
DevConf 2014 Kernel Networking Walkthrough
 
LinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
 

Andere mochten auch

Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizationsJeff Squyres
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜Ryousei Takano
 
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureDPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureJim St. Leger
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Hajime Tazaki
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioHajime Tazaki
 
Netmap presentation
Netmap presentationNetmap presentation
Netmap presentationAmir Razmjou
 
PASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM AbstractionsPASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM Abstractionsmicchie
 
Cisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPICisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPIJeff Squyres
 
70 лет победы!
70 лет победы!70 лет победы!
70 лет победы!Fintfin
 
Кратко о Rakudo
Кратко о RakudoКратко о Rakudo
Кратко о RakudoAndrew Shitov
 
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics WorkshopLagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics WorkshopLagopus SDN/OpenFlow switch
 
5º Civilización U4º VA: La señora de cao
5º Civilización U4º VA: La señora de cao5º Civilización U4º VA: La señora de cao
5º Civilización U4º VA: La señora de caoebiolibros
 
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...Hirochika Asai
 
X86 hardware for packet processing
X86 hardware for packet processingX86 hardware for packet processing
X86 hardware for packet processingHisaki Ohara
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDKLagopus SDN/OpenFlow switch
 
mSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software SwitchmSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software Switchmicchie
 
Антифрод система. Кратко
Антифрод система. КраткоАнтифрод система. Кратко
Антифрод система. КраткоMarianna Pavlova
 

Andere mochten auch (20)

Ethernet and TCP optimizations
Ethernet and TCP optimizationsEthernet and TCP optimizations
Ethernet and TCP optimizations
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
 
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureDPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
 
Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013Direct Code Execution @ CoNEXT 2013
Direct Code Execution @ CoNEXT 2013
 
NUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osioNUSE (Network Stack in Userspace) at #osio
NUSE (Network Stack in Userspace) at #osio
 
Deep C
Deep CDeep C
Deep C
 
Netmap presentation
Netmap presentationNetmap presentation
Netmap presentation
 
PASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM AbstractionsPASTE: Network Stacks Must Integrate with NVMM Abstractions
PASTE: Network Stacks Must Integrate with NVMM Abstractions
 
Cisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPICisco usNIC: how it works, how it is used in Open MPI
Cisco usNIC: how it works, how it is used in Open MPI
 
70 лет победы!
70 лет победы!70 лет победы!
70 лет победы!
 
Кратко о Rakudo
Кратко о RakudoКратко о Rakudo
Кратко о Rakudo
 
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics WorkshopLagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
Lagopus presentation on 14th Annual ON*VECTOR International Photonics Workshop
 
Java - основы языка
Java - основы языкаJava - основы языка
Java - основы языка
 
5º Civilización U4º VA: La señora de cao
5º Civilización U4º VA: La señora de cao5º Civilización U4º VA: La señora de cao
5º Civilización U4º VA: La señora de cao
 
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...
Poptrie: A Compressed Trie with Population Count for Fast and Scalable Softwa...
 
Java 9 - кратко о новом
Java 9 -  кратко о новомJava 9 -  кратко о новом
Java 9 - кратко о новом
 
X86 hardware for packet processing
X86 hardware for packet processingX86 hardware for packet processing
X86 hardware for packet processing
 
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
DPDK summit 2015: It's kind of fun  to do the impossible with DPDKDPDK summit 2015: It's kind of fun  to do the impossible with DPDK
DPDK summit 2015: It's kind of fun to do the impossible with DPDK
 
mSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software SwitchmSwitch: A Highly-Scalable, Modular Software Switch
mSwitch: A Highly-Scalable, Modular Software Switch
 
Антифрод система. Кратко
Антифрод система. КраткоАнтифрод система. Кратко
Антифрод система. Кратко
 

Ähnlich wie Userspace networking

High perf-networking
High perf-networkingHigh perf-networking
High perf-networkingmtimjones
 
Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Shien-Chun Luo
 
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...PLNOG16: Obsługa 100M pps na platformie PC, Przemysław Frasunek, Paweł Mała...
PLNOG16: Obsługa 100M pps na platformie PC , Przemysław Frasunek, Paweł Mała...PROIDEA
 
Webcast: Reduce latency, improve analytics and maximize asset utilization in ...
Webcast: Reduce latency, improve analytics and maximize asset utilization in ...Webcast: Reduce latency, improve analytics and maximize asset utilization in ...
Webcast: Reduce latency, improve analytics and maximize asset utilization in ...Emulex Corporation
 
100G Networking Berlin.pdf
100G Networking Berlin.pdf100G Networking Berlin.pdf
100G Networking Berlin.pdfJunZhao68
 
Pushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack UpPushing Packets - How do the ML2 Mechanism Drivers Stack Up
Pushing Packets - How do the ML2 Mechanism Drivers Stack UpJames Denton
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)In-Network Acceleration with FPGA (MEMO)
In-Network Acceleration with FPGA (MEMO)Naoto MATSUMOTO
 
Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A SupercomputerIntroduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A SupercomputerFörderverein Technische Fakultät
 
High performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHigh performace network of Cloud Native Taiwan User Group
High performace network of Cloud Native Taiwan User GroupHungWei Chiu
 
OpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosOpenStack and OpenFlow Demos
OpenStack and OpenFlow DemosBrent Salisbury
 
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreAdvanced Networking: The Critical Path for HPC, Cloud, Machine Learning and more
Advanced Networking: The Critical Path for HPC, Cloud, Machine Learning and moreinside-BigData.com
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 
Building efficient 5G NR base stations with Intel® Xeon® Scalable Processors
Building efficient 5G NR base stations with Intel® Xeon® Scalable Processors Building efficient 5G NR base stations with Intel® Xeon® Scalable Processors
Building efficient 5G NR base stations with Intel® Xeon® Scalable Processors Michelle Holley
 
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)
Dataplane networking acceleration with OpenDataplane / Максим Уваров (Linaro)Ontico
 

Userspace networking

  • 1. Networking in Userspace: Living on the edge. Stephen Hemminger, stephen@networkplumber.org
  • 2. Problem Statement [Chart: packets per second (bidirectional), 0 to 20,000,000, against packet size (bytes), 64 to 1504. Source: Intel, DPDK Overview]
  • 3. Server vs Infrastructure
                        Server packets    Network infrastructure
       Packet size      1024 bytes        64 bytes
       Packets/second   1.2 million       14.88 million
       Arrival rate     835 ns            67.2 ns
       2 GHz clock      1670 cycles       135 cycles
       3 GHz clock      2505 cycles       201 cycles
       L3 hit on Intel® Xeon®: ~40 cycles. L3 miss, memory read is (201 cycles at 3 GHz).
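The per-packet budgets on this slide can be reproduced with a short calculation. The sketch below assumes a 10 Gbit/s Ethernet link and 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap); the slide states neither, but both are consistent with the 14.88 million packets/second figure. The function names are illustrative only.

```python
# Assumption: 10 Gbit/s link; the slide does not name the link speed.
LINK_BITS_PER_SECOND = 10e9
# Assumption: 8 bytes preamble/SFD + 12 bytes inter-frame gap per frame.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(frame_bytes):
    """Maximum frame rate on the link for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return LINK_BITS_PER_SECOND / bits_on_wire


def arrival_ns(frame_bytes):
    """Time between back-to-back frames, in nanoseconds."""
    return 1e9 / packets_per_second(frame_bytes)


def cycles_per_packet(frame_bytes, clock_hz):
    """CPU cycles available per frame at line rate."""
    return clock_hz / packets_per_second(frame_bytes)


if __name__ == "__main__":
    for size in (1024, 64):
        print(
            f"{size:5d} bytes: "
            f"{packets_per_second(size) / 1e6:6.2f} Mpps, "
            f"{arrival_ns(size):6.1f} ns, "
            f"{cycles_per_packet(size, 2e9):7.1f} cycles at 2 GHz, "
            f"{cycles_per_packet(size, 3e9):7.1f} cycles at 3 GHz"
        )
```

Under these assumptions a 64-byte frame leaves roughly 134 cycles at 2 GHz and roughly 202 cycles at 3 GHz, so a single L3 miss at the quoted cost consumes about the whole per-packet budget.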
  • 5.
  • 7. Good old sockets Flexible, portable but slow
  • 8. Memory mapped buffers Efficient, but still constrained by architecture
  • 10. The OpenOnload architecture: Network hardware provides a user-safe interface which can route Ethernet packets to an application context based on flow information contained within headers. No new protocols. [Diagram: kernel context and two application contexts; labels: Application, Protocol, Driver, Network Driver, DMA, Network Adaptor]
  • 11. The OpenOnload architecture: Protocol processing can take place both in the application and kernel context for a given flow. Enables persistent / asynchronous processing. Maintains existing network control-plane. [Diagram: kernel context and two application contexts; labels: Application, Protocol, Driver, Network Driver, DMA, Network Adaptor]
  • 12. The OpenOnload architecture: Protocol state is shared between the kernel and application contexts through a protected shared memory communications channel. Enables correct handling of protocol state with high performance. [Diagram: kernel context and two application contexts; labels: Application, Protocol, Driver, Network Driver, DMA, Network Adaptor]
  • 13. Performance metrics
       ● Overhead
         – Networking overheads take CPU time away from your application
       ● Latency
         – Holds your application up when it has nothing else to do
         – H/W + flight time + overhead
       ● Bandwidth
         – Dominates latency when messages are large
         – Limited by: algorithms, buffering and overhead
       ● Scalability
         – Determines how overhead grows as you add cores, memory, threads, sockets etc.
  • 14. Anatomy of kernel-based networking
  • 16. Direct & safe hardware access
  • 17. Some performance results
       Test platform: typical commodity server
       – Intel Clovertown 2.3 GHz quad-core Xeon (x1), 1.3 GHz FSB, 2 GB RAM
       – Intel 5000X chipset
       – Solarflare Solarstorm SFC4000 (B) controller, CX4
       – Back-to-back
       – Red Hat Enterprise Linux 5 (2.6.18-8.el5)
  • 18. Performance: Latency and overhead
       TCP ping-pong with 4 byte payload; 70 byte frame: 14+20+20+12+4
                   ½ round-trip latency (microseconds)   CPU overhead (microseconds)
       Hardware    4.2                                   --
       Kernel      11.2                                  7.0
       Onload      5.3                                   1.1
  • 20. Performance: UDP transmit
       Message rate: 4 byte UDP payload (46 byte frame)
                   Kernel     Onload
       1 sender    473,000    2,030,000
  • 21. Performance: UDP transmit
       Message rate: 4 byte UDP payload (46 byte frame)
                   Kernel     Onload
       1 sender    473,000    2,030,000
       2 senders   532,000    3,880,000
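The message rates above bear on the scalability metric from the earlier slide. As a rough worked example, the sketch below computes how each stack scales from one sender to two and how much faster Onload is than the kernel at each sender count. It uses only the four numbers from the table; the dictionary layout and function names are illustrative.

```python
# Message rates (messages/second) copied from the UDP transmit table.
RATES = {
    "kernel": {1: 473_000, 2: 532_000},
    "onload": {1: 2_030_000, 2: 3_880_000},
}


def scaling(stack):
    """Throughput ratio when going from 1 sender to 2 senders."""
    return RATES[stack][2] / RATES[stack][1]


def speedup(senders):
    """Onload message rate relative to the kernel stack."""
    return RATES["onload"][senders] / RATES["kernel"][senders]


if __name__ == "__main__":
    for stack in RATES:
        print(f"{stack}: x{scaling(stack):.2f} from 1 to 2 senders")
    for senders in (1, 2):
        print(f"{senders} sender(s): Onload is x{speedup(senders):.2f} the kernel rate")
```

On these figures the kernel path gains about 12% from a second sender while Onload nearly doubles, which is the behaviour the scalability metric is meant to capture.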
  • 23. OpenOnload Open Source
       ● OpenOnload available as Open Source (GPLv2)
         – Please contact us if you’re interested
       ● Compatible with x86 (ia32, amd64/EM64T)
       ● Currently supports SMC10GPCIe-XFP and SMC10GPCIe-10BT NICs
         – Could support other user-accessible network interfaces
       ● Very interested in user feedback
         – On the technology and project directions
  • 24. Netmap (http://info.iet.unipi.it/~luigi/netmap/)
       ● BSD (and Linux port)
       ● Good scalability
       ● Libpcap emulation
  • 26. Netmap API
       ● Access
         – open("/dev/netmap")
         – ioctl(fd, NIOCREG, arg)
         – mmap(..., fd, 0) maps buffers and rings
       ● Transmit
         – fill up to avail buffers, starting from slot cur
         – ioctl(fd, NIOCTXSYNC) queues the packets
       ● Receive
         – ioctl(fd, NIOCRXSYNC) reports newly received packets
         – process up to avail buffers, starting from slot cur
       These ioctl()s are non-blocking.
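The cur/avail bookkeeping on this slide can be illustrated with a toy model. The sketch below is not the netmap API and touches no device: ToyTxRing, ToyRxRing and their methods are hypothetical names, txsync() and rxsync() merely stand in for the NIOCTXSYNC and NIOCRXSYNC ioctls, and transmission is assumed to complete instantly.

```python
from collections import deque


class ToyTxRing:
    """Toy transmit ring: fill slots from `cur`, at most `avail` of them."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = [None] * num_slots
        self.cur = 0              # next slot the application may fill
        self.avail = num_slots    # slots the application may still fill
        self.pending = 0          # filled slots not yet handed over
        self.wire = []            # stand-in for packets sent by the NIC

    def put(self, packets):
        """Fill up to `avail` slots starting at `cur`; return how many fit."""
        queued = 0
        for pkt in packets:
            if self.avail == 0:
                break
            self.slots[self.cur] = pkt
            self.cur = (self.cur + 1) % self.num_slots
            self.avail -= 1
            self.pending += 1
            queued += 1
        return queued

    def txsync(self):
        """Stand-in for ioctl(fd, NIOCTXSYNC): queue filled slots, reclaim them."""
        start = (self.cur - self.pending) % self.num_slots
        for i in range(self.pending):
            self.wire.append(self.slots[(start + i) % self.num_slots])
        self.avail += self.pending  # assumption: the NIC finishes at once
        self.pending = 0


class ToyRxRing:
    """Toy receive ring: process up to `avail` slots starting at `cur`."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = [None] * num_slots
        self.cur = 0
        self.avail = 0
        self.incoming = deque()   # stand-in for packets arriving at the NIC

    def rxsync(self):
        """Stand-in for ioctl(fd, NIOCRXSYNC): report newly received packets."""
        while self.incoming and self.avail < self.num_slots:
            idx = (self.cur + self.avail) % self.num_slots
            self.slots[idx] = self.incoming.popleft()
            self.avail += 1

    def take(self):
        """Consume every available packet, advancing `cur`."""
        out = []
        while self.avail > 0:
            out.append(self.slots[self.cur])
            self.cur = (self.cur + 1) % self.num_slots
            self.avail -= 1
        return out
```

Neither sync call blocks in this model: a full transmit ring simply accepts fewer packets, and an empty receive ring returns nothing, which mirrors the non-blocking behaviour the slide describes.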
  • 27. Netmap API: synchronization
       ● poll() and select(), what else!
         – POLLIN and POLLOUT decide which sets of rings to work on
         – work as expected, returning when avail>0
         – interrupt mitigation delays are propagated up to the userspace process
  • 28. Netmap: multiqueue
       ● Of course.
         – one netmap ring per physical ring
         – by default, the fd is bound to all rings
         – ioctl(fd, NIOCREG, arg) can restrict the binding to a single ring pair
         – multiple fds can be bound to different rings on the same card
         – the fds can be managed by different threads
         – threads mapped to cores with pthread_setaffinity_np()
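The one-thread-per-ring arrangement with core pinning can be sketched as follows. This is a rough illustration, not netmap code: it uses Python's os.sched_setaffinity (available on Linux only) as a stand-in for pthread_setaffinity_np(), the ring ids are plain integers rather than bound file descriptors, and ring_worker and run_workers are hypothetical names.

```python
import os
import threading


def allowed_cpus():
    """CPUs this process may run on, or [] where the call is unavailable."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return []


def ring_worker(ring_id, cpu, results):
    """Pin the calling thread to `cpu`, then record where it may run."""
    if cpu is not None:
        # pid 0 means "the calling thread" for the Linux affinity calls.
        os.sched_setaffinity(0, {cpu})
        results[ring_id] = set(os.sched_getaffinity(0))
    else:
        results[ring_id] = None  # assumption: no pinning support here
    # A real worker would now poll and service its own ring pair.


def run_workers(num_rings):
    """Start one worker thread per ring, spreading them over allowed CPUs."""
    cpus = allowed_cpus()
    results = {}
    threads = []
    for ring_id in range(num_rings):
        cpu = cpus[ring_id % len(cpus)] if cpus else None
        thread = threading.Thread(target=ring_worker, args=(ring_id, cpu, results))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    for ring_id, mask in sorted(run_workers(2).items()):
        print(f"ring {ring_id}: pinned to {mask}")
```

Keeping each ring on its own core means no two threads contend for the same ring, which is the point of binding separate file descriptors to separate rings.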
  • 29. Netmap and the host stack
       ● While in netmap mode, the control path remains unchanged:
         – ifconfig, ioctls, etc. still work as usual
         – the OS still believes the interface is there
       ● The data path is detached from the host stack:
         – packets from the NIC end up in RX netmap rings
         – packets from TX netmap rings are sent to the NIC
       ● The host stack is attached to extra netmap rings:
         – packets from the host go to a SW RX netmap ring
         – packets from a SW TX netmap ring are sent to the host
         – these rings are managed using the netmap API
  • 32. Netmap Summary
       Packet forwarding        Mpps
       FreeBSD bridging         0.690
       Netmap + libpcap         7.500
       Netmap                   14.88

       Open vSwitch             Mpps
       userspace                0.065
       Linux                    0.600
       FreeBSD                  0.790
       FreeBSD+netmap/pcap      3.050
  • 34. The Intel® DPDK Philosophy
       Intel® DPDK Fundamentals
       • Implements a run to completion model or pipeline model
       • No scheduler - all devices accessed by polling
       • Supports 32-bit and 64-bit with/without NUMA
       • Scales from Intel® Atom™ to Intel® Xeon® processors
       • Number of cores and processors not limited
       • Optimal packet allocation across DRAM channels
       [Diagram: Control Plane, Data Plane]
       • Must run on any IA CPU
         ‒ From Intel® Atom™ processor to the latest Intel® Xeon® processor family
         ‒ Essential to the IA value proposition
       • Focus on the fast-path
         ‒ Sending large number of packets to the Linux Kernel/GPOS will bog the system down
       • Provide software examples that address common network performance deficits
         ‒ Best practices for software architecture
         ‒ Tips for data structure design and storage
         ‒ Help the compiler generate optimum code
         ‒ Address the challenges of achieving 80 Mpps per CPU socket
       Intel Restricted Secret. TRANSFORMING COMMUNICATIONS
  • 35. Intel® Data Plane Development Kit (Intel® DPDK)
       Intel® DPDK embeds optimizations for the IA platform:
       - Data Plane Libraries and Optimized NIC Drivers in Linux User Space
       - Run-time Environment
       - Environment Abstraction Layer and Boot Code
       - BSD-licensed & source downloadable from Intel and leading ecopartners
       [Diagram: customer applications in user space on top of the Intel® DPDK Libraries (Buffer Management, Queue/Ring Functions, Packet Flow Classification, NIC Poll Mode Library, Environment Abstraction Layer); in kernel space, the Environment Abstraction Layer and Linux Kernel; Platform Hardware below]
  • 36. Intel® DPDK Libraries and Drivers
       • Memory Manager: Responsible for allocating pools of objects in memory. A pool is created in huge page memory space and uses a ring to store free objects. It also provides an alignment helper to ensure that objects are padded to spread them equally on all DRAM channels.
       • Buffer Manager: Reduces by a significant amount the time the operating system spends allocating and de-allocating buffers. The Intel® DPDK pre-allocates fixed size buffers which are stored in memory pools.
       • Queue Manager: Implements safe lockless queues, instead of using spinlocks, that allow different software components to process packets, while avoiding unnecessary wait times.
       • Flow Classification: Provides an efficient mechanism which incorporates Intel® Streaming SIMD Extensions (Intel® SSE) to produce a hash based on tuple information so that packets may be placed into flows quickly for processing, thus greatly improving throughput.
       • Poll Mode Drivers: The Intel® DPDK includes Poll Mode Drivers for 1 GbE and 10 GbE Ethernet controllers which are designed to work without asynchronous, interrupt-based signaling mechanisms, which greatly speeds up the packet pipeline.
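The Memory Manager and Buffer Manager descriptions amount to a pool of fixed-size buffers allocated once up front, with a ring holding the free ones. The sketch below is a minimal single-threaded model of that idea only. It is not the DPDK API, it uses ordinary heap memory rather than huge pages, it does no DRAM-channel alignment, and its queue is not lockless; BufferPool and its methods are hypothetical names.

```python
from collections import deque


class BufferPool:
    """Fixed-size buffers allocated once; a queue tracks which are free."""

    def __init__(self, count, size):
        # All buffers are created up front, so the fast path never allocates.
        self._buffers = [bytearray(size) for _ in range(count)]
        # Stand-in for the ring of free objects: indices of unused buffers.
        self._free = deque(range(count))

    def alloc(self):
        """Return the index of a free buffer, or None if the pool is empty."""
        if not self._free:
            return None
        return self._free.popleft()

    def buffer(self, index):
        """Access the storage behind an allocated index."""
        return self._buffers[index]

    def free(self, index):
        """Return a buffer to the pool for reuse."""
        self._free.append(index)

    def free_count(self):
        return len(self._free)


if __name__ == "__main__":
    pool = BufferPool(count=4, size=2048)
    idx = pool.alloc()
    pool.buffer(idx)[:5] = b"hello"
    print("allocated buffer", idx, "free left:", pool.free_count())
    pool.free(idx)
    print("after free:", pool.free_count())
```

Because allocation and release are just queue operations on pre-built objects, their cost is constant and independent of the operating system's allocator, which is the saving the Buffer Manager bullet refers to.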
  • 37. Intel® DPDK Native and Virtualized Forwarding Performance
  • 38. Comparison
                    Netmap            DPDK            OpenOnload
       License      BSD               BSD             GPL
       API          Packet + pcap     Packet + lib    Sockets
       Kernel       Yes               Yes             Yes
       HW support   Intel, Realtek    Intel           Solarflare
       OS           FreeBSD, Linux    Linux           Linux
  • 39. Issues
       ● Out of tree kernel code
         – Non standard drivers
       ● Resource sharing
         – CPU
         – NIC
       ● Security
         – No firewall
         – DMA isolation
  • 40. What's needed?
       ● Netmap
         – Linux version (not port)
         – Higher level protocols?
       ● DPDK
         – Wider device support
         – Ask Intel
       ● OpenOnload
         – Ask Solarflare
  • 41. OpenOnload – A user-level network stack (Google tech talk)
         ● Steve Pope
         ● David Riddoch
       Netmap - Luigi Rizzo
         – http://info.iet.unipi.it/~luigi/netmap/talk-atc12.html
       DPDK
         – Intel DPDK Overview
         – Disruptive network IP networking
           ● Naoto MASMOTO