2. Today’s Agenda
• Basic Requirement
• DCA (Direct Cache Access)
• Multiqueue (VMDq and RSS)
Thursday, October 11, 2012
3. Basic Requirement for Packet Processing
• 14.88 Mpps (packets per second) for 10GbE with minimum-size frames
• 10G / {8 * (64+8+12)}: 64-byte frame + 8-byte preamble + 12-byte inter-frame gap
• Processing time: 67 nsec per packet
• About 134 cycles on a 2 GHz Xeon
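The arithmetic behind these numbers can be reproduced with a short sketch. The function and parameter names are illustrative only; the constants are the ones on the slide (64-byte frame, 8-byte preamble, 12-byte inter-frame gap, 2 GHz clock).

```python
def packet_budget(link_bps=10e9, frame=64, preamble=8, ifg=12, cpu_hz=2e9):
    """Return (packets/sec, nsec per packet, CPU cycles per packet)."""
    wire_bits = 8 * (frame + preamble + ifg)   # bits one frame occupies on the wire
    pps = link_bps / wire_bits                 # packets per second at line rate
    ns_per_packet = 1e9 / pps                  # time budget for one packet
    cycles_per_packet = cpu_hz / pps           # cycle budget for one packet
    return pps, ns_per_packet, cycles_per_packet


pps, ns, cycles = packet_budget()
print(f"{pps / 1e6:.2f} Mpps, {ns:.1f} ns, {cycles:.0f} cycles")
# prints "14.88 Mpps, 67.2 ns, 134 cycles"
```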
4. Hardware and Software
[Diagram: CPU (four cores), Memory, Cache, and NIC, annotated with the related hardware and software features]
• Multi Core
• DCA
• Interrupt Coalescing
• MSI-X
• Posted Interrupt
• DMA
• Full APIC Virtualization
• RSS
• IOMMU
• TSO
• VMDq
• LRO
• Multi Queue
5. DCA (Direct Cache Access)
• Feature: Put the data directly into the cache
• Reduce memory traffic
• Improve latency
• VT-c ....
• Hard to determine which CPU/chipset/NIC/firmware supports the feature
• First platforms: Xeon 5100, 7300
• Now: Intel Data Direct I/O Technology (DDIO)
• TPH (TLP Processing Hints)
• PCI Express 2.1 Protocol Extensions
• Steering Tags (8 bits)
6. recap: DMA [Device Write]
[Diagram: NIC, PCIe RC, Memory Controller, Memory, and Cache, with arrows numbered ① to ⑥]
• Maintain cache coherency!
• Cache line state: M→E→I→E
① Memory write by NIC
② Snoop system cache
③ If Modified state, writeback by CPU (transitions to Exclusive state)
④ Device write to memory (transitions to Invalid state)
⑤ Interrupt
⑥ Software reads DMA data (transitions to Exclusive state)
7. DCA [Device Write]
[Diagram: NIC, PCIe RC, Memory Controller, Memory, and Cache, with arrows numbered ① to ⑤]
• Cache line state: M→E→M
① Memory write by NIC
② Snoop system cache
③ If Modified state, writeback by CPU (transitions to Exclusive state)
④ Device write to cache (transitions to Modified state)
⑤ Software reads DMA data (keeps Modified state as much as possible)
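The two device-write sequences (plain DMA and DCA) can be compared with a toy trace of the cache-line state. This is only a sketch of the transitions listed on the slides, not a model of a real coherence protocol; the function name and the `dca` flag are hypothetical.

```python
def device_write_trace(initial_state, dca):
    """Return the cache-line states seen during one device write and the read after it."""
    state = initial_state
    trace = [state]
    # Snoop: a Modified line is written back by the CPU and becomes Exclusive.
    if state == "M":
        state = "E"
        trace.append(state)
    if dca:
        # DCA: the device write lands in the cache, so the line becomes Modified.
        state = "M"
        trace.append(state)
        # The software read then hits in the cache; the state does not change.
    else:
        # Plain DMA: the device writes to memory and the cached copy is invalidated.
        state = "I"
        trace.append(state)
        # The software read misses and refills the line from memory as Exclusive.
        state = "E"
        trace.append(state)
    return trace


print("→".join(device_write_trace("M", dca=False)))  # prints "M→E→I→E"
print("→".join(device_write_trace("M", dca=True)))   # prints "M→E→M"
```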
9. Workload for DCA
• A cache line written by DCA may be evicted (written back to memory) before software reads it
• Whether this happens depends on the workload
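A toy model makes the workload dependence concrete. The sketch below assumes one cache line per packet, a small LRU cache, and a consumer that reads each packet a fixed number of packets after it arrives (`backlog`); all names and numbers are hypothetical, and real caches are shared with other traffic.

```python
from collections import OrderedDict


def dca_hit_rate(cache_lines, backlog, packets=1000):
    """Fraction of reads that still find the DCA-written line in the cache."""
    if backlog >= packets:
        raise ValueError("backlog must be smaller than the number of packets")
    cache = OrderedDict()          # insertion order doubles as LRU order here
    hits = 0
    reads = 0
    for seq in range(packets):
        cache[seq] = True          # NIC places packet `seq` into the cache
        if len(cache) > cache_lines:
            cache.popitem(last=False)   # oldest line is evicted (written back)
        ready = seq - backlog      # packet the software gets around to reading now
        if ready >= 0:
            reads += 1
            if ready in cache:
                hits += 1
                del cache[ready]   # consumed; the line is free again
    return hits / reads


print(dca_hit_rate(cache_lines=8, backlog=4))    # prints 1.0
print(dca_hit_rate(cache_lines=8, backlog=16))   # prints 0.0
```

In this model every read hits while the consumer stays fewer than `cache_lines` packets behind the NIC, and every read misses once it falls further behind.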
10. Virtualization for Networking
[Diagram: software stack for two VMs, top to bottom]
• VM (x2): Virtual Driver
• Virtual HW (x2): Virtual I/F
• Virtual Switch: forward Ethernet frame, resource reservation, header inspection
• Physical Driver
• Physical NIC
11. Virtualization for Networking
[Diagram: same software stack for two VMs, with VMDq added]
• VM (x2): Virtual Driver
• Virtual HW (x2): Virtual I/F
• Virtual Switch: forward Ethernet frame, resource reservation, header inspection
• Physical Driver
• Physical NIC
• VMDq: packet sorting, moving data to VM, routing packets to the proper CPU for receive
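The packet-sorting step can be sketched as a lookup from destination MAC address to a per-VM pool. This is a simplified illustration that assumes sorting on the destination MAC only; the table, the default pool, and the function name are hypothetical.

```python
def sort_frames(frames, mac_to_pool, default_pool=0):
    """Group raw Ethernet frames by the pool that owns their destination MAC."""
    pools = {}
    for frame in frames:
        dst_mac = frame[:6]                            # first 6 bytes of an Ethernet frame
        pool = mac_to_pool.get(dst_mac, default_pool)  # unknown MACs go to the default pool
        pools.setdefault(pool, []).append(frame)
    return pools


vm1_mac = bytes.fromhex("020000000001")
vm2_mac = bytes.fromhex("020000000002")
mac_to_pool = {vm1_mac: 1, vm2_mac: 2}

frames = [vm1_mac + b"payload-a", vm2_mac + b"payload-b", vm1_mac + b"payload-c"]
sorted_frames = sort_frames(frames, mac_to_pool)
print({pool: len(items) for pool, items in sorted_frames.items()})
# prints {1: 2, 2: 1}
```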
12. Multiqueue (VMDq, RSS)
• When reading the ixgbe source code, the relationship between VMDq, RSS, DCA and multi queue is not clear (to me)
• VT-c, again...
• Feature names and marketing terms are mixed
• Let's clarify with the datasheet
• Focus only on Intel 82599 (Niantic)
13. Queues in 82599 Non-Virtualization
• 128 Receive Queues
• 128 Transmit Queues
• 16 RSS Queues
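How RSS spreads flows over its 16 queues can be sketched as a Toeplitz hash over the flow tuple followed by an indirection-table lookup. The sketch below rests on several assumptions: the 40-byte key is an arbitrary value chosen for illustration, the 128-entry table size and the use of the low hash bits as the table index are assumed rather than taken from the slides, and the function names are hypothetical.

```python
import socket
import struct

RSS_QUEUES = 16                      # from the slide
RETA_SIZE = 128                      # assumed indirection-table size
RSS_KEY = bytes(range(1, 41))        # arbitrary 40-byte key, for illustration only


def toeplitz_hash(key, data):
    """32-bit Toeplitz hash: XOR a sliding 32-bit key window for every set input bit."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def rss_queue(src_ip, dst_ip, src_port, dst_port, reta):
    """Pick a receive queue for an IPv4 TCP/UDP flow."""
    data = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))
    flow_hash = toeplitz_hash(RSS_KEY, data)
    return reta[flow_hash % len(reta)]   # low hash bits index the table


# Fill the table round-robin so the 16 queues share the hash space evenly.
reta = [i % RSS_QUEUES for i in range(RETA_SIZE)]
queue = rss_queue("10.0.0.1", "10.0.0.2", 12345, 80, reta)
print(queue)
```

Packets of the same flow always hash to the same queue, which keeps a flow on one CPU.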
14. Queues in 82599 Virtualization
[Diagram: 128 Queue Pairs (QP = one RX queue + one TX queue) grouped into pools; the VMDq L2 Sorter/Classifier Switch feeds Pool 0, Pool 1, Pool 2, ... Pool 63, each with 2 QPs, serving VM0, VM1, VM2, ... VM63]
• #Pools * #Queue_Pair = 128
• Without RSS:
• 16 pools x 1 queue
• 32 pools x 1 queue
• 64 pools x 1 queue
• With RSS:
• 32 pools x 4 RSS
• 64 pools x 2 RSS
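The relation #Pools * #Queue_Pair = 128 can be turned into a small helper. The contiguous numbering of queue pairs within a pool is an assumption made for this sketch, not something stated on the slide, and the function names are hypothetical.

```python
TOTAL_QUEUE_PAIRS = 128              # from the slide
VALID_POOL_COUNTS = (16, 32, 64)     # pool counts listed on the slide


def queue_pairs_per_pool(pools):
    """Queue pairs available to each pool for a given number of pools."""
    if pools not in VALID_POOL_COUNTS:
        raise ValueError("unsupported pool count: %r" % (pools,))
    return TOTAL_QUEUE_PAIRS // pools


def queue_index(pool, rss_index, pools):
    """Global queue-pair index, assuming each pool owns a contiguous block."""
    per_pool = queue_pairs_per_pool(pools)
    if not 0 <= pool < pools:
        raise ValueError("pool out of range")
    if not 0 <= rss_index < per_pool:
        raise ValueError("rss_index out of range")
    return pool * per_pool + rss_index


print(queue_pairs_per_pool(64))      # prints 2
print(queue_pairs_per_pool(32))      # prints 4
print(queue_index(63, 1, pools=64))  # prints 127
```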
15. VMDq and RSS
• RSS is not supported in IOV mode (in the case of the 82599)
• RSS is supported in VMDq mode
• NetQueue in VMware ESX