BitVisor Summit 8「3. AQC107 Driver and Changes coming to network API」
- 1. AQC107 Driver and Changes
coming to network API
2019/12/12 @ BitVisor Summit 8
Ake Koomsin
- 2. Agenda
◼ Strange issues during AQC107 driver development
◼ AQC107 driver performance
◼ Changes coming to network API
1Copyright© 2019 IGEL Co., Ltd. All Rights Reserved.
- 3. Aquantia AQC107
◼ Budget 10 Gbps Ethernet Controller
– Mac Mini 2018 10 Gbps model
– Asus XG-C100C
– Etc
◼ Support 10/5/2.5/1 Gbps and 100 Mbps
◼ Support a lot of packet filtering
– L2/L3/L4
– VLAN
– Flexible Header filtering
– Etc
◼ Aquantia was acquired by Marvell in 2019/09
- 4. AQC107 strange issues
◼ Busy bit problem on Mac mini
◼ MAC address and Mac mini
- 5. AQC107 strange issues
◼ Busy bit problem on Mac mini
– To get a MAC address from AQC107, communicate with its
firmware through the mailbox mechanism
• Write a message to a register, commit, and wait for the result
– Mailbox Busy Bit is used to indicate whether data from the
firmware has arrived or not
• We know that the data has arrived when the bit is cleared
– However, the busy bit is always set for AQC107 on Mac mini
• Possibly due to a firmware difference (AQC107 on Mac mini
uses Apple’s firmware)
• A workaround is required
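The slides do not say which workaround was chosen. One possible shape, shown here only as a sketch, is to bound the busy-bit wait and then sanity-check whatever data is read back. The register offset, the bit position, and all function names below are hypothetical and do not come from the AQC107 documentation or the BitVisor driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register-read hook; a real driver would do an MMIO read. */
typedef uint32_t mbox_read_fn (void *dev, uint32_t reg);

#define MBOX_REG_STATUS 0x0u   /* Hypothetical offset */
#define MBOX_BUSY_BIT   0x100u /* Hypothetical bit position */

enum mbox_wait_result {
    MBOX_READY,  /* Busy bit cleared within the poll budget */
    MBOX_TIMEOUT /* Busy bit never cleared; caller must validate the data */
};

/* Poll the busy bit at most max_polls times instead of waiting forever. */
enum mbox_wait_result
mbox_wait (mbox_read_fn *read_reg, void *dev, unsigned int max_polls)
{
    assert (read_reg != NULL);
    for (unsigned int i = 0; i < max_polls; i++) {
        if (!(read_reg (dev, MBOX_REG_STATUS) & MBOX_BUSY_BIT))
            return MBOX_READY;
    }
    return MBOX_TIMEOUT;
}

/* After a timeout, only trust a MAC address that looks plausible:
   reject all-zero, all-ones, and multicast addresses. */
bool
mac_looks_valid (const uint8_t mac[6])
{
    uint8_t all_or = 0, all_and = 0xFF;

    for (int i = 0; i < 6; i++) {
        all_or |= mac[i];
        all_and &= mac[i];
    }
    return all_or != 0 && all_and != 0xFF && !(mac[0] & 1);
}

/* Simulated device whose busy bit is stuck, as observed on Mac mini. */
uint32_t
sim_stuck_busy (void *dev, uint32_t reg)
{
    (void)dev;
    (void)reg;
    return MBOX_BUSY_BIT;
}

/* Simulated device whose busy bit clears after *dev more reads. */
uint32_t
sim_clears_after (void *dev, uint32_t reg)
{
    unsigned int *remaining = dev;

    (void)reg;
    if (*remaining == 0)
        return 0;
    (*remaining)--;
    return MBOX_BUSY_BIT;
}
```

With this shape, a device that behaves normally returns MBOX_READY quickly, while a stuck busy bit costs a bounded number of reads before the driver falls back to validating the data.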
- 6. AQC107 strange issues
◼ MAC address and Mac mini
– Typically, a driver obtains a MAC address from the device
– Interestingly, macOS and the Mac firmware obtain the MAC
address from somewhere else
• See the Mac mini box for the MAC address used by macOS and the
Mac firmware
– To avoid unforeseen problems, it is better to use the
MAC address that is printed on the Mac mini box
– The only way to obtain that MAC address programmatically
is to get it from Mac firmware
• From UEFI Device Path
– That is the reason we introduce uefiutil to allow us to
obtain additional information from the firmware
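As an illustration of the "From UEFI Device Path" step, the sketch below walks a device path that has already been obtained from the firmware and extracts the address from a MAC Address node (Messaging type 0x03, sub-type 0x0B in the UEFI specification). The function name devpath_find_mac is hypothetical, and this is not how uefiutil is necessarily implemented; it only shows the node layout being parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DP_TYPE_MESSAGING 0x03
#define DP_SUBTYPE_MAC    0x0B
#define DP_TYPE_END       0x7F
#define DP_HEADER_SIZE    4  /* Type, SubType, 16-bit little-endian Length */
#define DP_MAC_NODE_SIZE  37 /* 4 header + 32 address + 1 interface type */

/* Walk the device path nodes in path[0..path_len).
   Returns 0 and fills mac[6] if a MAC Address node is found, -1 otherwise. */
int
devpath_find_mac (const uint8_t *path, size_t path_len, uint8_t mac[6])
{
    size_t off = 0;

    assert (path != NULL && mac != NULL);
    while (off + DP_HEADER_SIZE <= path_len) {
        uint8_t type = path[off];
        uint8_t subtype = path[off + 1];
        size_t len = (size_t)path[off + 2] | ((size_t)path[off + 3] << 8);

        if (len < DP_HEADER_SIZE || off + len > path_len)
            return -1; /* Malformed node */
        if (type == DP_TYPE_MESSAGING && subtype == DP_SUBTYPE_MAC &&
            len >= DP_MAC_NODE_SIZE) {
            /* The address field is 32 bytes, zero padded; Ethernet uses 6 */
            memcpy (mac, &path[off + DP_HEADER_SIZE], 6);
            return 0;
        }
        if (type == DP_TYPE_END)
            return -1; /* Stop at the first end node */
        off += len;
    }
    return -1;
}
```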
- 7. AQC107 driver performance
◼ Test environment
                BitVisor machine      Other machine
CPU             Intel i5-4430 @ 3.0 GHz
Memory          4 GB                  8 GB
OS              Debian Live CD 10.2, kernel 4.19.67-2+deb10u1
NIC             Asus XG-C100C
Link speed      10 Gbps, direct connection
Test program    iperf2
- 8. AQC107 driver performance
◼ Baseline: up to and including the change “virtio-net: try to
submit packets in a batch to the device driver”
◼ Result
– TX: ~9.4 Gbps
– RX: ~6.8 Gbps
◼ There is room for RX improvement
[Bar chart: RX (Gbps), axis 0 to 10; bar: Baseline]
- 9. AQC107 driver performance
◼ Enabling Receive Side Coalescing (RSC) interrupt
– Coalesce incoming interrupts so that they do not become
excessive
◼ Result
– TX: ~9.4 Gbps
– RX: from ~6.8 to ~7.4 Gbps
[Bar chart: RX (Gbps), axis 0 to 10; bars: Baseline, Intr RSC]
- 10. AQC107 driver performance
◼ Increase virtio_net queue size from 256 to 512
– Reduce packet drops caused by running out of available descriptors
◼ Result
– TX: ~9.4 Gbps
– RX: from ~7.4 Gbps to ~9.1 Gbps
[Bar chart: RX (Gbps), axis 0 to 10; bars: Baseline, Intr RSC, virtio_net 512]
- 11. Changes coming to network API
◼ Dual MAC addresses
◼ Support for hardware offloading
- 12. Changes coming to network API
◼ Dual MAC addresses
– Unique MAC addresses for lwip and virtio_net
• Inspect incoming packets, and forward them to the appropriate
destination
– Result
• TX: ~9.4 Gbps
• RX: from ~9.1 Gbps to ~9.4 Gbps
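The "inspect and forward" step can be sketched as a classifier on the destination MAC address of each received Ethernet frame. The enum, the function name classify_frame, and the policy of delivering broadcast and multicast frames to both consumers are assumptions made for this illustration; the slides do not describe the actual policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical delivery targets, combined as a bitmask. */
enum rx_target {
    RX_TO_LWIP = 1 << 0,  /* Hypervisor-side lwip stack */
    RX_TO_GUEST = 1 << 1  /* Guest OS through virtio_net */
};

#define ETH_HEADER_SIZE 14 /* 6 destination + 6 source + 2 EtherType */

/* Decide where a received frame goes based on its destination MAC.
   Returns a bitmask of enum rx_target, or 0 to drop the frame. */
unsigned int
classify_frame (const uint8_t *frame, size_t len,
                const uint8_t lwip_mac[6], const uint8_t guest_mac[6])
{
    assert (frame != NULL);
    if (len < ETH_HEADER_SIZE)
        return 0; /* Too short to be an Ethernet frame */
    if (frame[0] & 1)
        return RX_TO_LWIP | RX_TO_GUEST; /* Broadcast or multicast */
    if (memcmp (frame, lwip_mac, 6) == 0)
        return RX_TO_LWIP;
    if (memcmp (frame, guest_mac, 6) == 0)
        return RX_TO_GUEST;
    return 0; /* Unicast for someone else */
}
```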
[Bar chart: RX (Gbps), axis 0 to 10; bars: Baseline, Intr RSC, virtio_net 512, Dual MAC address]
- 13. Changes coming to network API
◼ Support for hardware offloading
– Although we can saturate 10 Gbps TX throughput, it is still
possible to reduce CPU usage caused by:
• Checksum calculation
• Packet segmentation
– Need
• Ability to tell the NIC driver to perform offloading
• Ability to pass buffers to the NIC directly (zero copy)
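To make the checksum cost concrete, here is the standard Internet checksum (RFC 1071) computed in software. Every byte of the data has to be read by the CPU, which is the work that checksum offloading moves to the NIC. The function name inet_checksum is chosen for this sketch and is not part of BitVisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes (RFC 1071).
   The 32-bit accumulator is sufficient for packet-sized buffers. */
uint16_t
inet_checksum (const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert (data != NULL || len == 0);
    /* Add the data as big-endian 16-bit words */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    /* An odd trailing byte is padded with a zero byte */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;
    /* Fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

For the example bytes in RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the folded sum is 0xddf2, so the function returns its complement, 0x220d.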
- 14. Changes coming to network API
◼ Support for hardware offloading
– Current network API is not enough
– Tentative upcoming changes
• Unifying send/receive function signature
• Introducing struct netpkt packet descriptor
• TX zero copy implementation
• Support for hardware offloading features such as TSO and
checksum calculation
- 15. Changes coming to network API
◼ Unifying send/receive function signature
– Current send/receive function signature
/* send() is a part of struct nicfunc */
void (*send) (void *handle,
              uint num_packets,
              void **packets,
              uint *packet_sizes,
              bool print_ok);
typedef void net_recv_callback_t (void *handle,
                                  uint num_packets,
                                  void **packets,
                                  uint *packet_sizes,
                                  void *param,
                                  long *premap);
- 16. Changes coming to network API
◼ Unifying send/receive function signature
– Current send/receive function signature
• The two function signatures differ
– recv() can be used for transmitting data (as in virtio_net)
– Likewise, send() can be used for receiving data
– This becomes troublesome when we want to implement TX zero
copy and hardware offloading
- 17. Changes coming to network API
◼ Unifying send/receive function signature
– Introduce net_io_fill_t and net_io_flush_t
– net_io_fill_t return value
• NET_FILL_OK
• NET_FILL_FULL
typedef int net_io_fill_t (void *handle,
                           void *packet,
                           unsigned int packet_size,
                           bool print_ok,
                           void *opt);
typedef void net_io_flush_t (void *handle);
- 18. Changes coming to network API
◼ Unifying send/receive function signature
– send() becomes send_fill() and send_flush()
– recv() becomes recv_fill() and recv_flush()
– The return value from fill() gives the caller a chance to handle
running out of descriptors
– fill() lets the caller queue as much data as possible before
calling flush()
• Batching with fill() and flush() should reduce the number of MMIO
accesses, which is good for performance
– flush() usually involves MMIO register accesses when the callee is
a NIC driver
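The fill()/flush() pattern can be sketched with a mock NIC. The two typedefs and the names NET_FILL_OK and NET_FILL_FULL follow the slides; their numeric values, the mock_nic structure, and the send_all helper are assumptions made for this example. The mock counts one simulated doorbell (MMIO write) per flush, so the effect of batching is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Names as on the slides; the numeric values are assumed. */
enum { NET_FILL_OK = 0, NET_FILL_FULL = 1 };

typedef int net_io_fill_t (void *handle, void *packet,
                           unsigned int packet_size, bool print_ok,
                           void *opt);
typedef void net_io_flush_t (void *handle);

#define MOCK_RING_SIZE 4

/* Hypothetical NIC with a small TX descriptor ring. */
struct mock_nic {
    void *ring[MOCK_RING_SIZE];
    unsigned int sizes[MOCK_RING_SIZE];
    unsigned int pending;   /* Descriptors filled but not yet flushed */
    unsigned int doorbells; /* Simulated MMIO tail-pointer writes */
    unsigned int sent;      /* Packets handed to the "hardware" */
};

/* Queue one packet; report NET_FILL_FULL when out of descriptors. */
int
mock_send_fill (void *handle, void *packet, unsigned int packet_size,
                bool print_ok, void *opt)
{
    struct mock_nic *nic = handle;

    (void)print_ok;
    (void)opt;
    if (nic->pending == MOCK_RING_SIZE)
        return NET_FILL_FULL;
    nic->ring[nic->pending] = packet;
    nic->sizes[nic->pending] = packet_size;
    nic->pending++;
    return NET_FILL_OK;
}

/* Hand all pending descriptors to the device with a single doorbell. */
void
mock_send_flush (void *handle)
{
    struct mock_nic *nic = handle;

    if (nic->pending == 0)
        return;
    nic->doorbells++;
    nic->sent += nic->pending;
    nic->pending = 0;
}

/* Caller side: fill as much as possible, flush only when the ring is
   full and once at the end. */
void
send_all (net_io_fill_t *fill, net_io_flush_t *flush, void *handle,
          void **packets, unsigned int *sizes, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        if (fill (handle, packets[i], sizes[i], true, NULL) ==
            NET_FILL_FULL) {
            flush (handle);
            /* The mock always has room right after a flush */
            int r = fill (handle, packets[i], sizes[i], true, NULL);
            assert (r == NET_FILL_OK);
            (void)r;
        }
    }
    flush (handle);
}
```

Sending 10 packets through this 4-entry ring takes 3 doorbell writes instead of 10, which is the kind of MMIO reduction the fill()/flush() split is meant to allow.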
- 19. Changes coming to network API
◼ Introducing struct netpkt packet descriptor
– We can change net_io_fill_t signature to
typedef int net_io_fill_t (void *handle,
                           struct netpkt *pkt,
                           bool print_ok);
where struct netpkt is
struct netpkt {
    void *buf;
    void *extra;
    u32 buf_nbytes;
    u32 flags; /* For options like TSO,
                  checksum offloading, etc */
};
- 20. Changes coming to network API
◼ TX zero copy implementation
– We can add struct dmabuf so that we can hand over
buffer physical addresses to the NIC
– We also need a callback to notify the caller that the packet has
been consumed by the NIC
• Requires NIC driver modification
struct netpkt {
    struct dmabuf *dmabuf;
    void *extra;
    void (*callback) (void *handle,
                      struct netpkt *pkt,
                      void *param);
    void *cb_handle, *cb_param;
    u32 flags; /* For options like TSO,
                  checksum offloading, etc */
};
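The completion callback is what makes zero copy safe: the caller must not reuse a buffer until the NIC has consumed it. The sketch below shows that hand-off with a mock TX queue. The netpkt layout follows the slide, with uint32_t standing in for u32 so that it compiles on its own. The slides do not define struct dmabuf, so its fields here are hypothetical, as are mock_txq, mock_tx_submit, mock_tx_complete_all, and release_buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the slides only name the type. */
struct dmabuf {
    uint64_t phys;   /* Physical address handed to the NIC */
    void *virt;      /* Virtual address used by the caller */
    uint32_t nbytes; /* Buffer length in bytes */
};

struct netpkt {
    struct dmabuf *dmabuf;
    void *extra;
    void (*callback) (void *handle, struct netpkt *pkt, void *param);
    void *cb_handle, *cb_param;
    uint32_t flags;
};

#define MOCK_TXQ_SIZE 8

/* Mock driver queue holding packets the "NIC" has not consumed yet. */
struct mock_txq {
    struct netpkt *inflight[MOCK_TXQ_SIZE];
    unsigned int count;
};

/* Queue a packet without copying its data. */
bool
mock_tx_submit (struct mock_txq *q, struct netpkt *pkt)
{
    assert (q != NULL && pkt != NULL);
    if (q->count == MOCK_TXQ_SIZE)
        return false;
    q->inflight[q->count++] = pkt;
    return true;
}

/* Simulate a TX-done interrupt: notify the owner of every packet. */
void
mock_tx_complete_all (struct mock_txq *q)
{
    for (unsigned int i = 0; i < q->count; i++) {
        struct netpkt *pkt = q->inflight[i];

        if (pkt->callback)
            pkt->callback (pkt->cb_handle, pkt, pkt->cb_param);
    }
    q->count = 0;
}

/* Caller side: bookkeeping for buffers lent to the NIC. */
struct buf_owner {
    unsigned int released;
};

/* Completion callback: the buffer may be reused from now on. */
void
release_buffer (void *handle, struct netpkt *pkt, void *param)
{
    struct buf_owner *owner = handle;
    bool *in_use = param;

    (void)pkt;
    *in_use = false;
    owner->released++;
}
```

Between submit and completion the buffer stays marked as in use, so the caller knows it must not be overwritten while the NIC may still be reading it.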
- 21. Changes coming to network API
◼ Support for hardware offloading
– After the API is stable, we can add hardware offloading
support
– Modification should be straightforward
- 22. Summary
◼ AQC107 driver
– We can saturate TX/RX throughput
– Still possible to reduce CPU usage
◼ Changes coming to network API
– Dual MAC addresses
– Support for hardware offloading
• Unifying send/receive function signature
• Packet descriptor
• TX zero copy support
• TSO + Checksum calculation