Jose Saldana, Ignacio Forcen, Julian Fernandez-Navajas, Jose Ruiz-Mas, "Improving Network Efficiency with Simplemux," IEEE CIT 2015, International Conference on Computer and Information Technology, 26-28 October 2015, Liverpool, UK. (http://cse.stfx.ca/~cit2015/)
Presentation of the paper: http://diec.unizar.es/~jsaldana/personal/chicago_CIT2015_in_proc.pdf
Abstract
The large number of small packets currently transported by IP networks results in high overhead, caused by the significant header-to-payload ratio of these packets. In addition, the MAC layer of wireless technologies makes non-optimal use of airtime when packets are small. Small packets are also costly in terms of processing capacity. This paper presents Simplemux, a protocol able to multiplex a number of packets sharing a common network path, thus increasing efficiency when small packets are transported. It can be useful in constrained scenarios where resources are scarce, such as community wireless networks or the IoT. Simplemux can be seen as an alternative to the Layer-2 optimization already available in 802.11 networks. The design of Simplemux is presented, and its efficiency improvement is analyzed. An implementation is used to carry out tests with real traffic, showing significant improvements: 46% of the bandwidth can be saved when compressing voice traffic, and the reduction in terms of packets per second in an Internet trace can be up to 50%. In wireless networks, packet grouping results in significantly improved use of airtime.
Improving Network Efficiency with Simplemux
1. Improving Network Efficiency with Simplemux
Jose Saldana, Ignacio Forcén, Julián Fernández-Navajas, José Ruiz-Mas
University of Zaragoza
IEEE CIT-2015
26 October 2015, Liverpool, UK
jsaldana@unizar.es
This work has been partially
financed by the EU H2020 Wi-5
project, http://www.wi5.eu/
(Grant Agreement no: 644262)
3. 26 Oct 2015
Overhead drawback in packet-switched networks
Small-packet services
Introduction
IPv4/TCP packet, 1500 bytes:
η = 1460/1500 = 97% (IP level); 1460/1542 = 94% (Ethernet level)
IPv4/UDP/RTP VoIP packet with two 10-byte samples:
η = 20/60 = 33% (IP level); 20/102 = 19% (Ethernet level)
IPv4/UDP client-to-server packet of Counter Strike (online game):
η = 61/89 = 68% (IP level); 61/131 = 46% (Ethernet level)
Header sizes: IPv4 header 20 bytes; UDP header 8 bytes; RTP header 12 bytes; Ethernet header 26 bytes; Ethernet FCS 4 bytes; inter-frame gap 12 bytes
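The η values above can be reproduced with a short calculation. This is a sketch using the header sizes listed on the slide; the 20-byte TCP header (no options) is an assumption, consistent with the 1460-byte payload of a 1500-byte packet.

```python
# Header efficiency eta = payload / total size, per the examples above.
IP, UDP, RTP, TCP = 20, 8, 12, 20   # header sizes in bytes (TCP: assumed, no options)
ETH_OVERHEAD = 26 + 4 + 12          # Ethernet header + FCS + inter-frame gap

def efficiency(payload, headers):
    """Return (eta at IP level, eta at Ethernet level)."""
    ip_level = payload / (payload + headers)
    eth_level = payload / (payload + headers + ETH_OVERHEAD)
    return ip_level, eth_level

tcp_eff = efficiency(1460, IP + TCP)        # 1500-byte TCP packet: ~97% / ~94%
voip_eff = efficiency(20, IP + UDP + RTP)   # VoIP, two 10-byte samples: ~33% / ~19%
game_eff = efficiency(61, IP + UDP)         # Counter Strike client packet: ~68% / ~46%
```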
4. 26 Oct 2015
Small packets in the public Internet (trace from CAIDA.org *)
[Chart: packet size histogram, Chicago 2015 — number of packets vs. packet size (0 to 1500 bytes)]
Source: https://data.caida.org/datasets/passive-2015/equinix-chicago/20150219-130000.UTC/equinix-chicago.dirA.20150219-125911.UTC.anon.pcap.gz. Only the first 200,000 packets were used.
44% of the packets are 1440 bytes or more
33% of the packets are 60 bytes or less
Average: 782 bytes
5. 26 Oct 2015
Airtime drawback: medium access is performed before sending data, which adds delays and airtime inefficiency.
[Diagram: CSMA/CA exchange — the sender waits a DIFS and transmits an RTS; the receiver answers with a CTS after a SIFS; the data follows after a SIFS, and the ACK after another SIFS; other nodes set their NAV upon hearing the RTS, the CTS or the Data]
DIFS: DCF Inter Frame Space
SIFS: Short Interframe Space
RTS: Request To Send
CTS: Clear To Send
NAV: Network Allocation Vector
DCF: Distributed Coordination Function
6. 26 Oct 2015
Processing capacity drawback: there is a limit on the packets per second (pps) a device can manage.
E.g., a commercial switch whose capacity is specified in terms of 64-byte packet throughput: with 100-byte packets, the throughput of the switch would in fact be limited to 8.08 Gbps.
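The pps limit translates directly into a packet-size-dependent throughput ceiling. A minimal sketch: the ~10.1 Mpps figure below is back-derived from the slide's own example (8.08 Gbps at 100-byte packets), not a datasheet value.

```python
# A pps-limited device caps throughput in proportion to packet size.
# PPS_LIMIT is an assumption derived from the 8.08 Gbps / 100-byte example.
PPS_LIMIT = 8.08e9 / (100 * 8)   # ~10.1 million packets per second

def max_throughput_bps(packet_size_bytes, pps_limit=PPS_LIMIT):
    """Throughput ceiling imposed by the per-packet processing limit."""
    return pps_limit * packet_size_bytes * 8

print(max_throughput_bps(100) / 1e9)   # 8.08 Gbps, as in the slide's example
print(max_throughput_bps(64) / 1e9)    # ~5.17 Gbps for 64-byte packets
```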
7. 26 Oct 2015
• Simplemux: a tunnel of multiplexed packets
• Generic and lightweight multiplexing protocol
• Different tunneling and multiplexed protocols allowed
• Works with Layer-3 packets
• Submitted to the IETF as draft-saldana-tsvwg-simplemux-03
• Combined with header compression, savings can be huge (shared tunnel overhead)
[Packet format: tunneling header (Protocol = Simplemux), then each multiplexed packet preceded by a Simplemux separator of 1-3 bytes; the multiplexed protocol header can belong to any protocol (Protocol = any)]
8. 26 Oct 2015
• Simplemux may work between a pair of intermediate nodes
• In principle, it is not aimed at multiplexing in the end hosts
• The optimization covers a number of hops
• The intermediate nodes do not have to implement Simplemux; they just route packets

[Diagram: native traffic between the end hosts; Simplemux runs between the ingress and egress nodes, over the common network segment]
10. 26 Oct 2015
Frame grouping in 802.11 networks
New versions of Wi-Fi (from 802.11n) include mechanisms for frame grouping:
A-MPDU and A-MSDU.
Aggregation format is compulsory in 802.11ac.
Source: B. Ginzburg and A. Kesselman, "Performance analysis of A-MPDU and A-MSDU aggregation in IEEE 802.11n," IEEE Sarnoff Symposium, 2007, pp. 1-5.
The maximum number of frames in an A-MPDU is 64. The maximal A-MSDU size is 7935 bytes and thus it may contain at most 5 frames.
[Chart: efficiency (data rate / PHY rate, PHY rate = 130 Mbps) vs. number of grouped 1500-byte frames (0 to 60), for A-MPDU and A-MSDU with UDP and TCP]
[A-MPDU aggregation format: PHY header followed by N subframes; each subframe is an MPDU delimiter (4 bytes: reserved, MPDU length, CRC, signature), a variable-size MPDU, and 0-3 padding bytes]
11. 26 Oct 2015
IETF protocols for packet grouping
TMux [RFC1692] multiplexes a number of TCP segments between
the same pair of machines.
PPPMux [RFC3153] is able to multiplex complete IP packets, using
separators. It requires the use of PPP and L2TP.
Simplemux
[Diagram: TMux places several TCP segments, each preceded by a TMux mini-header, inside one IP packet. PPPMux multiplexes complete IP packets, each preceded by a PPPMux separator, inside a PPP/L2TP/IP tunnel. Simplemux multiplexes complete IP packets inside an IP tunnel, using a first Simplemux header and non-first Simplemux headers as separators]
13. 26 Oct 2015
Scenarios: Low-bandwidth residential access
Collaboration between the router and the network operator may save bandwidth in a limited access network

[Diagram: a home network with a number of users (e.g. an Internet café, or access shared by neighbours) uses an xDSL router with an embedded multiplexer, while individual DSL lines use routers without a multiplexer; traffic traverses the DSLAM and the BRAS in the ISP transport network towards the Internet]
14. 26 Oct 2015
Scenarios: Wireless Community Network
- Optimization can be activated when required
- Substituting 802.11 aggregation in legacy devices (prior to 802.11n)
- Optimization covers a number of hops (in 802.11 it covers only one)
[Diagram: a remote village with public Wi-Fi reaches the Internet through a multi-hop community Wi-Fi network terminating at an xDSL router (without multiplexer) and the Internet gateway; an operator 3G femtocell in the village is backhauled over the same Wi-Fi hops to the operator PoP]
16. 26 Oct 2015
Protocol design
- A separator has to include:
- The length of the next packet (keeping the separator as short as possible)
- The Protocol ID of the next packet

*http://datatracker.ietf.org/doc/draft-saldana-tsvwg-simplemux/
[Packet format: tunneling header (Protocol = Simplemux), followed by the multiplexed packets — IPv4, IPv6 or ROHC — each preceded by a Simplemux header/separator carrying its Protocol ID and length]
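The separator idea above can be sketched in a few lines. This is a simplified illustration only: the real Simplemux separator uses a compact 1-3 byte variable-length encoding defined in the draft, whereas here a fixed 3-byte separator (1-byte Protocol ID, 2-byte length) is assumed for clarity.

```python
import struct

# IANA protocol numbers for the payload types shown on the slide.
PROTO_IPV4, PROTO_IPV6, PROTO_ROHC = 4, 41, 142

def mux(packets):
    """Concatenate (protocol, payload) pairs behind per-packet separators."""
    out = bytearray()
    for proto, payload in packets:
        out += struct.pack("!BH", proto, len(payload))  # separator: ID + length
        out += payload
    return bytes(out)

def demux(blob):
    """Recover the (protocol, payload) pairs from a multiplexed bundle."""
    packets, i = [], 0
    while i < len(blob):
        proto, length = struct.unpack_from("!BH", blob, i)
        i += 3
        packets.append((proto, blob[i:i + length]))
        i += length
    return packets
```

A `demux(mux(...))` round trip recovers the original packets, which is the essential property the separators provide.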
19. 26 Oct 2015
Implementation details
- An implementation has been built combining tunneling, compressing and multiplexing (TCM):
- Header Compression: ROHC (https://rohc-lib.org/)
- Multiplexing: Simplemux (https://github.com/TCM-TF/simplemux)
- Tunneling: IPv4 (raw sockets) or UDP
- An upper bound for the delay can be set
- 2700 lines of code (ROHC not included)
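The delay upper bound mentioned above suggests a send policy along these lines: queue packets and flush when either a packet count or a delay budget is reached. This is a sketch; the trigger values and the class itself are illustrative assumptions, not the implementation's actual logic.

```python
import time

class MuxQueue:
    """Queue packets; flush on a packet-count threshold or a delay upper bound."""

    def __init__(self, max_packets=5, max_delay=0.020):  # 20 ms budget (assumed)
        self.max_packets = max_packets
        self.max_delay = max_delay
        self.queue = []
        self.first_arrival = None

    def push(self, packet, now=None):
        """Queue a packet; return the flushed bundle if a trigger fired, else None."""
        now = time.monotonic() if now is None else now
        if not self.queue:
            self.first_arrival = now
        self.queue.append(packet)
        if (len(self.queue) >= self.max_packets
                or now - self.first_arrival >= self.max_delay):
            bundle, self.queue = self.queue, []
            return bundle
        return None
```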
[Figure: native traffic — five IPv4/UDP/RTP VoIP packets, each with two 10-byte samples; optimized traffic — one IPv4 Simplemux packet carrying the five RTP packets, in network mode (raw IP tunnel, 20-byte tunnel IP header) or in transport mode; the shared tunnel header yields the saving]
20. 26 Oct 2015
Test scenario
- 4 PCs with Intel i3, Debian 6
- Eth switches 100 Mbps
- Wi-Fi at 5 GHz, 9 Mbps
- D-ITG traffic generator
[Testbed: generator → multiplexer → Ethernet switches and an 802.11ac link → demultiplexer → receiver; flows are end-to-end, and are optimized between the multiplexer and the demultiplexer]
21. 26 Oct 2015
Generic Internet trace
- iptables is used to steer small packets (size limit 200 bytes) to the multiplexer
[Scatter plot: native packet sizes (0 to 1500 bytes) vs. time, over a 15 s interval]
22. 26 Oct 2015
- 7,908 packets are grouped into 395 multiplexed ones
[Scatter plot: packet sizes vs. time after the optimization, over the same 15 s interval]
23. 26 Oct 2015
- 50% pps reduction
- Only 5% bandwidth reduction
Generic Internet trace
[Chart: packets-per-second ratio after multiplexing vs. size limit for sending packets to the multiplexer (100 to 1500 bytes)]
24. 26 Oct 2015
RTP VoIP
5 VoIP calls sharing an Ethernet link (RTP with different codecs)
[Chart: bandwidth saving for 5 RTP calls vs. multiplexing period (20 to 180 ms), for the GSM, iLBC 20 ms, iLBC 30 ms and PCM-A/PCM-U codecs]
25. 26 Oct 2015
Packet loss reduction in a saturated 802.11 link
Link: 802.11ac, 5.56 GHz, 9 Mbps
Offered traffic (IP level): 15,000 pps * 60 bytes = 7.2 Mbps
[Chart: packet loss percentage in the saturated 802.11 link (60-byte packets) vs. number of multiplexed packets (native, then 1 to 30), with and without ROHC]
27. 26 Oct 2015
Conclusions
Main idea of Simplemux:
- Join small packets into bigger ones
- Reduce the number of packets
- Amortize the tunnel overhead over a higher number of payloads
- Eventually compress packets (e.g. with header compression)

Traffic optimization may provide:
- Bandwidth savings (and airtime savings in wireless networks) for small-packet flows
- pps reduction, even for generic Internet traffic
- Energy savings
- Flexibility: optimization can be activated when required, avoiding dimensioning the network for the worst case

At what cost?
- CPU
- Packets already waiting in the buffer can be multiplexed with no added delay, but additional buffering delay may be introduced to increase the multiplexing rate
28. Thanks a lot!
Jose Saldana, Ignacio Forcén, Julián Fernández-Navajas, José Ruiz-Mas
University of Zaragoza
IEEE CIT-2015
26 October 2015, Liverpool, UK
jsaldana@unizar.es
30. 26 Oct 2015
Overhead in wired networks (Ethernet)
[Chart: maximum Ethernet throughput (Mbps) vs. frame size (64 to 1464 bytes) on a 1 Gbps link, for TCP payload efficiency and Ethernet payload efficiency]
Source: Small Packet Traffic Performance Optimization for 8255x and 8254x Ethernet Controllers, http://www.intel.com/content/dam/doc/application-note/8255x-8254x-ethernet-controllers-small-packet-traffic-performance-appl-note.pdf
Average: 782 bytes
31. 26 Oct 2015
Overhead in 802.11 networks
[Chart: efficiency (data rate / PHY rate, PHY rate = 54 Mbps) vs. packet size (0 to 2500 bytes), for a single MPDU with UDP and TCP; annotated average packet size: 782 bytes]
Source: model developed by B. Ginzburg and A. Kesselman, "Performance analysis of A-MPDU and A-MSDU aggregation in IEEE 802.11n," IEEE Sarnoff Symposium, 2007, pp. 1-5.
32. 26 Oct 2015
Frame grouping in 802.11 networks
New versions of Wi-Fi (from 802.11n) include mechanisms for frame grouping:
A-MPDU and A-MSDU.
A-MPDU: a number of MPDU delimiters each followed by an MPDU.
A-MSDU: multiple payload frames share not just the same PHY, but also the same
MAC header.
In 802.11ac all the frames have an A-MPDU format, even with a single sub-frame
[A-MSDU aggregation format: PHY header and a single MAC header, followed by N subframes — each with a subframe header (DA: 6 bytes, SA: 6 bytes, length: 2 bytes), an MSDU of 0-2304 bytes and 0-3 padding bytes — and one FCS]
[A-MPDU aggregation format: PHY header followed by N subframes; each subframe is an MPDU delimiter (4 bytes: reserved, MPDU length, CRC, signature), a variable-size MPDU, and 0-3 padding bytes]
33. 26 Oct 2015
Additional scenario: Machine to machine
[Diagram: IP sensors at several sites connect through satellite terminals and a satellite link to a gateway, and then through the Internet to three data centers]
34. 26 Oct 2015
Implementation details
- Processing delay:
- Commodity PC (i3): 0.25 ms (N = 10 packets)
- Low-cost wireless AP (OpenWRT): 3.5 ms (N = 10 packets)
[Figure: native traffic — five IPv4/UDP/RTP VoIP packets, each with a 20-byte IPv4 header, an 8-byte UDP header, a 12-byte RTP header and two 10-byte samples (40 bytes of headers per 20-byte payload). Optimized traffic — one Simplemux packet carrying the five payloads behind ROHC-compressed IP/UDP/RTP headers and 1-3 byte Simplemux separators; the tunnel adds a 20-byte IP header in network mode, plus an 8-byte UDP header in transport mode]
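The saving can be estimated from the header sizes on the slide. A back-of-the-envelope sketch: the 2-byte separator and the 3-byte ROHC compressed header are assumptions (actual values depend on the ROHC state and packet length), so this is illustrative arithmetic, not the paper's measured 46% figure.

```python
# Header sizes from the slide (bytes); separator and ROHC size are assumed.
IP_HDR, UDP_HDR, RTP_HDR, PAYLOAD = 20, 8, 12, 20
SEPARATOR, ROHC_HDR, N = 2, 3, 5

native = N * (IP_HDR + UDP_HDR + RTP_HDR + PAYLOAD)          # 5 full packets
# Network mode: one tunnel IP header shared by all multiplexed packets.
optimized = IP_HDR + N * (SEPARATOR + ROHC_HDR + PAYLOAD)
saving = 1 - optimized / native
print(f"{saving:.0%} saving")   # prints "52% saving" under these assumptions
```

The result is of the same order as the 46% measured in the paper; transport mode would add an 8-byte tunnel UDP header, slightly reducing the saving.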
35. 26 Oct 2015
Delay limits
- Multiplexing must be done carefully, taking the available delay budget into account
- It adds no delay if a number of packets are already waiting in the buffer (as may occur in congested links)
- The additional buffering delay caused by packet grouping must be limited:
- VoIP (150 ms)
- Online games (100 to 1000 ms)
- Remote desktop (200 ms)
- IoT samples (constrained network scenarios, core)
- A draft surveying these limits is available*

* Delay Limits and Multiplexing Policies to be employed with Tunneling Compressing and Multiplexing Traffic Flows, http://datatracker.ietf.org/doc/draft-suznjevic-tsvwg-mtd-tcmtf/