The Internet is an ever-growing network. Network equipment has to be improved to cope with this growth, including the devices used to classify network traffic. Internet service providers and network operators need to apply different QoS policies to specific protocols, so such classification systems are critical. However, classification by port does not provide good results, and more complex techniques are necessary. These classification techniques have to be fast enough to work at line rate. This paper presents a system that unifies the entire process involved in flow classification at high speed. It captures the traffic, builds flows from the received packets and finally classifies them inside a GPU. The whole process is possible at 10 Gbps using commodity hardware. Our results show that the achieved performance is strongly influenced by the number of protocols to find, and that it is limited by the number of network flows. In any case, our system reaches up to 24.4 Gbps using commodity hardware.
Multimedia flow classification at 10 Gbps using acceleration techniques on commodity hardware
1. Multimedia flow classification at 10
Gbps using acceleration techniques on
commodity hardware
Rafael Leira (1), Pedro Gómez (1), Iván González (1,2),
Jorge E. López de Vergara (1,2)
<jorge@naudit.es>
(1) Universidad Autónoma de Madrid, Spain
(2) Naudit High Performance Computing and Networking, Spain
First International Workshop on Quality Monitoring,
SaCoNeT, 17th June 2013, Paris, France
2. Contents
Introduction
Related Work
Architecture
Performance and validation tests
Conclusions and future work
3. Introduction
The Internet size is growing and changing day by
day, with more and more servers, protocols and end
users.
Many businesses need to classify this traffic.
For instance:
» To identify network intrusions
» To provide QoS (Quality of Service).
» To filter out protocols within a subnet
For this purpose, a probe is required that can
support a sustained high throughput.
4. Related Work
Nowadays there are several methods for traffic
classification, commercial probes and different
mechanisms that implement traffic classification.
The most common methods are:
» Classification by port.
» DPI (Deep Packet Inspection) classification.
» Statistical classification.
Commercial probes can handle more than 20
Gbps and offer many optional features. However, those
probes are extremely expensive.
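The difference between the first two methods listed above can be illustrated with a minimal sketch. The port table, the regular expressions and the function names are hypothetical and chosen only for this example; they are not the signatures used by the probe.

```python
import re

# Hypothetical port table and payload signatures, for illustration only.
PORT_TABLE = {53: "DNS", 80: "HTTP", 123: "NTP"}
PAYLOAD_SIGNATURES = {
    "HTTP": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE) \S+ HTTP/1\.[01]"),
    "SIP": re.compile(rb"^(INVITE|REGISTER|OPTIONS|BYE) sip:"),
}

def classify_by_port(dst_port):
    """Port-based classification: a single table lookup."""
    return PORT_TABLE.get(dst_port, "unknown")

def classify_by_payload(payload):
    """DPI-style classification: match the payload against each signature."""
    for protocol, pattern in PAYLOAD_SIGNATURES.items():
        if pattern.search(payload):
            return protocol
    return "unknown"

# HTTP served on a non-standard port: the port lookup misses it, DPI does not.
request = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
by_port = classify_by_port(8081)
by_payload = classify_by_payload(request)
```

The port lookup is cheap but only as good as the assumption that services stay on their registered ports; the payload match costs one pattern evaluation per signature, which is the kind of work that grows with the number of protocols to find.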
5. Architecture
The implemented probe has the following modules:
» Network module: Intel DPDK is used to capture the traffic
at 10 Gbps. It also provides a flexible and scalable
architecture, useful for parallel programming.
» FlowBuilder module: This module groups
packets into flows. It also provides statistics about
every flow in the network, similar to Cisco
NetFlow.
» GPU classification module: This module performs the
flow classification within a GPU. It maximizes
performance by pipelining the analysis of different flows.
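As a rough illustration of what the FlowBuilder module keeps per flow, the sketch below groups packets by the usual 5-tuple and maintains NetFlow-like counters. The key fields, the `FlowStats` record and `add_packet` are assumptions made for this example; the slides do not describe the actual data structure.

```python
from collections import namedtuple
from dataclasses import dataclass

# Assumed flow key: the classic 5-tuple.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port proto")

@dataclass
class FlowStats:
    packet_count: int = 0
    byte_count: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0

def add_packet(table, key, length, timestamp):
    """Account one packet to its flow, creating the flow on first sight."""
    stats = table.get(key)
    if stats is None:
        stats = table[key] = FlowStats(first_seen=timestamp)
    stats.packet_count += 1
    stats.byte_count += length
    stats.last_seen = timestamp
    return stats

table = {}
key = FlowKey("10.0.0.1", "10.0.0.2", 5004, 5004, 17)  # a UDP flow
add_packet(table, key, 200, 0.00)
add_packet(table, key, 180, 0.02)
```

Every packet costs one hash-table lookup, so the table size (the number of concurrent flows) directly affects how fast this stage can run.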
6. Architecture – Network and
FlowBuilder modules
[Diagram: NIC 0 and NIC 1 -> I/O RX 0 -> Worker 0 -> I/O TX 0 -> NIC 1 and NIC 0]
[Diagram: FlowBuilder (Worker 0): process_packet -> (a flow expires) -> export_flow -> GPU.
If there is no expired flow, the function ends.]
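The interaction between process_packet and export_flow can be sketched as follows. The slides only say that a flow is exported when it expires; the idle-timeout rule, the timeout value and the record layout used here are assumptions for illustration.

```python
from collections import OrderedDict

IDLE_TIMEOUT = 30.0  # seconds; hypothetical value, not taken from the slides

def process_packet(flows, key, payload, now, export_flow):
    """Update the flow for this packet, then export any flow that expired."""
    record = flows.get(key)
    if record is None:
        record = flows[key] = {"payload": bytearray(), "last_seen": now}
    record["payload"] += payload
    record["last_seen"] = now
    flows.move_to_end(key)  # keeps the table ordered by last activity

    # The least recently active flow sits at the front of the table.
    while flows:
        oldest_key, oldest = next(iter(flows.items()))
        if now - oldest["last_seen"] < IDLE_TIMEOUT:
            break  # no expired flow: the function ends
        del flows[oldest_key]
        export_flow(oldest_key, bytes(oldest["payload"]))

exported = []
collect = lambda k, p: exported.append((k, p))
flows = OrderedDict()
process_packet(flows, "A", b"abc", 0.0, collect)
process_packet(flows, "B", b"xyz", 10.0, collect)
process_packet(flows, "B", b"!", 31.0, collect)  # flow A has now been idle for 31 s
```

Keeping the table ordered by last activity means the expiry check only ever inspects the front entry, so a packet that triggers no expiry pays a constant extra cost.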
7. Architecture – Flow Builder
[Flowchart: export_flow
1) Get the current active buffer.
2) The flow payload is copied to the buffer.
3) If there is empty space: return.
4) If the buffer is full of flows: GPU classification. The flow buffer is copied into GPU memory and the results are returned.]
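A minimal sketch of this batching logic is shown below. The use of exactly two buffers, the fixed number of payload bytes kept per flow, and the `classify_batch` callback standing in for the GPU step are all assumptions made for the example.

```python
class FlowBatcher:
    """Collects flow payloads into a fixed-size buffer and hands full buffers on."""

    def __init__(self, flows_per_buffer, bytes_per_flow, classify_batch):
        self.flows_per_buffer = flows_per_buffer
        self.bytes_per_flow = bytes_per_flow
        self.classify_batch = classify_batch  # stand-in for the GPU classification
        size = flows_per_buffer * bytes_per_flow
        self.buffers = [bytearray(size), bytearray(size)]  # assumed: two buffers
        self.active = 0  # index of the current active buffer
        self.used = 0    # flows already stored in the active buffer

    def export_flow(self, payload):
        buffer = self.buffers[self.active]
        # Truncate or zero-pad so that every flow occupies the same number of bytes.
        chunk = payload[:self.bytes_per_flow].ljust(self.bytes_per_flow, b"\0")
        start = self.used * self.bytes_per_flow
        buffer[start:start + self.bytes_per_flow] = chunk
        self.used += 1
        if self.used < self.flows_per_buffer:
            return None  # there is empty space: nothing else to do
        # The buffer is full of flows: classify it and switch to the other buffer.
        results = self.classify_batch(bytes(buffer), self.flows_per_buffer)
        self.active = 1 - self.active
        self.used = 0
        return results

def fake_classifier(block, count):
    # Placeholder that just echoes each 4-byte flow slot.
    return [block[i * 4:(i + 1) * 4] for i in range(count)]

batcher = FlowBatcher(2, 4, fake_classifier)
first = batcher.export_flow(b"RTP")
second = batcher.export_flow(b"ICMPxx")
```

In this synchronous sketch the second buffer brings no speed-up; in a real pipeline it would let the CPU keep filling one buffer while the other is being transferred and classified.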
8. Architecture – GPU classification
1) A flow block enters the
module.
2) The flow block is copied into
GPU memory.
3) The GPU transposes the block.
4) The flows are classified in parallel.
5) The results are transposed again.
6) The results are also reduced in order to
minimize data transfers.
7) The flow block is returned to host
memory.
8) Finally, the results can be stored or relayed.
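Steps 3, 4 and 6 can be mimicked on the CPU with the sketch below; step 5 is left out because the slides do not detail the layout of the results. The prefix-matching signatures, the one-byte label per flow and the function names are assumptions. A plausible reason for the transposition, not stated in the slides, is that threads handling neighbouring flows then read neighbouring memory addresses.

```python
def transpose(block, rows, cols):
    """Turn a row-major rows x cols byte block into a cols x rows one."""
    return bytes(block[r * cols + c] for c in range(cols) for r in range(rows))

def classify_block(block, n_flows, flow_len, signatures):
    # Step 3: after transposing, byte i of every flow is stored contiguously.
    columns = transpose(block, n_flows, flow_len)
    labels = []
    for flow in range(n_flows):  # step 4: on a GPU, one thread per flow
        label = 0  # 0 means "unknown"
        for proto_id, prefix in signatures.items():
            if all(columns[i * n_flows + flow] == b for i, b in enumerate(prefix)):
                label = proto_id
                break
        labels.append(label)
    # Step 6: only one byte per flow goes back to the host.
    return bytes(labels)

# Three flows of 4 bytes each, stored one after another (row-major).
flows = b"GET " + b"\x80\x00\x00\x01" + b"POST"
signatures = {1: b"GET", 2: b"POST"}  # hypothetical prefix signatures
labels = classify_block(flows, 3, 4, signatures)
```

The inner loop over `signatures` runs once per flow, which is why the cost of this stage grows with the number of protocols to find.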
9. Performance Tests
Network module
» Using a hardware traffic generator, we captured
14.9 million packets per second without packet loss.
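The slides do not state the packet size used in this test, but the figure is consistent with the theoretical maximum of a 10 Gbps Ethernet link carrying minimum-size frames. The calculation below is a worked check under that assumption, using the standard Ethernet per-frame overhead (preamble plus start-of-frame delimiter, and inter-frame gap).

```python
LINK_RATE_BPS = 10_000_000_000  # 10 Gbps
MIN_FRAME_BYTES = 64            # minimum Ethernet frame, including the FCS
PREAMBLE_SFD_BYTES = 8          # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12       # mandatory idle time between frames

def max_packets_per_second(frame_bytes, link_rate_bps=LINK_RATE_BPS):
    """Upper bound on the frame rate for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_rate_bps / wire_bits

worst_case_pps = max_packets_per_second(MIN_FRAME_BYTES)  # about 14.88 million
```

Each minimum-size frame occupies 84 bytes (672 bits) on the wire, so the link can carry at most about 14.88 million of them per second, which rounds to the 14.9 million reported.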
Network and FlowBuilder integration
» Performance tests show a
high dependency on the
number of concurrent
flows inside the link.
» However, the flow builder
can support a typical
number of concurrent flows
in a 10 Gbps link.
12. Validation Tests
Flow matching rates
» TCP SYN 145 / 576 (25.17%) (without content)
» ICMP 116 / 576 (20.14%)
» NTP 5 / 576 (0.87%)
» RTCP 118 / 576 (20.49%)
» RTP 143 / 576 (24.83%)
» Unknown 49 / 576 (8.51%)
Classification accuracy highly
depends on each protocol
signature.
Binary protocols are a major
problem because it is hard to define
a signature for them.
Depending on the protocol, it can be
impossible to define a signature
without a loss of accuracy.
The trace used was captured at
Alcatel-Lucent premises, within the
scope of the IPNQSIS project.
The unknown flows are minor
protocols that are not important in
multimedia classification. In
fact, they are network-booting
protocols.
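The difficulty with binary protocols noted above can be made concrete with RTP. Its fixed header is 12 bytes long and starts with a 2-bit version field equal to 2, and little else in the first byte is constant. The sketch below uses a deliberately loose check based only on those two facts; it is not the signature used by the probe, and `looks_like_rtp` is a hypothetical name.

```python
def looks_like_rtp(payload):
    """Loose check: at least a 12-byte header and a version field equal to 2."""
    return len(payload) >= 12 and payload[0] >> 6 == 2

# Try every possible first byte, followed by 11 zero bytes of padding.
matches = sum(looks_like_rtp(bytes([first]) + bytes(11)) for first in range(256))
false_positive_rate = matches / 256
```

One in four arbitrary first bytes passes this check, so a usable signature needs additional constraints (for example on the payload type or on consecutive sequence numbers), and every added constraint risks rejecting legitimate traffic.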
13. Conclusions
Network Module: it allows a flexible and parallel architecture, and
it takes advantage of the hardware capabilities in a simple way.
FlowBuilder Module: it is very dependent on the number of flows
inside a link. It can support up to 10 Gbps without packet losses.
GPU classification Module: it reduces CPU consumption. It depends
heavily on the quality of the signatures, which is a problem for binary
protocols such as RTP.
In summary, the system can process and classify a 10 Gbps data
rate in normal traffic conditions.
Validating the system has been difficult, because it has to be fed traffic at this speed.
Future work
» GPU performance has to be improved in order to minimize the impact of
multiple signatures.
» The FlowBuilder module also has to be improved in order to support
higher performance (> 20 Gbps).