Lessons we learned while building a real-time network traffic analyzer in C/C++
Alex Moskvin
CEO/CTO @ Plexteq
About myself
• CEO/CTO, Plexteq OÜ
• Ph.D. in information technology
• Interests
  • Software architecture
  • High-load systems
  • Everything under the hood
  • AI/ML + Big Data
  • Knowledge sharing ;)
• Follow me
  • https://twitter.com/amoskvin
  • https://www.facebook.com/moskvin.aleksey
Plexteq
• High-load backends
• Complex distributed data processing pipelines
• Big Data / BI
• We have our own custom products (hardware + software solutions)
We are hiring! ;)
Agenda
1. What the project was about
2. How we decided to solve it
3. Challenges we faced
4. Lessons we learned
Disclaimer ;)
This talk is based on personal experience.
Use at your own risk.
Task definition
• The network services provider needs to:
  • Analyse past threats/interactions
  • Get a real-time indication of network spikes
  • Aggregate metadata from hundreds of systems
• The solution should be:
  • Fast and resource-efficient (no CPU/RAM hogging)
  • Potentially cross-platform
  • Easy to integrate with ETL and BI systems
• Regular bandwidth: 100-1000 Mbps
Data model
2 dimensions:
• Per port
  • Time period
  • Source IP
  • Destination port
  • Protocol type
  • In bytes
  • Out bytes
  • In packets
  • Out packets
• Per protocol type (TCP/UDP/… traffic in bytes)
  • Time period
  • Protocol type
  • In bytes
  • Out bytes
  • In packets
  • Out packets
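The two dimensions above could be represented in C roughly as follows. This is a minimal sketch, not the layout used in the talk: the type names (traffic_counters, port_stat, protocol_stat), the field widths, and the choice of IPv4 addresses in host byte order are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Counters shared by both dimensions. */
typedef struct {
    uint64_t in_bytes;
    uint64_t out_bytes;
    uint64_t in_packets;
    uint64_t out_packets;
} traffic_counters;

/* "Per port" row: one per (period, source IP, destination port, protocol). */
typedef struct {
    time_t   period_start;  /* start of the aggregation period */
    uint32_t src_ip;        /* assumption: IPv4, host byte order */
    uint16_t dst_port;
    uint8_t  protocol;      /* IP protocol number, e.g. 6 = TCP, 17 = UDP */
    traffic_counters counters;
} port_stat;

/* "Per protocol type" row: one per (period, protocol). */
typedef struct {
    time_t  period_start;
    uint8_t protocol;
    traffic_counters counters;
} protocol_stat;

typedef enum { DIR_IN, DIR_OUT } direction;

/* Adds one packet of `len` bytes to a set of counters. */
static void counters_add(traffic_counters *c, direction dir, uint32_t len)
{
    assert(c != NULL);
    if (dir == DIR_IN) {
        c->in_bytes += len;
        c->in_packets += 1;
    } else {
        c->out_bytes += len;
        c->out_packets += 1;
    }
}
```

Each captured packet would update one port_stat row and one protocol_stat row for the current period via counters_add.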
High level architecture
Existing solutions
• tcpdump
• wireshark
• iptables
Existing solutions
$ for i in 1 2 3; do
some tcpdump exercise
done
Existing solutions
$ tcpdump -i eth0
$ tcpdump tcp port 443
$ tcpdump tcp 'port 443 or port 80'
$ tcpdump tcp 'port 443 or port 80' -w out-file
Existing solutions
• Drawbacks
  • tcpdump / wireshark
    • Single threaded
    • Large disk space overhead (unless tweaked, they write full packet contents)
    • No way to write a custom data format (the .pcap file needs extra parsing)
  • iptables
    • Could work, but would be hard to customize for further feature requests
    • Not cross-platform
Existing solutions
We want our own bicycle ;)
Main functions
Okay, so we want to capture traffic from the kernel.
How should we do it?
Traffic capturing
• Raw sockets
• PF_RING
• 3rd party libraries
  • libtins
  • PcapPlusPlus
  • libpcap
Traffic capturing :: Raw sockets
Drawbacks:
• Kernel-to-userspace copies
• The developer needs to be proficient with packet structure and low-level networking semantics, e.g. endianness
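To illustrate the endianness point: multi-byte header fields arrive in network (big-endian) byte order and have to be converted before use. The sketch below parses the fixed part of an IPv4 header from a raw byte buffer. The ipv4_info type and parse_ipv4 function are hypothetical names for this example; ntohs and ntohl are the standard POSIX conversion functions.

```c
#include <arpa/inet.h>  /* ntohs, ntohl (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  protocol;   /* 6 = TCP, 17 = UDP */
    uint16_t total_len;  /* host byte order */
    uint32_t src_ip;     /* host byte order */
    uint32_t dst_ip;     /* host byte order */
} ipv4_info;

/* Parses the fixed 20-byte part of an IPv4 header.
 * Returns 0 on success, -1 if the buffer is too short or malformed. */
static int parse_ipv4(const uint8_t *buf, size_t len, ipv4_info *out)
{
    uint16_t be16;
    uint32_t be32;

    assert(out != NULL);
    if (buf == NULL || len < 20)
        return -1;
    if ((buf[0] >> 4) != 4)       /* version field must be 4 */
        return -1;
    if ((buf[0] & 0x0F) < 5)      /* header length in 32-bit words, minimum 5 */
        return -1;

    /* memcpy avoids unaligned reads; ntohs/ntohl convert from network order. */
    memcpy(&be16, buf + 2, sizeof be16);
    out->total_len = ntohs(be16);
    out->protocol = buf[9];
    memcpy(&be32, buf + 12, sizeof be32);
    out->src_ip = ntohl(be32);
    memcpy(&be32, buf + 16, sizeof be32);
    out->dst_ip = ntohl(be32);
    return 0;
}
```

Forgetting the ntohs/ntohl step gives byte-swapped lengths and addresses on little-endian machines such as x86, which is the kind of mistake this slide warns about.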
Traffic capturing :: PF_RING
PF_RING: kernel bypass
Motivation:
• The kernel network stack is very slow
• A vanilla kernel can handle 1-2 Mpps
• PF_RING can do 15+ Mpps on commodity hardware
Pros:
• Handles huge workloads
• Could be used for network server application development
• Zero-copy technique
Cons:
• Complicated API
• Support at the network card driver level is preferred
• PF_RING ZC API is complex
• Not cross-platform
Traffic capturing :: 3rd party libs
Pros:
• Cross platform
• May utilize low-level, OS-dependent optimizations and extensions, e.g. PF_RING
Traffic capturing :: winner
libpcap
• Cross-platform
• Supports PF_RING
• The fastest implementation
• Well maintained
• Relatively easy API
Traffic capturing :: libpcap
Solutions to store data
We wanted something that:
• Has a small footprint and is fast
• Is preferably a single-file database
• Embeddable
• Supports SQL
• Supports B-tree indices
Drawbacks (SQLite):
• Single threaded: we need to synchronize/serialize write operations to it in our application
SQLite :: code examples
We have the core toolchain now!
Let's glue it together
Packet processing flow
Producer-consumer problem
• Issues:
  • The aggregator can't keep up with traffic above 25 Mbps
  • The delay between incoming traffic and flushed stats keeps growing
This is actually a producer-consumer type of problem.
We need to handle packets in multiple threads.
• Solution:
  • The producer runs in a separate thread
  • Multiple consumers run in separate threads
• Possible implementations:
  • Message broker
  • Blocking queue
We need a blocking queue for this purpose.
A very good implementation: APR (Apache Portable Runtime), used by the Apache web server
http://apr.apache.org/docs/apr-util/1.3/apr__queue_8h.html
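For readers unfamiliar with the idea, here is a minimal sketch of a bounded blocking queue built on POSIX threads. It is not APR's implementation and does not use the APR API; the names blocking_queue, queue_init, queue_push, queue_pop and the QUEUE_CAPACITY value are assumptions for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 1024  /* hypothetical size, tune to the workload */

typedef struct {
    void *items[QUEUE_CAPACITY];
    size_t head, tail, count;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
} blocking_queue;

static void queue_init(blocking_queue *q)
{
    assert(q != NULL);
    q->head = q->tail = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Producer side: blocks while the queue is full. */
static void queue_push(blocking_queue *q, void *item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAPACITY)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[q->tail] = item;
    q->tail = (q->tail + 1) % QUEUE_CAPACITY;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: blocks while the queue is empty. */
static void *queue_pop(blocking_queue *q)
{
    void *item;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    item = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

In the setup described here, the capture thread would call queue_push for every packet, and each consumer thread would loop on queue_pop and aggregate stats.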
Packet processing flow
Memory allocation
• Issues:
  • The application can handle only about 82 Mbps of traffic
  • Our app uses 100+% CPU (eaten by malloc calls)
  • The business logic needs at least one malloc each time packet stats are aggregated into the in-memory data structure
Malloc issue
Solution:
• Use memory pooling
  • Pre-allocate a block with malloc
  • Serve allocations from within the block (an allocation within a block eventually comes down to pointer arithmetic)
Drawbacks:
• Can't free an individual allocation within a block
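The pooling idea can be sketched as a simple bump allocator. This is an illustrative example only, not the APR or mpool API; mem_pool, pool_init, pool_alloc, pool_reset, pool_destroy and the 16-byte POOL_ALIGN are assumed names and values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_ALIGN 16u  /* assumption: sufficient alignment on common 64-bit ABIs */

typedef struct {
    unsigned char *base;  /* block obtained from a single malloc */
    size_t size;          /* total block size in bytes */
    size_t used;          /* bytes handed out so far */
} mem_pool;

/* Pre-allocates the block. Returns 0 on success, -1 on failure. */
static int pool_init(mem_pool *p, size_t size)
{
    assert(p != NULL);
    p->base = malloc(size);
    p->size = (p->base != NULL) ? size : 0;
    p->used = 0;
    return (p->base != NULL) ? 0 : -1;
}

/* Allocation is pointer arithmetic: round up, bump the offset.
 * Returns NULL when the block is exhausted. */
static void *pool_alloc(mem_pool *p, size_t n)
{
    size_t rounded = (n + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);
    void *ptr;

    if (rounded < n || rounded > p->size - p->used)
        return NULL;
    ptr = p->base + p->used;
    p->used += rounded;
    return ptr;
}

/* Releases every allocation at once; individual frees are not possible. */
static void pool_reset(mem_pool *p)
{
    p->used = 0;
}

static void pool_destroy(mem_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = p->used = 0;
}
```

The design trades per-object free for speed: the hot path never calls malloc, and memory is reclaimed in bulk with pool_reset, for example after stats are flushed.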
Packet processing flow
Some memory pool implementations
• APR (https://apr.apache.org/docs/apr/1.6/group__apr__pools.html)
• mpool (https://github.com/silentbicycle/mpool)
Packet processing flow
Mutexes
• Results:
  • Linux:
    • The application can handle ~1 Gbps of traffic
    • CPU usage is 10-15% on a 4-core Xeon 2.8 GHz
  • FreeBSD/OSX:
    • The application can handle ~615 Mbps of traffic
    • CPU usage is 35% on a 4-core Xeon 2.8 GHz
• Possible reason:
  • The profiler shows a high number of thread synchronization calls from our app (pthread_mutex_lock, pthread_mutex_unlock)
Mutexes
• Investigation:
  • On Linux, pthread_mutex_* is implemented using futexes (fast user-space mutexes): an uncontended lock/unlock stays in user space, with no system call and no context switch
  • POSIX is a standard; it doesn't mandate a specific implementation
  • OSX/FreeBSD use a heavier approach with …
• Thread synhronization approaches:
• Lock based
• Semaphore
• Mutex
• Lock free
• Futex (could lock in an edge case)
• Spin lock
• CAS based spin lock
Mutexes
• Our target critical section:
  • No I/O operations
  • Just pointer operations, arithmetic operations and allocations on the memory pool
• Options
  • Spin lock from the OS
    • pthread_spin_lock
  • Custom spin lock based on CAS operations
    • GCC atomic built-ins
      • __sync_lock_test_and_set
      • __sync_lock_release
Mutexes
1) volatile tells the compiler that "lock" may be changed by other threads, so it is re-read on every access
2) __sync_lock_test_and_set and __sync_lock_release are GCC atomic built-ins that guarantee atomic memory access
3) __sync_lock_test_and_set atomically writes 1 and returns the previous value (0 if the lock was free)
4) If the previous value was 1, we keep looping until another thread calls __sync_lock_release
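A minimal sketch of such a spin lock, assuming GCC or Clang (both provide the __sync built-ins). The names cas_spinlock, spin_trylock, spin_lock and spin_unlock are hypothetical. Strictly speaking, __sync_lock_test_and_set is an atomic exchange rather than a full compare-and-swap, but it is sufficient for this purpose.

```c
#include <assert.h>
#include <stddef.h>

typedef volatile int cas_spinlock;  /* 0 = free, 1 = held */

/* Single attempt: returns 1 if the lock was acquired, 0 otherwise. */
static int spin_trylock(cas_spinlock *lock)
{
    assert(lock != NULL);
    /* Atomically writes 1 and returns the previous value;
     * a previous value of 0 means the lock was free and is now ours. */
    return __sync_lock_test_and_set(lock, 1) == 0;
}

/* Busy-waits until the lock is acquired. */
static void spin_lock(cas_spinlock *lock)
{
    while (!spin_trylock(lock)) {
        /* spin until the holder calls spin_unlock */
    }
}

/* Atomically writes 0 with release semantics. */
static void spin_unlock(cas_spinlock *lock)
{
    __sync_lock_release(lock);
}
```

Busy-waiting only pays off when the critical section is very short, as in the case described here (pointer operations, arithmetic and pool allocations, no I/O). For longer sections a blocking mutex is usually the better choice.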
Mutexes
• Results:
  • Linux:
    • The application can handle ~1 Gbps of traffic
    • CPU usage is 10-15% on a 4-core Xeon 2.8 GHz
  • FreeBSD/OSX:
    • The application can handle ~1 Gbps of traffic
    • CPU usage is 8-12% on a 4-core Xeon 2.8 GHz
• Possible reason for the earlier gap:
  • The profiler showed a high number of thread synchronization calls from our app (pthread_mutex_lock, pthread_mutex_unlock)
Questions?


Editor's notes

  1. Knowledge - NDA
  2. Packets will only be delivered to the PF_RING client, not to the kernel network stack. Since the kernel is the slow part, this ensures the fastest operation
  3. If the database doesn't exist, it is created
  4. How? libpcap is single-threaded
  5. How to join producers with consumers?
  6. What is a blocking queue?
  7. Widely used in server applications: 1 request = 1 pool
  8. Any suggestions?