P2P Network Traffic Classification
1. PEER TO PEER NETWORK TRAFFIC CLASSIFICATION
LEKSHMI M NAIR
( AM.EN.P2CSE13011)
S4 M.TECH CSE
MAJOR PROJECT
GUIDED BY : Dr. G P SAJEEV
July 2, 2015
LEKSHMI M NAIR ( AM.EN.P2CSE13011) S4 M.TECH CSE MAJOR PROJECT (GUIDED BY : Dr. G P SAJEEV)P2P TRAFFIC CLASSIFICATION July 2, 2015 1 / 53
2. OUTLINE
1 Introduction to P2P networking
2 P2P network traffic
3 Need for P2P traffic classification
4 Existing classification schemes
5 System design
6 Implementation details
7 Results
8 References
3. INTRODUCTION TO 'PEER TO PEER' (P2P) NETWORKING
P2P NETWORK SYSTEM
Peer-to-peer (P2P) is a decentralized communications model in which each party has the same capabilities and either party can initiate a communication session, unlike in the client/server model.
Figure: P2P Network
4. P2P NETWORK TRAFFIC
P2P traffic is the traffic generated by P2P applications such as BitTorrent, Skype, Napster, and Gnutella.
P2P applications are generally used to transfer large amounts of data, so they can slow down an Internet connection.
Figure: P2P Applications
5. NEED FOR P2P TRAFFIC CLASSIFICATION
Network design and provisioning / traffic engineering.
Optimizing and controlling network utilization to address QoS assignment and traffic shaping.
Accounting / content-based charging.
Security monitoring.
Network forensics.
Figure: Traffic Classification
7. EXISTING CLASSIFICATION SCHEMES
Some of the existing P2P traffic classification techniques are:
Port-based classification
Signature-based classification
Flow-based classification
Statistics-based classification
Hybrid method
8. A BRIEF COMPARISON OF EXISTING TECHNIQUES
Name | Method | Merits | Demerits | Remarks
Port-based | Classification based on port number. | Simple and fast. | Inefficient due to random port allocation. | Accuracy is much lower.
Signature-based | Based on recognition of specific packet payloads. | Reduces false positives and false negatives. | High computational complexity, since each packet needs to be analyzed. | Inefficient on encrypted payloads.
Flow-based | Based on behavioral patterns. | Speed. | Cannot always classify traffic to its specific application. | Speeds up traffic classification, but cannot classify all traffic.
9. A BRIEF COMPARISON OF EXISTING TECHNIQUES (Contd.)
Name | Method | Merits | Demerits | Remarks
Statistics-based | By means of statistical features such as packet size, packet inter-arrival time, and flow duration. | More uniqueness. | As the number of features increases, mapping becomes difficult. | Inefficient as the number of features increases.
Hybrid method | By combining any of the above methods. | More accurate. | Only a 2-class classifier has been implemented to date. | Scope for UDP needs to be determined.
Table: Survey on P2P classification techniques.
10. PROJECT THEME
The performance of existing P2P traffic classification schemes is poor. Also, there is no classification scheme that separates P2P traffic into malicious P2P and non-malicious P2P.
PROBLEM DEFINITION
The problem of classifying P2P traffic into malicious and non-malicious has not been addressed so far.
11. DEFINITION OF MALICIOUS ACTIVITIES
1 Poisoning
2 Polluting
3 Insertion of viruses
4 Malware
5 Denial of Service
6 Spam
7 Password Stealing
8 Advertising
12. IDENTIFYING P2P TRAFFIC
P2P traffic is bi-directional in nature.
E.g., BitTorrent: seeders and leechers.
The notion of a communication is better suited to P2P.
Who is talking to whom?
Both header and payload information are considered for traffic classification.
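The "who is talking to whom" idea can be illustrated with a minimal sketch. The packet fields (`src`, `dst`, `time`, `size`) and the function name `build_communications` are hypothetical, chosen only for illustration; the deck's own communication creation algorithm is not shown in the text, so this is one plausible reading, not the author's method. It groups packets by the unordered pair of endpoint addresses so that both directions fall into the same communication.

```python
from collections import defaultdict

def build_communications(packets):
    """Group packets into communications keyed by the unordered pair of endpoint IPs."""
    communications = defaultdict(list)
    for pkt in packets:
        # A frozenset makes A->B and B->A map to the same key (bi-directional traffic).
        key = frozenset((pkt["src"], pkt["dst"]))
        communications[key].append(pkt)
    return dict(communications)

# Hypothetical packet records, for illustration only.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "time": 0.0, "size": 120},
    {"src": "10.0.0.2", "dst": "10.0.0.1", "time": 0.4, "size": 1500},
    {"src": "10.0.0.1", "dst": "10.0.0.3", "time": 1.0, "size": 60},
]
comms = build_communications(packets)
```

Here the first two packets land in one communication even though they travel in opposite directions, which is the property a flow keyed on an ordered 5-tuple would not give.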
13. SYSTEM DESIGN
Figure: Network Traffic Classifier
14. MODULES
1. Filtering.
2. Communication Creation Module.
3. Automatic Signature Generation Module.
4. Aggregation Module.
5. Classification Module.
15. PACKET FILTERING MODULE
16. PACKET FILTERING ALGORITHM
17. COMMUNICATION CREATION ALGORITHM
18. COMMUNICATION CREATION MODULE
Figure: Communication Creation Module
19. Classification Criterion
Features | Malicious | Non-Malicious
Volume | Low | High
Inter-arrival time | Large | Small
Traffic | Automated/Scripted commands | User-bursty traffic
Table: Malicious vs Non-Malicious Features
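As a rough illustration of the first two features in the table, the sketch below computes the volume and the mean inter-arrival time of one communication. The record layout (`time` in seconds, `size` in bytes) and the function name `communication_features` are assumptions made for this example; the deck does not specify how the features are measured.

```python
def communication_features(packets):
    """Return total volume (bytes) and mean inter-arrival time (seconds)."""
    times = sorted(p["time"] for p in packets)
    volume = sum(p["size"] for p in packets)
    # Gaps between consecutive packet timestamps.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_iat = sum(gaps) / len(gaps) if gaps else 0.0
    return {"volume": volume, "mean_iat": mean_iat}

# Hypothetical packets of a single communication.
sample = [
    {"time": 0.0, "size": 100},
    {"time": 2.0, "size": 300},
    {"time": 6.0, "size": 200},
]
features = communication_features(sample)
```

Under the criterion in the table, a low `volume` together with a large `mean_iat` would point towards the malicious class; the actual decision boundary is learned by the classifier rather than fixed by hand.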
20. AUTO-SIGN MODULE
Figure: Automatic Signature Generation Module
21. LCS (Example)
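The worked example on these slides was a sequence of figures that did not survive extraction. As a stand-in, here is a minimal sketch of the textbook longest common subsequence (LCS) computation by dynamic programming, applied to two made-up request lines. It is the plain LCS, not necessarily the exact variant used in the project.

```python
def lcs(a, b):
    """Return one longest common subsequence of strings a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

# Two hypothetical payload fragments that differ in one character.
common = lcs("GET /a HTTP/1.1", "GET /b HTTP/1.1")
```

The characters that differ between the two inputs drop out, leaving the shared skeleton that a signature can be built from.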
38. LASER ALGORITHM
The signature refinement process can be expressed as follows:
Candidate_Sign_1 = Sign(Flow_1, Flow_2)
Candidate_Sign_2 = Sign(Flow_3, Candidate_Sign_1)
...
Candidate_Sign_n = Sign(Flow_(n+1), Candidate_Sign_(n-1))
If Candidate_Sign_n = Candidate_Sign_(n-1) for a set number of iterations, then Candidate_Sign_n is the final signature.
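The refinement loop can be sketched as follows. This is an illustration under stated assumptions, not the LASER implementation: `sign` is a stand-in built on the standard library's `difflib.SequenceMatcher` (it concatenates the matching blocks of its two inputs), `stable_rounds` is a hypothetical name for the iteration count, and "unchanged" is interpreted as unchanged over consecutive iterations.

```python
from difflib import SequenceMatcher

def sign(x, y):
    """Stand-in for Sign(): concatenate the matching blocks shared by x and y."""
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    return "".join(x[m.a:m.a + m.size] for m in matcher.get_matching_blocks())

def refine_signature(flows, stable_rounds=2):
    """Fold flows into a candidate signature until it stops changing."""
    candidate = sign(flows[0], flows[1])
    unchanged = 0
    for flow in flows[2:]:
        new_candidate = sign(flow, candidate)
        # Count consecutive iterations in which the candidate did not change.
        unchanged = unchanged + 1 if new_candidate == candidate else 0
        candidate = new_candidate
        if unchanged >= stable_rounds:
            break
    return candidate

# Hypothetical flows from one made-up application.
flows = [
    "GET /a HTTP/1.1 User-Agent:DemoApp",
    "GET /b HTTP/1.1 User-Agent:DemoApp",
    "GET /c HTTP/1.1 User-Agent:DemoApp",
    "GET /d HTTP/1.1 User-Agent:DemoApp",
]
signature = refine_signature(flows)
```

Each new flow can only shrink or preserve the candidate, so a candidate that survives several flows unchanged is a reasonable point to stop.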
39. FLOW SIMILARITY OF UNKNOWN PACKET TRACES
40. AGGREGATION MODULE
In the Communication Aggregation Module, we aggregate the results of the communication creation module and the auto-sign module.
Figure: Aggregation Module
41. CLASSIFICATION MODULE
In the Classification Module, we train the system using the generated dataset, so that for new incoming traces we can predict whether the traffic flow is malicious P2P or non-malicious P2P.
The C4.5 decision tree algorithm is employed in the classification module.
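C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by split information. The sketch below computes that criterion for a categorical feature only; C4.5's handling of continuous attributes and pruning is left out. The toy feature values and labels are invented for illustration and are not taken from the project's dataset.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on a categorical feature, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Toy data: volume separates the classes perfectly, the other feature does not.
volume = ["low", "low", "high", "high"]
other = ["a", "b", "a", "b"]
label = ["malicious", "malicious", "non-malicious", "non-malicious"]
best = max([("volume", volume), ("other", other)], key=lambda f: gain_ratio(f[1], label))[0]
```

On this toy data the split on `volume` has the higher gain ratio, so it would be placed at the root of the tree.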
42. SUMMARY (MAJOR PROJECT)
Figure: P2P Network Traffic Classifier
A hybrid technique for P2P traffic classification.
Combination of signature-based and statistical methods, exploiting the communication behaviour of the P2P nodes.
P2P traffic is classified into malicious and non-malicious P2P.
43. IMPLEMENTATION DETAILS
Figure: Implementation of P2P Network Traffic Classifier
44. IMPLEMENTATION DETAILS
Figure: P2P Network Traffic Classifier
45. RESULTS
The signatures of various protocols are extracted using the LASER algorithm. They are listed in the following table.
Application | Signature
Azureus | "POST/rpc/config", "HTTP/<version>", "User-Agent:Azureus<version>", "Host :"
GigaTribe | "GET", "&p=", "&cmd=OpenSession", "HTTP/1.1", "User-Agent:GigaTribe", "HTTP/1.1", "200 OK"
Zultrax | "ZEPP 19 29 port"-offset(0) 0x0d0a0d0a, "ZEPP OK number12,28,29my IP address:port"-offset(0) 0x0d0a0d0a
Storm | .mpg;size
Bitlord | "GET", "HTTP", "User-Agent:BitTorrent", "www.bitlord.com"
DC++ | "GET", "HTTP", "User-Agent:DC++"
AntsP2P | "NOTIFY * HTTP" "USN: uuid:ANtsP2P"
KCeasy | "GET / HTTP/"offset(0) "cookie:Kceasy"
Table: Malicious vs Non-Malicious Signatures
46. RESULTS
The signatures of various protocols are extracted using the LASER algorithm. They are listed in the following table.
Application | Signature
Limewire | "GET" "User-Agent: LimeWire/" "Java/"
iMesh | "POST"offset(0) "function=login" "Host: login.imesh.com"
Mute | "client=MUTE&version="offset(12)
Soulseek | "GET "offset(0) "User-Agent: SoulSeek"
Skype | "GET "offset(0) "HTTP" "User-Agent: skype"
eDonkey2000 | "GET / HTTP/"offset(0) "cookie:Kceasy"
eMule | 0xe3 (offset 0)
Table: Malicious vs Non-Malicious Signatures
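To show how a signature of this shape might be applied to a payload, here is a minimal matching sketch. The representation of a signature as a list of `(token, offset)` pairs, the requirement that tokens appear in order, and the function name `matches_signature` are all assumptions made for this example; the deck does not describe its matching procedure. The sample payloads are invented.

```python
def matches_signature(payload, signature):
    """Check that every token occurs in order; a token with an offset must start there."""
    position = 0
    for token, offset in signature:
        if offset is not None:
            # Fixed-offset token: it must begin exactly at the given byte offset.
            if not payload.startswith(token, offset):
                return False
            position = offset + len(token)
        else:
            # Free token: it may appear anywhere after the previous match.
            found = payload.find(token, position)
            if found == -1:
                return False
            position = found + len(token)
    return True

# Signatures transcribed from the table above into the assumed representation.
soulseek = [(b"GET ", 0), (b"User-Agent: SoulSeek", None)]
emule = [(b"\xe3", 0)]

payload = b"GET /file HTTP/1.1\r\nUser-Agent: SoulSeek\r\n\r\n"
is_soulseek = matches_signature(payload, soulseek)
```

Working on `bytes` rather than `str` keeps binary signatures such as the eMule marker byte in the same code path as the text-based ones.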
47. RESULTS
The evaluation parameters are estimated for three datasets. The results are given in the following table.
Dataset Error Rate CCR FP FN
1. 9.5 85.31 0.095 0.169
2. 4.25 91.42 0.172 0.058
3. 12.9 84.96 0.184 0.140
Table: P2P traffic classification rates
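The deck does not define how the columns are computed. Assuming the common confusion-matrix definitions (CCR as the correct classification rate, FP and FN as rates per actual class), the quantities could be derived as in the sketch below. The counts are invented, and the figures in the table above may have been computed differently.

```python
def classification_rates(tp, fp, tn, fn):
    """Derive summary rates from confusion-matrix counts (assumed definitions)."""
    total = tp + fp + tn + fn
    return {
        "ccr": 100.0 * (tp + tn) / total,         # correct classification rate, percent
        "error_rate": 100.0 * (fp + fn) / total,  # misclassified share, percent
        "fp_rate": fp / (fp + tn),                # false positives among actual negatives
        "fn_rate": fn / (fn + tp),                # false negatives among actual positives
    }

# Hypothetical counts, not taken from the project's datasets.
rates = classification_rates(tp=80, fp=10, tn=90, fn=20)
```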
48. RESULTS
The error rate decreases as the number of records used for training increases. The figure illustrates this result.
Figure: Accuracy performance of the classifier for different datasets
49. PERFORMANCE EVALUATION
The model is validated using three classification algorithms: Bayesian network, decision tree, and AdaBoost with REP trees. The results are given in the following table (DT = Decision Tree, BN = Bayes Net, AB = AdaBoost).
Application | DT TPR | DT FPR | DT CR | BN TPR | BN FPR | BN CR | AB TPR | AB FPR | AB CR
Storm | 0.92 | 0.12 | 0.93 | 0.92 | 0.21 | 0.91 | 0.89 | 0.19 | 0.90
Waledac | 0.93 | 0.17 | 0.95 | 0.96 | 0.22 | 0.93 | 0.90 | 0.15 | 0.91
BitTorrent | 0.94 | 0.11 | 0.96 | 0.92 | 0.18 | 0.95 | 0.92 | 0.22 | 0.92
eDonkey2000 | 0.94 | 0.13 | 0.95 | 0.95 | 0.18 | 0.96 | 0.94 | 0.18 | 0.94
50. PUBLICATION
1 Lekshmi M Nair, and G P Sajeev. "Internet Traffic Classification by
Aggregating Correlated Decision Tree Classifier." Computational
Intelligence, Modelling and Simulation (CIMSim), 2015 Seventh
International Conference on IEEE, Kuantan, Malaysia, 27 - 29 July
2015.
51. REFERENCES
Ye, Wujian, and Kyungsan Cho. "Hybrid P2P traffic classification with heuristic
rules and machine learning." Soft Computing (2014): 1-13.
Valenti, Silvio, and Dario Rossi. "Identifying key features for P2P traffic
classification." Communications (ICC), 2011 IEEE International Conference on.
IEEE, 2011.
Adibi, Sasan. "Traffic Classification-Packet-, Flow-, and Application-based
Approaches." International Journal of Advanced Computer Science and
Applications-IJACSA 1 (2010): 6-15.
Nguyen, Thuy TT, and Grenville Armitage. "A survey of techniques for internet
traffic classification using machine learning." Communications Surveys &
Tutorials, IEEE 10.4 (2008): 56-76.
52. References
Narang, Pratik, et al. "Peershark: detecting peer-to-peer botnets by tracking conversations." Security and Privacy Workshops (SPW), 2014 IEEE. IEEE, 2014.
F. Gringoli, L. Salgarelli, M. Dusi, N. Cascarano, F. Risso and K.C. Claffy, "GT:
picking up the truth from the ground for Internet traffic", ACM SIGCOMM
Computer Communication Review, Vol. 39, No. 5, pp. 13-18, Oct. 2009.