PEER TO PEER NETWORK TRAFFIC
CLASSIFICATION
LEKSHMI M NAIR
( AM.EN.P2CSE13011)
S4 M.TECH CSE
MAJOR PROJECT
GUIDED BY : Dr. G P SAJEEV
July 2, 2015
LEKSHMI M NAIR ( AM.EN.P2CSE13011) S4 M.TECH CSE MAJOR PROJECT (GUIDED BY : Dr. G P SAJEEV)P2P TRAFFIC CLASSIFICATION July 2, 2015 1 / 53
OUTLINE
1 Introduction to P2P networking
2 P2P network traffic
3 Need for P2P traffic classification
4 Existing classification schemes
5 System design
6 Implementation details
7 Results
8 References
INTRODUCTION TO 'PEER TO PEER' (P2P)
NETWORKING
P2P NETWORK SYSTEM
Peer-to-peer (P2P) is a decentralized communications model in which
each party has the same capabilities and either party can initiate a
communication session, unlike the client/server model.
Figure: P2P Network
P2P NETWORK TRAFFIC
P2P traffic is the traffic generated by P2P applications such as
BitTorrent, Skype, Napster, and Gnutella.
P2P applications are generally used to transfer large amounts of data,
so they can slow down an Internet connection.
Figure: P2P Applications
NEED FOR P2P TRAFFIC CLASSIFICATION
Network design and provisioning / traffic engineering.
Optimize and control network utilization to address QoS assignment and traffic shaping.
Accounting / content-based charging.
Security monitoring.
Network forensics.
Figure: Traffic Classification
EXISTING CLASSIFICATION SCHEMES
Some of the existing P2P traffic classification techniques are:
Port-based classification
Signature-based classification
Flow-based classification
Statistics-based classification
Hybrid method
A BRIEF COMPARISON OF EXISTING TECHNIQUES
Name | Method | Merits | De-merits | Remarks
Port-based | Classification based on port number. | Simple and fast. | Inefficient due to random port allocation. | Accuracy is much lower.
Signature-based | Based on recognition of specific packet payloads. | Reduces false positives and false negatives. | High computational complexity, since each packet needs to be analyzed. | Inefficient on encrypted payloads.
Flow-based | Based on behavioral patterns. | Speed. | Cannot always classify traffic to its specific application. | Speeds up traffic classification, but cannot classify all traffic.
Statistics-based | By means of statistical features such as packet size, packet inter-arrival time, and flow duration. | More uniqueness. | As the number of features increases, mapping becomes difficult. | Inefficient as the number of features increases.
Hybrid method | By combining any of the above methods. | More accurate. | Only a 2-class classifier has been implemented to date. | Scope for UDP needs to be determined.
Table: Survey on P2P classification techniques.
PROJECT THEME
The performance of existing P2P traffic classification schemes is
poor. Also, there is no classification scheme that separates P2P traffic
into malicious P2P and non-malicious P2P.
PROBLEM DEFINITION
The problem of classifying P2P traffic into malicious and non-malicious
has not been addressed so far.
DEFINITION OF MALICIOUS ACTIVITIES
1 Poisoning
2 Polluting
3 Insertion of viruses
4 Malware
5 Denial of Service
6 Spam
7 Password Stealing
8 Advertising
IDENTIFYING P2P TRAFFIC
P2P traffic is bidirectional in nature.
E.g. BitTorrent: seeders and leechers.
The notion of a communication is better suited to P2P than a one-way flow.
Who is talking to whom?
Both header and payload information are considered for traffic
classification.
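The "who is talking to whom" idea can be sketched as grouping packets by an unordered pair of endpoints, so that both directions land in the same communication. This is only an illustration under stated assumptions: the `Packet` record, its field names, and the choice of an IP-pair key are hypothetical and are not taken from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; field names are assumptions."""
    timestamp: float  # capture time in seconds
    src: str          # source IP address
    dst: str          # destination IP address
    size: int         # packet length in bytes


def communication_key(pkt):
    # Sorting the endpoints makes the key direction-independent,
    # so A->B and B->A belong to the same communication.
    return tuple(sorted((pkt.src, pkt.dst)))


def group_communications(packets):
    """Group packets into bidirectional communications."""
    comms = defaultdict(list)
    for pkt in packets:
        comms[communication_key(pkt)].append(pkt)
    return dict(comms)


packets = [
    Packet(0.0, "10.0.0.1", "10.0.0.2", 1200),
    Packet(0.4, "10.0.0.2", "10.0.0.1", 60),
    Packet(1.0, "10.0.0.1", "10.0.0.3", 300),
]
comms = group_communications(packets)
for key, pkts in comms.items():
    print(key, len(pkts))
```

A key that also includes ports or the transport protocol would work the same way; the pair-of-hosts key is simply the smallest choice that captures bidirectionality.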
SYSTEM DESIGN
Figure: Network Traffic Classifier
MODULES
1. Filtering.
2. Communication Creation Module.
3. Automatic Signature Generation Module.
4. Aggregation Module.
5. Classification Module.
PACKET FILTERING MODULE
PACKET FILTERING ALGORITHM
COMMUNICATION CREATION ALGORITHM
COMMUNICATION CREATION MODULE
Figure: Communication Creation Module
Classification Criterion
Features | Malicious | Non-Malicious
Volume | Low | High
Inter-arrival time | Large | Small
Traffic | Automated/scripted commands | User-bursty traffic
Table: Malicious vs Non-Malicious Features
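As a rough sketch of how the volume and inter-arrival time features in the table could be computed for one communication, assuming packet timestamps in seconds and sizes in bytes are available. The function name and dictionary keys are hypothetical, and no thresholds are implied, since the slides do not give any.

```python
from statistics import mean


def communication_features(timestamps, sizes):
    """Return total volume and mean inter-arrival time for one communication."""
    ordered = sorted(timestamps)
    # Gaps between consecutive packets; empty if there is only one packet.
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return {
        "volume_bytes": sum(sizes),
        "mean_inter_arrival_s": mean(gaps) if gaps else 0.0,
    }


features = communication_features([0.0, 0.5, 1.5], [100, 200, 300])
print(features)
```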
AUTO-SIGN MODULE
Figure: Automatic Signature Generation Module
LCS (Example)
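The worked example on these slides was a sequence of images that did not survive extraction. As a stand-in, here is a minimal sketch of the textbook longest common subsequence (LCS) computation over two payloads. It is the standard dynamic-programming version, not necessarily the exact variant used in the project, and the sample payloads are made up.

```python
def lcs(a: bytes, b: bytes) -> bytes:
    """Return one longest common subsequence of two byte strings."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover the bytes.
    out = bytearray()
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return bytes(reversed(out))


common = lcs(b"GET /a HTTP/1.1", b"GET /b HTTP/1.1")
print(common)
```

The table costs O(m·n) time and memory, which is why signature extraction of this kind is usually run on a bounded prefix of each flow rather than on whole payloads.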
LASER ALGORITHM
The signature refinement process can be expressed as follows:
Candidate_Sign_1 = Sign(Flow_1, Flow_2)
Candidate_Sign_2 = Sign(Flow_3, Candidate_Sign_1)
...
Candidate_Sign_n = Sign(Flow_{n+1}, Candidate_Sign_{n-1})
If Candidate_Sign_n = Candidate_Sign_{n-1} for a certain number of
iterations, then Candidate_Sign_n is the final signature.
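The refinement loop above can be sketched as follows. This is an illustration only: `sign` here is a stand-in built on Python's `difflib.SequenceMatcher`, which returns a common subsequence made of matching blocks rather than a guaranteed longest one, and the `stable_rounds` parameter is a hypothetical name for the "certain number of iterations".

```python
from difflib import SequenceMatcher


def sign(x: bytes, y: bytes) -> bytes:
    """Stand-in for Sign(): the bytes shared by x and y, in order."""
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    return b"".join(x[m.a:m.a + m.size] for m in matcher.get_matching_blocks())


def refine_signature(flows, stable_rounds=2):
    """Fold flows into a candidate signature until it stops changing."""
    if len(flows) < 2:
        raise ValueError("need at least two flows")
    candidate = sign(flows[0], flows[1])
    unchanged = 0
    for flow in flows[2:]:
        new_candidate = sign(flow, candidate)
        # Count consecutive iterations in which the candidate is stable.
        unchanged = unchanged + 1 if new_candidate == candidate else 0
        candidate = new_candidate
        if unchanged >= stable_rounds:
            break
    return candidate


flows = [
    b"GET /a HTTP/1.1",
    b"GET /b HTTP/1.1",
    b"GET /c HTTP/1.1",
    b"GET /d HTTP/1.1",
]
print(refine_signature(flows))
```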
FLOW SIMILARITY OF UNKNOWN PACKET
TRACES
AGGREGATION MODULE
In the Communication Aggregation Module, we aggregate the results of
the communication creation module and the auto-sign module.
Figure: Aggregation Module
CLASSIFICATION MODULE
In the Classification Module, we train the system using the generated
dataset, so that for new incoming traces we can predict whether the
traffic flow is malicious P2P or non-malicious P2P.
The C4.5 decision tree algorithm is employed in the classification module.
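C4.5 chooses splits by gain ratio (information gain divided by split information). The sketch below evaluates that criterion for one numeric feature and one candidate threshold. It is not a full C4.5 implementation, and the feature values and labels are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split value <= threshold vs value > threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    gain = entropy(labels) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


# Hypothetical mean inter-arrival times (seconds) and class labels.
inter_arrival = [0.1, 0.2, 5.0, 6.0]
labels = ["non-malicious", "non-malicious", "malicious", "malicious"]
print(gain_ratio(inter_arrival, labels, threshold=1.0))
```

A tree builder would evaluate this for every feature and candidate threshold, split on the best one, and recurse on each side.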
SUMMARY (MAJOR PROJECT)
Figure: P2P Network Traffic Classifier
A hybrid technique for P2P traffic classification.
A combination of signature-based and statistical methods that exploits
the communication behaviour of the P2P nodes.
P2P traffic is classified into malicious and non-malicious P2P.
IMPLEMENTATION DETAILS
Figure: Implementation of P2P Network Traffic Classifier
Figure: P2P Network Traffic Classifier
RESULTS
The signatures of various protocols are extracted using the LASER
algorithm. They are listed in the following table.
Application | Signature
Azureus | "POST/rpc/config", "HTTP/<version>", "User-Agent:Azureus<version>", "Host :"
GigaTribe | "GET", "&p=", "&cmd=OpenSession", "HTTP/1.1", "User-Agent:GigaTribe", "HTTP/1.1", "200 OK"
Zultrax | "ZEPP 19 29 port"-offset(0) 0x0d0a0d0a, "ZEPP OK number12,28,29my IP address:port"-offset(0) 0x0d0a0d0a
Storm | .mpg;size
Bitlord | "GET", "HTTP", "User-Agent:BitTorrent", "www.bitlord.com"
DC++ | "GET", "HTTP", "User-Agent:DC++"
AntsP2P | "NOTIFY * HTTP" "USN: uuid:ANtsP2P"
KCeasy | "GET / HTTP/"offset(0) "cookie:Kceasy"
Limewire | "GET" "User-Agent: LimeWire/" "Java/"
iMesh | "POST"offset(0) "function=login" "Host: login.imesh.com"
Mute | "client=MUTE&version="offset(12)
Soulseek | "GET "offset(0) "User-Agent: SoulSeek"
Skype | "GET "offset(0) "HTTP" "User-Agent: skype"
eDonkey2000 | "GET / HTTP/"offset(0) "cookie:Kceasy"
eMule | 0xe3 (offset 0)
Table: Malicious vs Non-Malicious Signatures
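One way to read a multi-token signature from the table is as an ordered list of substrings, optionally with the first token pinned to offset 0. The sketch below matches a payload against such a list. The ordered-match interpretation, the function name, and the sample payload are assumptions made for illustration; the slides do not specify the matching procedure.

```python
def matches_signature(payload: bytes, tokens, anchored=False):
    """True if all tokens occur in the payload in order.

    With anchored=True the first token must start at offset 0.
    """
    pos = 0
    for index, token in enumerate(tokens):
        found = payload.find(token, pos)
        if found == -1:
            return False
        if anchored and index == 0 and found != 0:
            return False
        # Continue searching after the end of the token just matched.
        pos = found + len(token)
    return True


# Tokens from the DC++ row; the payload itself is a made-up example.
dcpp_tokens = [b"GET", b"HTTP", b"User-Agent:DC++"]
payload = b"GET /file HTTP/1.1\r\nUser-Agent:DC++\r\n\r\n"
print(matches_signature(payload, dcpp_tokens))
```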
The evaluation parameters are estimated for three datasets. The results
are given in the following table.
Dataset | Error Rate | CCR | FP | FN
1 | 9.5 | 85.31 | 0.095 | 0.169
2 | 4.25 | 91.42 | 0.172 | 0.058
3 | 12.9 | 84.96 | 0.184 | 0.140
Table: P2P traffic classification rates
The error rate decreases as the number of records used for training
increases. The figure illustrates this result.
Figure: Accuracy performance of the classifier for different datasets
PERFORMANCE EVALUATION
The model is validated using three classification algorithms, namely
Bayesian Network, Decision Tree, and AdaBoost with REP trees. The
results are given in the following table.
Application | Decision Tree TPR | Decision Tree FPR | Decision Tree CR | Bayes Net TPR | Bayes Net FPR | Bayes Net CR | AdaBoost TPR | AdaBoost FPR | AdaBoost CR
Storm | 0.92 | 0.12 | 0.93 | 0.92 | 0.21 | 0.91 | 0.89 | 0.19 | 0.90
Waledac | 0.93 | 0.17 | 0.95 | 0.96 | 0.22 | 0.93 | 0.90 | 0.15 | 0.91
BitTorrent | 0.94 | 0.11 | 0.96 | 0.92 | 0.18 | 0.95 | 0.92 | 0.22 | 0.92
eDonkey2000 | 0.94 | 0.13 | 0.95 | 0.95 | 0.18 | 0.96 | 0.94 | 0.18 | 0.94
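For reference, the TPR and FPR columns follow the usual confusion-matrix definitions, sketched below with invented counts. The slides do not define CR, so the `accuracy` value here is only a commonly used overall rate and is not claimed to be the same quantity.

```python
def rates(tp, fp, tn, fn):
    """Standard rates from binary confusion-matrix counts."""
    return {
        "tpr": tp / (tp + fn),                        # true positive rate (recall)
        "fpr": fp / (fp + tn),                        # false positive rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # overall fraction correct
    }


# Hypothetical counts, not taken from the project's datasets.
r = rates(tp=90, fp=12, tn=88, fn=10)
print(r)
```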
PUBLICATION
1 Lekshmi M Nair, and G P Sajeev. "Internet Traffic Classification by
Aggregating Correlated Decision Tree Classifier." Computational
Intelligence, Modelling and Simulation (CIMSim), 2015 Seventh
International Conference on IEEE, Kuantan, Malaysia, 27 - 29 July
2015.
REFERENCES
Ye, Wujian, and Kyungsan Cho. "Hybrid P2P traffic classification with heuristic
rules and machine learning." Soft Computing (2014): 1-13.
Valenti, Silvio, and Dario Rossi. "Identifying key features for P2P traffic
classification." Communications (ICC), 2011 IEEE International Conference on.
IEEE, 2011.
Adibi, Sasan. "Traffic Classification-Packet-, Flow-, and Application-based
Approaches." International Journal of Advanced Computer Science and
Applications-IJACSA 1 (2010): 6-15.
Nguyen, Thuy TT, and Grenville Armitage. "A survey of techniques for internet
traffic classification using machine learning." Communications Surveys &
Tutorials, IEEE 10.4 (2008): 56-76.
Narang, Pratik, et al. "Peershark: detecting peer-to-peer botnets by tracking
conversations." Security and Privacy Workshops (SPW), 2014 IEEE. IEEE,
2014.
F. Gringoli, L. Salgarelli, M. Dusi, N. Cascarano, F. Risso and K.C. Claffy, "GT:
picking up the truth from the ground for Internet traffic", ACM SIGCOMM
Computer Communication Review, Vol. 39, No. 5, pp. 13-18, Oct. 2009.