Detecting VoIP Traffic Based on Human Conversation Patterns
Chen‐Chi Wu (1), Kuan‐Ta Chen (2), Yu‐Chun Chang (1), Chin‐Laung Lei (1)

(1) Department of Electrical Engineering, National Taiwan University
(2) Institute of Information Science, Academia Sinica




IPTComm 2008
Outline
    Motivation
    Methodology
    Performance evaluation
    Summary




Motivation
    VoIP is becoming popular because of
         Low call cost
         High voice quality
    Skype, a popular VoIP application
           over 10,000,000 concurrent users
    Accurate identification of VoIP flows in network traffic 
    is required for
         Traffic analysis
         Traffic management

Motivation
    Challenges of VoIP flow identification
         Various signaling protocols: SIP, H.323, and various proprietary 
         protocols
         Non‐standard port numbers
         Packet payload encryption

    The interaction in human conversation is distinctive
         It gives VoIP traffic a specific characteristic



4‐State Traffic Pattern
    Infer the on/off (talking/silence) pattern by the level of the 
    packet rate during a short period
    We model a two‐way conversation by a process of four states
         State A: a period that speaker A is talking and B is silent
         State B: B is talking and A is silent
         State D: both A and B are talking
         State M: mutual silence

    [Figure: 2×2 grid of the four states, with ON/OFF on each axis]
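The state inference above can be sketched in Python. This is a minimal sketch under stated assumptions: packets are counted per direction in fixed windows, and a direction is ON when its rate exceeds a cut-off. The window length, `RATE_THRESHOLD`, and the function name are hypothetical and not taken from the slides.

```python
from typing import List

RATE_THRESHOLD = 20.0  # packets per window; hypothetical ON/OFF cut-off


def infer_states(rate_a: List[float], rate_b: List[float],
                 threshold: float = RATE_THRESHOLD) -> str:
    """Map per-window packet rates of both directions to a 4-state string."""
    if len(rate_a) != len(rate_b):
        raise ValueError("both directions must cover the same windows")
    states = []
    for a, b in zip(rate_a, rate_b):
        a_on, b_on = a > threshold, b > threshold
        if a_on and b_on:
            states.append("D")   # both parties talking
        elif a_on:
            states.append("A")   # only A talking
        elif b_on:
            states.append("B")   # only B talking
        else:
            states.append("M")   # mutual silence
    return "".join(states)


# Six windows of made-up packet rates for the two directions.
pattern = infer_states([33, 34, 30, 5, 4, 32], [3, 4, 31, 33, 2, 35])
print(pattern)
```

A fixed threshold keeps the sketch short; a real deployment would likely need to calibrate it per codec or per flow.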

Intuition behind Our Approach
     The 4‐state traffic pattern of VoIP traffic is unique 
     compared to that of other network applications

     [Figure: 4‐state traffic patterns (states A or B, D, M) of Web, P2P (BitTorrent), online game (WoW), TELNET, and VoIP (Skype) flows]


Methodology
    Detect VoIP flows based on the unique human speech 
    conversation patterns embedded in voice traffic
    Derive features (attributes) from the conversation 
    patterns
    Adopt a naïve Bayesian classifier, a supervised machine 
    learning tool, to divide traffic into VoIP and non‐VoIP 
    classes
         A class label is required for each training flow



Methodology Overview
    Training phase
         Labeled training flows (VoIP or non‐VoIP)
         Extract 4‐state traffic patterns and derive features
         Flow vectors
         Learn the parameters of the naïve Bayesian classifier
    Identification phase
         Incoming flows (unknown class)
         Extract conversation patterns and derive features
         Flow vectors
         Classify with the naïve Bayesian classifier
         Flow labels (VoIP or non‐VoIP)
Naïve Bayesian Classifier
 The naïve Bayesian classifier is based on Bayes’ theorem
        $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$


       Each flow is represented by a vector  X = (x1, x2,…, xn),  
       depicting n features A1, A2,…, An

       Suppose there are m classes, C1, C2,…, Cm


Naïve Bayesian Classifier
    Given a flow vector X, the classifier predicts that the flow 
    belongs to class Ci iff
        $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m,\ j \ne i$

    By Bayes’ theorem
        $P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$
    P(X) is constant for all classes and P(Ci) is the prior probability of 
    class Ci; thus the task is to maximize P(X | Ci) P(Ci), which reduces to 
    maximizing P(X | Ci) when the priors are assumed equal
Naïve Bayesian Classifier
    The naïve assumption is that the values of the features 
    are conditionally independent of one another given the class
        $P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)$

    P(x1 | Ci), P(x2 | Ci), ..., P(xn | Ci) can be easily estimated 
    from the training data
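A minimal Python sketch of the decision rule follows. The slides do not say how each P(xk | Ci) is estimated, so the per-feature Gaussian model here is an assumption, as are the two toy features and all function names. Log-probabilities are summed instead of multiplying probabilities to avoid numerical underflow.

```python
import math
from collections import defaultdict


def train(flows, labels):
    """Estimate the prior P(Ci) and a per-feature (mean, variance) per class."""
    by_class = defaultdict(list)
    for x, c in zip(flows, labels):
        by_class[c].append(x)
    model = {}
    for c, rows in by_class.items():
        prior = len(rows) / len(flows)
        params = []
        for column in zip(*rows):          # one column per feature
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            params.append((mean, var + 1e-6))  # small floor avoids zero variance
        model[c] = (prior, params)
    return model


def log_posterior(x, prior, params):
    """log P(Ci) + sum_k log P(x_k | Ci); the constant -log P(X) is dropped."""
    total = math.log(prior)
    for value, (mean, var) in zip(x, params):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
    return total


def classify(model, x):
    """Pick the class with the largest (unnormalized) log posterior."""
    return max(model, key=lambda c: log_posterior(x, *model[c]))


# Toy flow vectors: (normalized log-likelihood, alternation rate); made-up values.
train_x = [(-0.40, 0.30), (-0.50, 0.25), (-0.45, 0.35),
           (-2.00, 1.50), (-2.50, 1.20), (-1.80, 1.70)]
train_y = ["VoIP"] * 3 + ["non-VoIP"] * 3
model = train(train_x, train_y)
print(classify(model, (-0.5, 0.3)), classify(model, (-2.2, 1.4)))
```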


How to derive features from the 4‐state traffic pattern?
    Use a Markov chain to model the VoIP traffic pattern
    Statistics of traffic patterns


    [Figure: 4‐state traffic patterns (states A or B, D, M) of Web, P2P, WoW, TELNET, and VoIP flows]
Markov Chain
    Build a Markov chain model based on a set of known VoIP 
    traffic patterns
    Derive a feature: the likelihood value

    Transition probabilities of the Markov chain (row: from state, column: to state)
               A         B         D         M
        A    0.9022    0.0028    0.0380    0.0571
        B    0.0029    0.9030    0.0391    0.0550
        D    0.0607    0.0592    0.8763    0.0038
        M    0.0465    0.0439    0.0019    0.9078

    [Figure: 4‐state Markov chain]
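One way to build such a chain from known VoIP traffic patterns is to count consecutive state pairs and normalize each row. The slides do not state how the chain was fitted, so this maximum-likelihood counting is an assumption; the function name and the two example sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

STATES = "ABDM"


def estimate_transitions(sequences: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Row-normalized transition counts from observed state strings."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))   # consecutive (from, to) pairs
    matrix = {}
    for src in STATES:
        row_total = sum(counts[(src, dst)] for dst in STATES)
        matrix[src] = {
            dst: (counts[(src, dst)] / row_total if row_total else 0.0)
            for dst in STATES
        }
    return matrix


# Two short made-up VoIP state sequences.
P = estimate_transitions(["AAAMBBBD", "MAADBBMM"])
print(P["A"])
```

No smoothing is applied here, so transitions never seen in training get probability 0; a practical implementation would probably add a small pseudo-count so that the logarithm of every transition stays finite.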
Likelihood of Traffic Patterns
    Given a traffic pattern with a state sequence S1, S2,…, Sn, 
    where Si ∈ { A, B, D, M }
    Compute the log‐likelihood value as
        $\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})$
            $P_{i,j}$: the transition probability from Si to Sj
    Traffic flows may vary in length, thus define the 
    normalized log‐likelihood value as
        $\frac{\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})}{N}$
            N: the length of the sequence (N = n for the sequence above)
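The normalized log‐likelihood can be computed as in the sketch below, which uses the transition probabilities listed on the Markov Chain slide. The natural logarithm is an assumption, since the slides do not name the base, and the two example sequences are made up for illustration.

```python
import math

# Transition probabilities from the "Markov Chain" slide (row = from, column = to).
P = {
    "A": {"A": 0.9022, "B": 0.0028, "D": 0.0380, "M": 0.0571},
    "B": {"A": 0.0029, "B": 0.9030, "D": 0.0391, "M": 0.0550},
    "D": {"A": 0.0607, "B": 0.0592, "D": 0.8763, "M": 0.0038},
    "M": {"A": 0.0465, "B": 0.0439, "D": 0.0019, "M": 0.9078},
}


def normalized_log_likelihood(states: str) -> float:
    """Sum of log transition probabilities divided by the sequence length N."""
    if len(states) < 2:
        raise ValueError("need at least two states to form a transition")
    # log of a product equals the sum of logs, which avoids underflow
    log_sum = sum(math.log(P[s][t]) for s, t in zip(states, states[1:]))
    return log_sum / len(states)


talk_like = "AAAAAMMMBBBBBBDDAAAA"   # long runs, as in a conversation
erratic = "ABDMABDMABDMABDMABDM"     # a state change in every window
print(normalized_log_likelihood(talk_like), normalized_log_likelihood(erratic))
```

The conversation-like sequence scores higher because it mostly stays in the same state, and self-transitions carry the largest probabilities in the matrix.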
Likelihood of Traffic Patterns
    The Markov chain represents typical human conversation

    VoIP flows => large log‐likelihood value

    Non‐VoIP flows => small log‐likelihood value
         They exhibit non‐human‐like behavior: non‐interactive, 
         independent, unidirectional




Statistics of Traffic Patterns
    Mean and standard deviation of each period during which 
    party A (or B) is ON (talking)
         Bidirectional behavior

    Mean and standard deviation of the sojourn time in 
    states A, B, D, M, respectively
         Interactive behavior

    State alternation frequency
         Degree of fragmentation and disorder in the traffic pattern

Statistics of Traffic Patterns
    State alternation frequency
         Alternation frequency between different states




         E.g., (6 alternations between different states) / (20 sec.) = 0.3 alternations per second
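The statistics above can be derived from a state string as sketched below. One-second windows, the function names, and the sample pattern are assumptions for illustration; the pattern is chosen so that it contains 6 alternations in 20 seconds, matching the example.

```python
from itertools import groupby
from statistics import mean, pstdev
from typing import Dict, List


def sojourn_times(states: str, window_sec: float = 1.0) -> Dict[str, List[float]]:
    """Duration of each uninterrupted stay in states A, B, D, M."""
    runs: Dict[str, List[float]] = {s: [] for s in "ABDM"}
    for state, group in groupby(states):
        runs[state].append(len(list(group)) * window_sec)
    return runs


def speech_periods(states: str, party: str, window_sec: float = 1.0) -> List[float]:
    """Durations during which one party is ON (its own state or state D)."""
    on = [s in (party, "D") for s in states]
    return [len(list(g)) * window_sec for is_on, g in groupby(on) if is_on]


def alternation_rate(states: str, window_sec: float = 1.0) -> float:
    """Number of changes between different states per second."""
    changes = sum(1 for s, t in zip(states, states[1:]) if s != t)
    return changes / (len(states) * window_sec)


pattern = "AAAAMMBBBBBDDAAAMMMB"   # 20 one-second windows
runs = sojourn_times(pattern)
stats = {s: (mean(r), pstdev(r)) for s, r in runs.items() if r}
print(stats["A"], speech_periods(pattern, "A"), alternation_rate(pattern))
```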
Feature Summary
                                       Feature set
               Normalized log‐likelihood value based on the Markov chain
               Speech period of party A or B (mean, standard deviation)
               Sojourn time in each state* (mean, standard deviation)
               Ratio of sojourn time in each state*
               Alternation rate between states*
               *states A, B, D, M




Trace Collection
       We collected network traffic from 5 categories of 
       applications
         VoIP (Skype), TELNET, Web, P2P (BitTorrent), online game 
         (World of Warcraft)

Category       # Connections   Duration (min)    # Packets    Size (MB)
VoIP                     462            2,388    4,728,240        4,318
TELNET                 2,008            4,729   10,559,261        7,331
Web                    1,406            1,537    2,528,359          680
P2P                   15,845            3,334   29,220,870       30,500
Online game            2,224              120   28,264,360       59,097

Performance Evaluation
    Detect VoIP flows as early as possible
         Detection time is a major concern
         95% accuracy with 4‐second detection time
         97% accuracy with 11‐second detection time




Performance Evaluation
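    Goal: detect VoIP flows
         VoIP flows are positives, non‐VoIP flows are negatives
    True positive rate
         $\mathrm{TPR} = \frac{\text{number of VoIP flows correctly identified}}{\text{total number of VoIP flows}}$
    True negative rate
         $\mathrm{TNR} = \frac{\text{number of non-VoIP flows correctly identified}}{\text{total number of non-VoIP flows}}$
    False positive rate
         $\mathrm{FPR} = 1 - \mathrm{TNR} = \frac{\text{number of non-VoIP flows misidentified as VoIP}}{\text{total number of non-VoIP flows}}$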
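These rates can be computed from ground-truth and predicted labels as sketched below, using the standard definitions: TPR over VoIP flows, TNR over non‐VoIP flows, and FPR = 1 − TNR. The function name and the sample labels are hypothetical.

```python
from typing import Sequence, Tuple


def rates(truth: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (TPR, TNR, FPR); True marks a VoIP flow (a positive)."""
    positives = sum(truth)
    negatives = len(truth) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one VoIP and one non-VoIP flow")
    tp = sum(t and p for t, p in zip(truth, predicted))              # VoIP, flagged VoIP
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))  # non-VoIP, not flagged
    tpr = tp / positives
    tnr = tn / negatives
    return tpr, tnr, 1.0 - tnr


# Four VoIP flows followed by five non-VoIP flows; made-up predictions.
truth = [True, True, True, True, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, True]
tpr, tnr, fpr = rates(truth, predicted)
print(tpr, tnr, fpr)
```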
Performance Evaluation
    97% TPR with a detection time longer than 3 sec.
    Flows of World of Warcraft tend to be mis‐identified
         Achieve 90% TNR with a detection time longer than 10 sec.




ROC Curves
    ROC (Receiver Operating Characteristic)
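An ROC curve plots the true positive rate against the false positive rate as the decision threshold varies. The sketch below assumes each flow has a numeric score, for example the classifier's posterior for the VoIP class; the scores, labels, and function name are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], is_voip: Sequence[bool]
               ) -> List[Tuple[float, float]]:
    """(FPR, TPR) pairs obtained by thresholding at each observed score."""
    positives = sum(is_voip)
    negatives = len(is_voip) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one VoIP and one non-VoIP flow")
    points = [(0.0, 0.0)]   # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and y for s, y in zip(scores, is_voip))
        fp = sum(s >= threshold and not y for s, y in zip(scores, is_voip))
        points.append((fp / negatives, tp / positives))
    return points


points = roc_points([0.9, 0.8, 0.6, 0.4, 0.2], [True, True, False, True, False])
print(points)
```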




Summary
    We propose a VoIP flow identification scheme based on 
    human conversation patterns

    Our scheme yields an identification accuracy of 95% with a 
    4‐second detection time, and 97% with an 11‐second detection time

    High accuracy in short detection time




Thanks for your attention




Identifying MMORPG Bots: A Traffic Analysis Approach
 
Toward an Understanding of the Processing Delay of Peer-to-Peer Relay Nodes
Toward an Understanding of the Processing Delay of Peer-to-Peer Relay NodesToward an Understanding of the Processing Delay of Peer-to-Peer Relay Nodes
Toward an Understanding of the Processing Delay of Peer-to-Peer Relay Nodes
 
Game Bot Detection Based on Avatar Trajectory
Game Bot Detection Based on Avatar TrajectoryGame Bot Detection Based on Avatar Trajectory
Game Bot Detection Based on Avatar Trajectory
 
Improving Reliability of Web 2.0-based Rating Systems Using Per-user Trustiness
Improving Reliability of Web 2.0-based Rating Systems Using Per-user TrustinessImproving Reliability of Web 2.0-based Rating Systems Using Per-user Trustiness
Improving Reliability of Web 2.0-based Rating Systems Using Per-user Trustiness
 
A Collusion-Resistant Automation Scheme for Social Moderation Systems
A Collusion-Resistant Automation Scheme for Social Moderation SystemsA Collusion-Resistant Automation Scheme for Social Moderation Systems
A Collusion-Resistant Automation Scheme for Social Moderation Systems
 
Tuning Skype’s Redundancy Control Algorithm for User Satisfaction
Tuning Skype’s Redundancy Control Algorithm for User SatisfactionTuning Skype’s Redundancy Control Algorithm for User Satisfaction
Tuning Skype’s Redundancy Control Algorithm for User Satisfaction
 
Network Game Design: Hints and Implications of Player Interaction
Network Game Design: Hints and Implications of Player InteractionNetwork Game Design: Hints and Implications of Player Interaction
Network Game Design: Hints and Implications of Player Interaction
 

Kürzlich hochgeladen

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Detecting VoIP Traffic Based on Human Conversation Patterns

  • 1. Chen-Chi Wu (1), Kuan-Ta Chen (2), Yu-Chun Chang (1), Chin-Laung Lei (1). (1) Department of Electrical Engineering, National Taiwan University; (2) Institute of Information Science, Academia Sinica. IPTComm 2008.
  • 2. Outline: Motivation; Methodology; Performance evaluation; Summary.
  • 3. Motivation: VoIP is becoming popular because of its low call cost and high voice quality. Skype, a popular VoIP application, has over 10,000,000 concurrent users. Accurately identifying VoIP flows in network traffic is required for traffic analysis and traffic management.
  • 4. Motivation: Identifying VoIP flows is challenging because of the variety of signaling protocols (SIP, H.323, and various proprietary protocols), non-standard port numbers, and packet payload encryption. The interaction in human conversation is unique, however, and it results in a specific characteristic of VoIP traffic.
  • 5. 4-State Traffic Pattern: Infer the on/off (talking/silence) pattern of each party from the level of the packet rate during a short period. We model a two-way conversation as a process with four states. State A: speaker A is talking and B is silent. State B: B is talking and A is silent. State D: both A and B are talking. State M: mutual silence. [Figure: ON/OFF periods of the two parties.]
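The state inference on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `infer_states`, the packet-rate threshold, and the sample rates are all assumptions chosen for the example.

```python
def infer_states(rate_a, rate_b, threshold):
    """Map per-interval packet rates of the two directions to states A, B, D, M.

    rate_a / rate_b: packets per interval sent by party A / party B.
    threshold: hypothetical packet-rate level above which a party counts as ON.
    """
    states = []
    for ra, rb in zip(rate_a, rate_b):
        a_on = ra > threshold
        b_on = rb > threshold
        if a_on and b_on:
            states.append("D")  # both parties talking
        elif a_on:
            states.append("A")  # only A talking
        elif b_on:
            states.append("B")  # only B talking
        else:
            states.append("M")  # mutual silence
    return "".join(states)


# Hypothetical packet rates for seven consecutive intervals.
rate_a = [30, 32, 31, 2, 1, 0, 29]
rate_b = [1, 0, 28, 30, 2, 1, 0]
print(infer_states(rate_a, rate_b, threshold=15))  # prints AADBMMA
```

How the threshold and the interval length should be chosen is not stated on the slide, so both are left as parameters here.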
  • 6. Intuition behind Our Approach: The 4-state traffic pattern of VoIP traffic is unique compared with that of other network applications. [Figure: 4-state traffic patterns (legend: A or B, D, M) for Web, P2P (BitTorrent), online game (WoW), TELNET, and VoIP (Skype).]
  • 7. Methodology: Detect VoIP flows based on the unique human speech conversation patterns embedded in voice traffic. Derive features (attributes) from the conversation patterns. Adopt a naïve Bayesian classifier, a supervised machine learning tool, to divide traffic into the VoIP and non-VoIP classes. Because the classifier is supervised, the class label of each training sample is required.
  • 8. Methodology Overview. Training phase: labeled training flows (VoIP or non-VoIP) → extract 4-state traffic patterns and derive features → flow vectors → learn classifier parameters → naïve Bayesian classifier. Identification phase: incoming flows (unknown class) → extract conversation patterns and derive features → flow vectors → classify with the naïve Bayesian classifier → flow labels (VoIP or non-VoIP).
  • 9. Naïve Bayesian Classifier: The naïve Bayesian classifier is based on Bayes' theorem, $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$. Each flow is represented by a vector $X = (x_1, x_2, \ldots, x_n)$ holding the values of the $n$ features $A_1, A_2, \ldots, A_n$. Suppose there are $m$ classes, $C_1, C_2, \ldots, C_m$.
  • 10. Naïve Bayesian Classifier: Given a flow vector $X$, the classifier predicts that the flow belongs to class $C_i$ iff $P(C_i \mid X) > P(C_j \mid X)$ for $1 \le j \le m$, $j \ne i$. By Bayes' theorem, $P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$. $P(X)$ is the same for every class and $P(C_i)$ is the prior probability, thus the task is to maximize $P(X \mid C_i)\,P(C_i)$, which reduces to maximizing $P(X \mid C_i)$ when the priors are equal.
  • 11. Naïve Bayesian Classifier: The naïve assumption is that the values of the features are conditionally independent of one another given the class: $P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_n \mid C_i)$. The probabilities $P(x_1 \mid C_i), P(x_2 \mid C_i), \ldots, P(x_n \mid C_i)$ can be easily estimated from the training data.
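The classifier described on slides 9 to 11 can be sketched in a few lines. The slides do not say how each $P(x_k \mid C_i)$ is modeled, so this sketch assumes a Gaussian per feature and class; the function names (`fit`, `predict`, `log_gaussian`), the two features, and the training values are hypothetical.

```python
import math
from collections import defaultdict


def fit(vectors, labels):
    """Estimate, per class, the prior and each feature's mean and variance."""
    by_class = defaultdict(list)
    for x, y in zip(vectors, labels):
        by_class[y].append(x)
    model = {}
    for c, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[c] = (n / len(vectors), means, variances)
    return model


def log_gaussian(x, mean, var):
    """Log density of a normal distribution (assumed feature model)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict(model, x):
    """Return the class maximizing log P(C_i) + sum_k log P(x_k | C_i)."""
    def score(c):
        prior, means, variances = model[c]
        return math.log(prior) + sum(
            log_gaussian(xk, m, v) for xk, m, v in zip(x, means, variances)
        )
    return max(model, key=score)


# Hypothetical flow vectors: (normalized log-likelihood, alternation rate).
train_x = [(-0.15, 0.30), (-0.12, 0.25), (-0.18, 0.35),
           (-2.5, 1.5), (-3.0, 2.0), (-2.0, 1.0)]
train_y = ["VoIP", "VoIP", "VoIP", "non-VoIP", "non-VoIP", "non-VoIP"]
model = fit(train_x, train_y)
print(predict(model, (-0.2, 0.4)))  # prints VoIP
```

Working with sums of logarithms rather than the raw product avoids numerical underflow when the number of features grows.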
  • 12. How to derive features from the 4-state traffic pattern? Use a Markov chain to model the VoIP traffic pattern, and compute statistics of the traffic patterns. [Figure: 4-state traffic patterns (legend: A or B, D, M) for Web, P2P, WoW, TELNET, and VoIP.]
  • 13. Markov Chain: Build a Markov chain model based on a set of known VoIP traffic patterns, and derive a feature from it: the likelihood value. Transition probabilities of the 4-state Markov chain (row: current state; column: next state):

            A        B        D        M
       A    0.9022   0.0028   0.0380   0.0571
       B    0.0029   0.9030   0.0391   0.0550
       D    0.0607   0.0592   0.8763   0.0038
       M    0.0465   0.0439   0.0019   0.9078
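The slide does not say how the transition probabilities were obtained. One common approach, assumed here, is to count state-to-state transitions in the known VoIP sequences and normalize each row. The function name `estimate_transitions` and the sample sequences are hypothetical.

```python
from collections import Counter

STATES = "ABDM"


def estimate_transitions(sequences):
    """Estimate row-normalized transition probabilities by counting."""
    counts = Counter()
    for seq in sequences:
        # Count every consecutive pair (current state, next state).
        for cur, nxt in zip(seq, seq[1:]):
            counts[(cur, nxt)] += 1
    matrix = {}
    for cur in STATES:
        total = sum(counts[(cur, nxt)] for nxt in STATES)
        matrix[cur] = {
            nxt: (counts[(cur, nxt)] / total if total else 0.0)
            for nxt in STATES
        }
    return matrix


# Two short hypothetical state sequences from known VoIP flows.
matrix = estimate_transitions(["AAAMBBBDA", "MMAAB"])
print(matrix["A"])
```

In practice a state pair that never occurs in the training data gets probability zero, which would make the log-likelihood of any sequence containing it undefined; smoothing the counts is one way to handle that, though the slides do not discuss it.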
  • 14. Likelihood of Traffic Patterns: Given a traffic pattern with a state sequence $S_1, S_2, \ldots, S_n$, where $S_i \in \{A, B, D, M\}$, compute the log-likelihood value as $\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})$, where $P_{i,j}$ is the transition probability from $S_i$ to $S_j$. Traffic flows may vary in length, thus define the normalized log-likelihood value as $\frac{\log(P_{1,2} \times P_{2,3} \times \cdots \times P_{n-1,n})}{N}$, where $N$ is the length of the sequence.
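A minimal sketch of the normalized log-likelihood, using the transition probabilities listed on slide 13. The natural logarithm is an assumption (the slide only writes "log"), and the function name and the two test sequences are hypothetical.

```python
import math

# Transition probabilities from slide 13 (row: current state, column: next state).
P = {
    "A": {"A": 0.9022, "B": 0.0028, "D": 0.0380, "M": 0.0571},
    "B": {"A": 0.0029, "B": 0.9030, "D": 0.0391, "M": 0.0550},
    "D": {"A": 0.0607, "B": 0.0592, "D": 0.8763, "M": 0.0038},
    "M": {"A": 0.0465, "B": 0.0439, "D": 0.0019, "M": 0.9078},
}


def normalized_log_likelihood(seq, transitions):
    """Sum the log transition probabilities and divide by the sequence length."""
    if not seq:
        raise ValueError("empty state sequence")
    # The log of a product equals the sum of the logs.
    total = sum(math.log(transitions[cur][nxt]) for cur, nxt in zip(seq, seq[1:]))
    return total / len(seq)


conversation_like = "AAAAMMMMBBBB"  # long talk spurts separated by silence
erratic = "ABABABABABAB"            # state flips on every interval
print(normalized_log_likelihood(conversation_like, P))
print(normalized_log_likelihood(erratic, P))
```

The conversation-like sequence scores markedly higher than the erratic one, because the chain assigns high probability to staying in the same state and very low probability to direct A-to-B switches.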
  • 15. Likelihood of Traffic Patterns: The Markov chain represents typical human conversation. VoIP flows yield a large log-likelihood value. Non-VoIP flows yield a low log-likelihood value because they exhibit non-human-like behavior: non-interactive, independent, or unidirectional.
  • 16. Statistics of Traffic Patterns. Bidirectional behavior: mean and standard deviation of the period that party A (or B) is ON (talking) each time. Interactive behavior: mean and standard deviation of the sojourn time in states A, B, D, and M, respectively. Fragmented and disordered level of the traffic pattern: state alternation frequency.
  • 17. Statistics of Traffic Patterns: The state alternation frequency is the frequency of alternation between different states, e.g., (6 alternations between different states) / (20 sec.) = 0.3 alternations per second.
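The alternation frequency can be computed directly from a state sequence. The sketch below assumes one state per fixed-length interval; the function name, the 1-second interval, and the sample sequence are hypothetical, chosen so that the sequence reproduces the slide's example of 6 alternations in 20 seconds.

```python
def alternation_frequency(states, interval_sec):
    """Number of changes between different states per second of observation."""
    alternations = sum(1 for cur, nxt in zip(states, states[1:]) if cur != nxt)
    return alternations / (len(states) * interval_sec)


# 20 one-second intervals containing 6 state changes.
pattern = "AAAMMMBBBBDDAAAMMMBB"
print(alternation_frequency(pattern, interval_sec=1.0))  # prints 0.3
```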
  • 18. Feature Summary. Feature set: normalized log-likelihood value based on the Markov chain; speech period of party A or B (mean, standard deviation); sojourn time in each state* (mean, standard deviation); ratio of sojourn time in each state*; alternation rate between states*. (*States A, B, D, M.)
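The sojourn-time features in this list can be sketched as follows. A sojourn is taken here to be a maximal run of the same state, and the population standard deviation is used; both are assumptions, as are the function name `sojourn_stats` and the sample sequence.

```python
from itertools import groupby
from statistics import mean, pstdev


def sojourn_stats(states, interval_sec):
    """Per-state mean, standard deviation, and time ratio of sojourn periods."""
    runs = {s: [] for s in "ABDM"}
    # groupby yields one group per maximal run of identical states.
    for state, group in groupby(states):
        runs[state].append(len(list(group)) * interval_sec)
    total = len(states) * interval_sec
    stats = {}
    for s, durations in runs.items():
        stats[s] = {
            "mean": mean(durations) if durations else 0.0,
            "std": pstdev(durations) if durations else 0.0,
            "ratio": sum(durations) / total if total else 0.0,
        }
    return stats


stats = sojourn_stats("AAAMMMBBBBDDAAAMMMBB", interval_sec=1.0)
print(stats["B"])  # prints {'mean': 3.0, 'std': 1.0, 'ratio': 0.3}
```

Together with the normalized log-likelihood and the alternation rate, these values would form the flow vector handed to the classifier.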
  • 19. Methodology. Training phase: labeled training flows (VoIP or non-VoIP) → extract 4-state traffic patterns and derive features → flow vectors → learn classifier parameters → naïve Bayesian classifier. Identification phase: incoming flows (unknown class) → extract conversation patterns and derive features → flow vectors → classify with the naïve Bayesian classifier → flow labels (VoIP or non-VoIP).
  • 20. Trace Collection: We collected network traffic from 5 categories of applications: VoIP (Skype), TELNET, Web, P2P (BitTorrent), and online game (World of Warcraft).

       Category      # Connections   Duration (min)   # Packets     Size (MB)
       VoIP          462             2,388            4,728,240     4,318
       TELNET        2,008           4,729            10,559,261    7,331
       Web           1,406           1,537            2,528,359     680
       P2P           15,845          3,334            29,220,870    30,500
       Online game   2,224           120              28,264,360    59,097
  • 21. Performance Evaluation: VoIP flows should be detected as early as possible, so detection time is a major concern. The scheme reaches 95% accuracy with a 4-second detection time and 97% accuracy with an 11-second detection time.
  • 22. Performance Evaluation: The goal is to detect VoIP flows, so VoIP flows are the positives and non-VoIP flows are the negatives. True positive rate: $\mathrm{TPR} = \frac{\text{number of VoIP flows correctly identified}}{\text{total number of VoIP flows}}$. True negative rate: $\mathrm{TNR} = \frac{\text{number of non-VoIP flows correctly identified}}{\text{total number of non-VoIP flows}}$.
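Both rates follow directly from the true and predicted labels: the true positive rate is the share of VoIP flows labeled VoIP, and the true negative rate is the share of non-VoIP flows labeled non-VoIP. In this sketch the function name `tpr_tnr`, the label strings, and the sample labels are hypothetical.

```python
def tpr_tnr(true_labels, predicted_labels, positive="VoIP"):
    """Return (true positive rate, true negative rate) for a binary labeling."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # VoIP flow correctly identified
            else:
                fn += 1  # VoIP flow missed
        else:
            if pred == positive:
                fp += 1  # non-VoIP flow mistaken for VoIP
            else:
                tn += 1  # non-VoIP flow correctly identified
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr


truth = ["VoIP"] * 4 + ["non-VoIP"] * 5
pred = ["VoIP", "VoIP", "VoIP", "non-VoIP",
        "non-VoIP", "non-VoIP", "non-VoIP", "non-VoIP", "VoIP"]
print(tpr_tnr(truth, pred))  # prints (0.75, 0.8)
```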
  • 23. Performance Evaluation: The scheme achieves 97% TPR with a detection time longer than 3 sec. Flows of World of Warcraft tend to be misidentified; the scheme achieves 90% TNR with a detection time longer than 10 sec.
  • 24. ROC Curves: ROC (Receiver Operating Characteristic). [Figure: ROC curves.]
  • 25. Summary: We propose a VoIP flow identification scheme based on human conversation patterns. Our scheme yields an identification accuracy of 95% within 4 sec. of detection time and 97% within 11 sec., that is, high accuracy in a short detection time.