Real-time Non-Intrusive Speech
Quality Estimation of VoIP Using
Genetic Programming
Muhammad Adil Raja
University of Limerick
Outline
• Introduction
• Voice over Internet Protocol (VoIP)
• Approaches to Speech Quality estimation
• Genetic Programming
• Real-time, Non-intrusive Evaluation of VoIP
Outline ...
• A Methodology for Deriving VoIP Equipment
Impairment Factors for a Mixed NB/WB Context
• A Signal-based Model
• Conclusion
Introduction
• VoIP -- a paradigm shift
• Bandwidth redundancy exploitation
• QoS remains dominated by network/transport layer
metrics
• Quality Assessment - a reflection upon the operating
conditions of the network
Research Goals
• Derivation of non-intrusive parametric models for
speech quality estimation
• Derivation of a signal-based non-intrusive model
• Genetic programming based symbolic regression was
used
VoIP
• Packet based communication channel
• Uses wire-line speech codecs
• Linear predictive coding (LPC) is popular
• Coded frames are packetized into RTP/UDP
• Internet is used for transportation
• The receiver does the reverse process
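The packetization step above can be made concrete with a little arithmetic. The sketch below is an illustration and is not taken from the slides: the function names are hypothetical, and the header sizes are the minimum RTP, UDP and IPv4 header lengths (no options or extensions).

```python
# Minimum header sizes in bytes (no RTP extensions/CSRC entries, no IP options).
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20


def payload_bytes(bit_rate_kbps: float, packet_interval_ms: float) -> float:
    """Coded speech bytes carried by one packet."""
    # kbit/s multiplied by ms gives bits; divide by 8 for bytes.
    return bit_rate_kbps * packet_interval_ms / 8.0


def packet_bytes(bit_rate_kbps: float, packet_interval_ms: float) -> float:
    """Total IP packet size: speech payload plus RTP/UDP/IPv4 headers."""
    headers = RTP_HEADER_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES
    return payload_bytes(bit_rate_kbps, packet_interval_ms) + headers


def frames_per_packet(packet_interval_ms: float, frame_duration_ms: float) -> int:
    """Number of whole codec frames that fit in one packetization interval."""
    return int(packet_interval_ms // frame_duration_ms)
```

For an 8 kbps codec with 10 ms frames and a 20 ms packetization interval, this gives two frames and 20 payload bytes per packet, so the headers are twice the size of the speech they carry.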
Speech Quality
• Two Approaches to Speech Quality Assessment
★ Subjective Assessment
★ Objective Assessment
Subjective Assessment
Speech Quality
• Speech Quality is Estimated By Humans
• Advantage - Reliable Results
• Limitations
1. Expensive
2. Time Consuming
3. Laborious
4. Lack of Repeatability
• Mean Opinion Score (MOS) is the Measure of Quality
• 1 - Bad
• 5 - Excellent
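As a minimal sketch of how a MOS is obtained, the snippet below averages listener ratings on the 1 to 5 scale described above. The function name and the example ratings are hypothetical and not from the slides.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average of listener ratings, each on the 1 (Bad) to 5 (Excellent) scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        if not 1 <= rating <= 5:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return mean(ratings)


# Hypothetical ratings from four listeners for one speech sample.
example_mos = mean_opinion_score([4, 3, 5, 4])
```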
Objective Assessment of Speech Quality
Speech Quality
• A fast, reliable automated program is used to
estimate human perception of speech quality
• Two approaches:
1. Intrusive Assessment
2. Non-Intrusive Assessment
Intrusive Assessment
Objective Assessment of Speech Quality
• The signal under test is compared against a reference
signal
• Advantages
1. The most reliable artificial means of assessing the
speech quality
2. Tests can be repeated easily
• Limitations:
1. Consumes considerable computing resources
2. Not useful for continuous monitoring of quality due
to requirement of a reference signal
ITU-T P.862 (PESQ)
Objective Assessment of Speech Quality
• PESQ algorithm is the current ITU-T recommendation
for intrusive speech quality estimation
• The speech signal is mapped from the time domain to a
time-frequency representation using the psycho-physical
equivalents of time and frequency
ITU-T P.862 (PESQ)
Objective Assessment of Speech Quality
• It has shown a high correlation with ITU-T benchmark
tests
• For 30 ITU-T subjective tests, Pearson's correlation
coefficient (R) was 0.935
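Pearson's correlation coefficient between objective scores and subjective MOS values can be computed as in the sketch below. This is a plain-Python illustration with a hypothetical function name; it is not the evaluation code behind the figure quoted above.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

A value near 1 means the objective scores rise and fall with the subjective scores; it says nothing about a constant offset between the two.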
Non-Intrusive Assessment
Objective Assessment of Speech Quality
• A challenging problem since a reference signal is not
available
• Two approaches exist
1. Parametric models
2. Signal-based models
Parametric Models
Objective Assessment of Speech Quality
• Function of transport layer metrics and other measurable
quantities
• Relevant metrics may include:
★ Packet loss rate
★ End-to-end delay
★ Delay variation - jitter
★ Codec characteristics
★ ...
• Aimed at real-time and continuous evaluation of speech
quality
Signal-based Models
Objective Assessment of Speech Quality
• Recent approaches are based on emulating
1. Human speech production mechanism
2. Psycho-acoustic processing of human hearing
• ITU-T P.563 is the current recommendation
Introduction
Genetic Programming (GP)
• GP is a machine learning technique inspired by
biological evolution.
• Aimed at evolving program expressions/computer
code
• Each individual encodes a symbolic expression
• Solution representation
★ Tree structure - popular
★ Graphs, linear structures (arrays) etc.
• Primary application area is modeling
Applications
Genetic Programming (GP)
1. Circuit design
2. Controllers
3. Antennas
4. Artificial chemistry
5. Computer hardware design
6. Network coding
7. Digital filter design
8. Computer aided diagnosis
9. Signal processing applications
10. ...
A Simplified GP Breeding Cycle
Genetic Programming (GP)
1. Generate an initial population of random compositions
of the functions and terminals of the problem
(computer programs)
★ Functions: +, -, *, /, sin, cos, log, power etc.
★ Terminals: Can be variables (network traffic
parameters) and constants
2. Execute each program in the population and assign it a
fitness value
3. Copy the best existing programs (selection).
4. Create new computer programs by mutation and
crossover
✴ Repeat steps 2-4 till the desired solution is found
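The breeding cycle above can be sketched as a small tree-based symbolic regression loop. This is an illustrative toy and not the GP system used in this work: the population size, operator rates, depth limit, protected division and tournament selection are all assumptions chosen for brevity.

```python
import math
import operator
import random


def protected_div(a, b):
    # Assumption: division by (near) zero returns 1.0 instead of failing.
    return a / b if abs(b) > 1e-9 else 1.0


# Function set: name -> (arity, implementation).
FUNCTIONS = {"+": (2, operator.add), "-": (2, operator.sub),
             "*": (2, operator.mul), "/": (2, protected_div),
             "sin": (1, math.sin), "cos": (1, math.cos)}
VARIABLES = ["x"]  # terminals are variables plus random constants


def random_tree(rng, max_depth):
    """Step 1: a random composition of functions and terminals."""
    if max_depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(VARIABLES)
        return round(rng.uniform(-2, 2), 2)
    name = rng.choice(list(FUNCTIONS))
    arity = FUNCTIONS[name][0]
    return (name,) + tuple(random_tree(rng, max_depth - 1) for _ in range(arity))


def evaluate(tree, env):
    if isinstance(tree, tuple):
        fn = FUNCTIONS[tree[0]][1]
        return fn(*(evaluate(child, env) for child in tree[1:]))
    return env[tree] if isinstance(tree, str) else tree


def fitness(tree, cases):
    """Step 2: mean squared error over the fitness cases (lower is better)."""
    try:
        err = sum((evaluate(tree, {"x": x}) - y) ** 2 for x, y in cases) / len(cases)
    except (OverflowError, ValueError):
        return float("inf")
    return err if math.isfinite(err) else float("inf")


def depth(tree):
    return 1 + max(map(depth, tree[1:])) if isinstance(tree, tuple) else 0


def paths(tree, prefix=()):
    """Positions of all nodes, each as a tuple of child indices."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(rng, a, b):
    """Swap a random subtree of b into a random position of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))


def mutate(rng, tree):
    """Replace a random subtree with a freshly generated one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def tournament(rng, scored, k=3):
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]


def evolve(cases, pop_size=200, generations=30, seed=0, max_depth=8):
    rng = random.Random(seed)
    population = [random_tree(rng, 4) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(((fitness(t, cases), t) for t in population),
                        key=lambda s: s[0])
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        if best[0] < 1e-6:  # desired solution found
            break
        next_gen = [t for _, t in scored[:2]]  # step 3: copy the best programs
        while len(next_gen) < pop_size:  # step 4: crossover and mutation
            parent = tournament(rng, scored)
            if rng.random() < 0.8:
                child = crossover(rng, parent, tournament(rng, scored))
            else:
                child = mutate(rng, parent)
            next_gen.append(child if depth(child) <= max_depth else parent)
        population = next_gen
    return best  # (error, tree)
```

Calling `evolve` with a list of `(x, y)` pairs returns the lowest-error tree seen and its error. In the setting of these slides the terminals would be network traffic parameters rather than a single `x`, and the target would be a quality score.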
A Simplified GP Breeding Cycle: A Symbolic Representation
Genetic Programming (GP)
Real-time, Non-intrusive Evaluation of VoIP
• Derivation of non-intrusive parametric models for
speech quality estimation
No  Parameter name               Abbreviation
1   bit-rate (kbps)              br
2   mean loss rate               mlr
3   mean burst length            mbl
4   packetization interval (ms)  PI
5   frame duration (ms)          fd
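Two of the parameters above, mlr and mbl, can be measured directly from a record of which packets arrived. The sketch below assumes the common definitions (fraction of packets lost, and average length of a run of consecutive losses); the slides do not spell out the exact definitions used, and the function name is hypothetical.

```python
def loss_statistics(lost):
    """Return (mean loss rate, mean burst length) for a packet-loss trace.

    `lost` is a sequence of booleans, True where a packet was lost.
    """
    if len(lost) == 0:
        raise ValueError("empty trace")
    bursts = []  # lengths of runs of consecutive lost packets
    run = 0
    for flag in lost:
        if flag:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:  # trace ended in the middle of a burst
        bursts.append(run)
    mlr = sum(bursts) / len(lost)
    mbl = sum(bursts) / len(bursts) if bursts else 0.0
    return mlr, mbl
```

Here the loss rate is returned as a fraction; multiply by 100 for a percentage.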
Simulation Environment
Real-time, Non-intrusive Evaluation of VoIP
The Models
$$\text{MOS-LQO}_{GP} = -2.46 \times \log\bigl(\cos(\log(\mathit{br})) + \mathit{mlr}_{VAD} \times (\mathit{br} + \mathit{fd}/10)\bigr) + 3.17$$
Real-time, Non-intrusive Evaluation of VoIP
$$\text{MOS-LQO}_{GP} = -2.99 \times \cos\bigl(0.91 \times \sin(\mathit{mlr}_{VAD}) + \mathit{mlr}_{VAD} + 8\bigr) + 4.20$$
Data         Equation 1          Equation 2
             MSE      σ          MSE      σ
Training     0.037    0.9634     0.052    0.9481
Testing      0.0387   0.9646     0.0541   0.9501
Validation   0.0382   0.9688     0.0541   0.9531
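The two evolved expressions above are cheap enough to evaluate per call in real time. The sketch below transcribes them directly. Two things are assumptions because the slides do not state them: the base of `log` (base 10 is used as the default and can be swapped) and the units of mlrVAD (fraction or percent). The function names are hypothetical.

```python
import math


def mos_lqo_gp_eq1(br, mlr_vad, fd, log=math.log10):
    """First evolved model. `log` defaults to base 10, which is an assumption."""
    inner = math.cos(log(br)) + mlr_vad * (br + fd / 10.0)
    # Raises ValueError if `inner` is not positive; the slides do not say
    # whether a protected logarithm was used.
    return -2.46 * log(inner) + 3.17


def mos_lqo_gp_eq2(mlr_vad):
    """Second evolved model, a function of the loss rate only."""
    return -2.99 * math.cos(0.91 * math.sin(mlr_vad) + mlr_vad + 8.0) + 4.20
```

Passing `log=math.log` evaluates the first model with natural logarithms instead, which is worth trying against known scores before relying on either reading.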
Scatter Plots
Real-time, Non-intrusive Evaluation of VoIP
Comparison With ITU-T P.563
Real-time, Non-intrusive Evaluation of VoIP
A Methodology for Deriving VoIP Equipment
Impairment Factors for a Mixed NB/WB Context
• VoIP is rapidly evolving towards wideband (WB)
transmission
• WB offers more natural sounding speech
• However, NB and WB codecs are expected to coexist
for some time
• The research focused on evolving impairment factors for
ITU-T G.107, the E-model, for a mixed NB/WB context
ITU-T G.107 The E-Model
Equipment Impairment Factors for a Mixed Context
• A parametric model
• Based on an impairment factor principle
• Effect of impairments on quality is additive
• R scale: $R = R_0 - I_s - I_d - I_{e,eff} + A$
• Initially designed for NB Telephony
• Extension to WB (or NB/WB)
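The additive principle above can be sketched in a few lines. The R-to-MOS conversion shown is the narrowband mapping given in ITU-T G.107; it is not on the slides and is included only to show how a rating becomes a quality score. The function names and the example input values are illustrative assumptions.

```python
def r_factor(r0, i_s, i_d, ie_eff, a=0.0):
    """Transmission rating R: impairments subtract, the advantage factor A adds."""
    return r0 - i_s - i_d - ie_eff + a


def r_to_mos(r):
    """Narrowband R-to-MOS mapping from ITU-T G.107."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Illustrative values only: a base rating of 93.2 with an effective
# equipment impairment of 10 and no other impairments.
example_mos = r_to_mos(r_factor(93.2, 0.0, 0.0, 10.0))
```

Because the terms are additive, a model that predicts only the effective equipment impairment can be dropped into the rating without touching the other terms.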
Equipment Impairment Factors for a Mixed Context
• Effective equipment impairments:
• Codec related
• loss rate (mean)
• Burstiness
• Payload size (PI)
[Figure: Ie,WB,eff versus mean loss rate (mlr, %) for G.729 (8), G.723.1 (6.3), AMR-NB (7.4), G.722.2 (19.85) and G.722.1 (32)]
The Models
Equipment Impairment Factors for a Mixed Context
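$$I_{e,WB,eff} = \bigl\{11 - \mathit{mbl} + \ln(\mathit{grad}) + \mathit{grad} \times \mathit{mlr} + I_{e,WB} - 2\log_2(\mathit{PI})\bigr\} \times 0.8619 + 9 \tag{1}$$

$$I_{e,WB,eff} = \left\{\ln\!\left(\frac{9 \times (I_{e,WB} + \mathit{mlr} \times \mathit{grad}^2)}{\mathit{mbl}^5 - \mathit{mlr}}\right) + \mathit{mlr} + I_{e,WB} + \mathit{grad} \times \mathit{mlr}\right\} \times 0.8303 + 8.9977 \tag{2}$$

$$I_{e,WB,eff} = \log_{10}\bigl(\log_{10}\bigl(\log_2(I_{e,WB} - 2 \times \mathit{mbl}) + \mathit{mlr}\bigr)\bigr) \times 321.7017 + 95.3708 \tag{3}$$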
Test Results
Equipment Impairment Factors for a Mixed Context
[Figure: two scatter plots of Ie,WB,eff-GP and Ie,WB,eff-WB-PESQ; both axes run from 0 to 140]
Model   Training            Testing
        RMSE     σ          RMSE     σ
1       8.3941   0.9236     8.5057   0.9240
2       8.3552   0.9243     8.4605   0.9248
3       9.1745   0.908      9.3145   0.9080
The Proposed Model
A Signal-Based Model
• ITU-T P.563 has been chosen for feature extraction
• Reasons:
i. P.563 is the current state-of-the-art algorithm for
speech quality estimation
ii. It computes the most numerous and varied features
relevant to speech quality
• A new mapping is derived by employing symbolic
regression
GP Experiments
A Signal-Based Model
• Three GP experiments were performed with various
configurations
• Leaf coefficients of GP trees were tuned with a GA -
hybrid optimization
Distortion Conditions
A Signal-Based Model
• Signal correlated noise
• Frame erasures
• Bit errors
• Transcoding
• Front-end clipping
• Low bitrate coding
• Speech level variation
Results - Comparison With ITU-T P.563
A Signal-Based Model
            ITU-T P.563   GP based model   Percentage enhancement
Training    0.3937        0.3415           9.89
Testing     0.3674        0.3071           16.41
Conclusions
• Research goal: derivation of real-time non-intrusive
models for speech quality estimation
• GP has been employed to achieve this
• Disparate parametric and signal-based models have
been derived
• Models outperform ITU-T’s standard recommendations
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA AND CONDITION
DEVELOPMENT OF SPEAKER VERIFICATION  UNDER LIMITED DATA AND CONDITIONDEVELOPMENT OF SPEAKER VERIFICATION  UNDER LIMITED DATA AND CONDITION
DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA AND CONDITION
niranjan kumar
 
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
sipij
 
Speech Compression using LPC
Speech Compression using LPCSpeech Compression using LPC
Speech Compression using LPC
Disha Modi
 
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
gt_ebuddy
 
Design and implementation of different audio restoration techniques for audio...
Design and implementation of different audio restoration techniques for audio...Design and implementation of different audio restoration techniques for audio...
Design and implementation of different audio restoration techniques for audio...
eSAT Journals
 

Was ist angesagt? (20)

Design of Low Pass Digital FIR Filter Using Cuckoo Search Algorithm
Design of Low Pass Digital FIR Filter Using Cuckoo Search AlgorithmDesign of Low Pass Digital FIR Filter Using Cuckoo Search Algorithm
Design of Low Pass Digital FIR Filter Using Cuckoo Search Algorithm
 
I1035563
I1035563I1035563
I1035563
 
Unit v transfer learning
Unit v transfer  learningUnit v transfer  learning
Unit v transfer learning
 
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
캡슐 네트워크를 이용한 엔드투엔드 음성 단어 인식, 배재성(KAIST 석사과정)
 
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
CANONIC SIGNED DIGIT BASED DESIGN OF MULTIPLIER-LESS FIR FILTER USING SELFORG...
 
DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA AND CONDITION
DEVELOPMENT OF SPEAKER VERIFICATION  UNDER LIMITED DATA AND CONDITIONDEVELOPMENT OF SPEAKER VERIFICATION  UNDER LIMITED DATA AND CONDITION
DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA AND CONDITION
 
Comparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniquesComparative performance analysis of channel normalization techniques
Comparative performance analysis of channel normalization techniques
 
Comparative study of compression techniques for synthetic videos
Comparative study of compression techniques for synthetic videosComparative study of compression techniques for synthetic videos
Comparative study of compression techniques for synthetic videos
 
A literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environmentA literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environment
 
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...
 
Ph.D. thesis defense
Ph.D. thesis defensePh.D. thesis defense
Ph.D. thesis defense
 
Text independent speaker recognition system
Text independent speaker recognition systemText independent speaker recognition system
Text independent speaker recognition system
 
Speech based password authentication system on FPGA
Speech based password authentication system on FPGASpeech based password authentication system on FPGA
Speech based password authentication system on FPGA
 
Speech Compression using LPC
Speech Compression using LPCSpeech Compression using LPC
Speech Compression using LPC
 
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recogn...
 
Design and implementation of different audio restoration techniques for audio...
Design and implementation of different audio restoration techniques for audio...Design and implementation of different audio restoration techniques for audio...
Design and implementation of different audio restoration techniques for audio...
 
A survey report for performance analysis of finite
A survey report for performance analysis of finiteA survey report for performance analysis of finite
A survey report for performance analysis of finite
 
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
 
Sparse Approximation of Gram Matrices for GMMN-based Speech Synthesis
Sparse Approximation of Gram Matrices for GMMN-based Speech SynthesisSparse Approximation of Gram Matrices for GMMN-based Speech Synthesis
Sparse Approximation of Gram Matrices for GMMN-based Speech Synthesis
 
Learning loss for active learning
Learning loss for active learningLearning loss for active learning
Learning loss for active learning
 

Ähnlich wie Real-time Non-Intrusive Speech Quality Estimation of VoIP Using Genetic Programming

Real-time, non-Intrusive Evaluation of VoIP
Real-time, non-Intrusive Evaluation of VoIPReal-time, non-Intrusive Evaluation of VoIP
Real-time, non-Intrusive Evaluation of VoIP
adil raja
 
Thesis arpan pal_gisfi
Thesis arpan pal_gisfiThesis arpan pal_gisfi
Thesis arpan pal_gisfi
Arpan Pal
 
Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...
Zbigniew Jerzak
 
Lecture 2- Practical AD and DA Conveters (Online Learning).pptx
Lecture 2- Practical AD and DA Conveters (Online Learning).pptxLecture 2- Practical AD and DA Conveters (Online Learning).pptx
Lecture 2- Practical AD and DA Conveters (Online Learning).pptx
HamzaJaved306957
 
Summer Research Project. Final Presentation 2013
Summer Research Project. Final Presentation 2013Summer Research Project. Final Presentation 2013
Summer Research Project. Final Presentation 2013
Ojaswa Anand
 
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
tumep
 

Ähnlich wie Real-time Non-Intrusive Speech Quality Estimation of VoIP Using Genetic Programming (20)

Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic ProgrammingRealtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
 
Real-time, non-Intrusive Evaluation of VoIP
Real-time, non-Intrusive Evaluation of VoIPReal-time, non-Intrusive Evaluation of VoIP
Real-time, non-Intrusive Evaluation of VoIP
 
An Evolutionary Approach to Speech Quality Estimation
An Evolutionary Approach to Speech Quality EstimationAn Evolutionary Approach to Speech Quality Estimation
An Evolutionary Approach to Speech Quality Estimation
 
An Evolutionary Approach to Speech Quality Estimation Using Genetic Programming
An Evolutionary Approach to Speech Quality Estimation Using Genetic ProgrammingAn Evolutionary Approach to Speech Quality Estimation Using Genetic Programming
An Evolutionary Approach to Speech Quality Estimation Using Genetic Programming
 
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic ProgrammingRealtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
 
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
 
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
Modeling the Effect of Packet Loss on Speech Quality: Genetic Programming Bas...
 
Thesis arpan pal_gisfi
Thesis arpan pal_gisfiThesis arpan pal_gisfi
Thesis arpan pal_gisfi
 
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic ProgrammingRealtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Realtime, Non-Intrusive Evaluation of VoIP Using Genetic Programming
 
Real-Time, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Real-Time, Non-Intrusive Evaluation of VoIP Using Genetic ProgrammingReal-Time, Non-Intrusive Evaluation of VoIP Using Genetic Programming
Real-Time, Non-Intrusive Evaluation of VoIP Using Genetic Programming
 
Network and Multimedia QoE Management
Network and Multimedia QoE ManagementNetwork and Multimedia QoE Management
Network and Multimedia QoE Management
 
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
An Empirical Evaluation of VoIP Playout Buffer Dimensioning in Skype, Google ...
 
Can we induce change with what we measure?
Can we induce change with what we measure?Can we induce change with what we measure?
Can we induce change with what we measure?
 
Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...Optimization of Continuous Queries in Federated Database and Stream Processin...
Optimization of Continuous Queries in Federated Database and Stream Processin...
 
OOSE Unit 2 PPT.ppt
OOSE Unit 2 PPT.pptOOSE Unit 2 PPT.ppt
OOSE Unit 2 PPT.ppt
 
Lecture 2- Practical AD and DA Conveters (Online Learning).pptx
Lecture 2- Practical AD and DA Conveters (Online Learning).pptxLecture 2- Practical AD and DA Conveters (Online Learning).pptx
Lecture 2- Practical AD and DA Conveters (Online Learning).pptx
 
Summer Research Project. Final Presentation 2013
Summer Research Project. Final Presentation 2013Summer Research Project. Final Presentation 2013
Summer Research Project. Final Presentation 2013
 
Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015Optimization of Incremental Queries CloudMDE2015
Optimization of Incremental Queries CloudMDE2015
 
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
 
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
Adaptive Optimization Schemes for Mobile VoIP Applications - Battery Life and...
 

Mehr von adil raja

Mehr von adil raja (20)

ANNs.pdf
ANNs.pdfANNs.pdf
ANNs.pdf
 
A Software Requirements Specification
A Software Requirements SpecificationA Software Requirements Specification
A Software Requirements Specification
 
NUAV - A Testbed for Development of Autonomous Unmanned Aerial Vehicles
NUAV - A Testbed for Development of Autonomous Unmanned Aerial VehiclesNUAV - A Testbed for Development of Autonomous Unmanned Aerial Vehicles
NUAV - A Testbed for Development of Autonomous Unmanned Aerial Vehicles
 
DevOps Demystified
DevOps DemystifiedDevOps Demystified
DevOps Demystified
 
On Research (And Development)
On Research (And Development)On Research (And Development)
On Research (And Development)
 
Simulators as Drivers of Cutting Edge Research
Simulators as Drivers of Cutting Edge ResearchSimulators as Drivers of Cutting Edge Research
Simulators as Drivers of Cutting Edge Research
 
The Knock Knock Protocol
The Knock Knock ProtocolThe Knock Knock Protocol
The Knock Knock Protocol
 
File Transfer Through Sockets
File Transfer Through SocketsFile Transfer Through Sockets
File Transfer Through Sockets
 
Remote Command Execution
Remote Command ExecutionRemote Command Execution
Remote Command Execution
 
Thesis
ThesisThesis
Thesis
 
CMM Level 3 Assessment of Xavor Pakistan
CMM Level 3 Assessment of Xavor PakistanCMM Level 3 Assessment of Xavor Pakistan
CMM Level 3 Assessment of Xavor Pakistan
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
Implementation of a Non-Intrusive Speech Quality Assessment Tool on a Mid-Net...
 
Real-Time Non-Intrusive Speech Quality Estimation for VoIP
Real-Time Non-Intrusive Speech Quality Estimation for VoIPReal-Time Non-Intrusive Speech Quality Estimation for VoIP
Real-Time Non-Intrusive Speech Quality Estimation for VoIP
 
VoIP
VoIPVoIP
VoIP
 
ULMAN GUI Specifications
ULMAN GUI SpecificationsULMAN GUI Specifications
ULMAN GUI Specifications
 
ULMAN-GUI
ULMAN-GUIULMAN-GUI
ULMAN-GUI
 
Modeling the Effect of packet Loss on Speech Quality: GP Based Symbolic Regre...
Modeling the Effect of packet Loss on Speech Quality: GP Based Symbolic Regre...Modeling the Effect of packet Loss on Speech Quality: GP Based Symbolic Regre...
Modeling the Effect of packet Loss on Speech Quality: GP Based Symbolic Regre...
 
Modelling the Effect of Packet Loss on Speech Quality
Modelling the Effect of Packet Loss on Speech QualityModelling the Effect of Packet Loss on Speech Quality
Modelling the Effect of Packet Loss on Speech Quality
 
A Random Presentation
A Random PresentationA Random Presentation
A Random Presentation
 

Kürzlich hochgeladen

DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
MayuraD1
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
AldoGarca30
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Kürzlich hochgeladen (20)

Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
NO1 Top No1 Amil Baba In Azad Kashmir, Kashmir Black Magic Specialist Expert ...
 
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLEGEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
GEAR TRAIN- BASIC CONCEPTS AND WORKING PRINCIPLE
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Block diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.pptBlock diagram reduction techniques in control systems.ppt
Block diagram reduction techniques in control systems.ppt
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Wadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptxWadi Rum luxhotel lodge Analysis case study.pptx
Wadi Rum luxhotel lodge Analysis case study.pptx
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Computer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to ComputersComputer Lecture 01.pptxIntroduction to Computers
Computer Lecture 01.pptxIntroduction to Computers
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 

Real-time Non-Intrusive Speech Quality Estimation of VoIP Using Genetic Programming

  • 1. Real-time Non-Intrusive Speech Quality Estimation of VoIP Using Genetic Programming Muhammad Adil Raja University of Limerick
  • 2. Outline • Introduction • Voice over Internet Protocol (VoIP) • Approaches to Speech Quality estimation • Genetic Programming • Real-time, Non-intrusive Evaluation of VoIP
  • 3. Outline ... • A Methodology for Deriving VoIP Equipment Impairment Factors for a Mixed NB/WB Context • A Signal-based Model • Conclusion
  • 4. Introduction • VoIP -- a paradigm shift • Bandwidth redundancy exploitation • QoS remains dominated by network/transport layer metrics • Quality Assessment - a reflection upon the operating conditions of the network
  • 5. Research Goals • Derivation of non-intrusive parametric models for speech quality estimation • Derivation of a signal-based non-intrusive model • Genetic programming based symbolic regression was used
  • 6. VoIP • Packet based communication channel • Uses wire-line speech codecs • Linear predictive coding (LPC) is popular • Coded frames are packetized into RTP/UDP • Internet is used for transportation • The receiver does the reverse process
  • 7. Speech Quality • Two Approaches to Speech Quality Assessment ★ Subjective Assessment ★ Objective Assessment
  • 8. Subjective Assessment Speech Quality • Speech Quality is Estimated By Humans • Advantage - Reliable Results • Limitations 1. Expensive 2. Time Consuming 3. Laborious 4. Lack of Repeatability • Mean Opinion Score (MOS) is the Measure of Quality • 1 - Bad • 5 - Excellent
  • 9. Objective Assessment of Speech Quality Speech Quality • A computer automated fast and reliable program is used to assay human perception of speech quality • Two approaches: 1. Intrusive Assessment 2. Non-Intrusive Assessment
  • 10. Intrusive Assessment Objective Assessment of Speech Quality • The signal under test is compared against a reference signal • Advantages 1. The most reliable artificial means of assessing the speech quality 2. Tests can be repeated easily • Limitations: 1. Consumes considerable computing resources 2. Not useful for continuous monitoring of quality due to requirement of a reference signal
  • 11. ITU-T P.862 (PESQ) Objective Assessment of Speech Quality • PESQ algorithm is the current ITU-T recommendation for intrusive speech quality estimation • The speech signal is mapped from time domain to time- frequency representation using the psycho-physical equivalents of time and frequency
  • 12. ITU-T P.862 (PESQ) Objective Assessment of Speech Quality • It has shown a high correlation with ITU-T benchmark tests • For 30 ITU-T subjective tests the Pearson’s correlation coefficient (R) was 0.935
  • 13. Non-Intrusive Assessment Objective Assessment of Speech Quality • A challenging problem since a reference signal is not available • Two approaches exist 1. Parametric models 2. Signal-based models
  • 14. Parametric Models Objective Assessment of Speech Quality • Function of transport layer metrics and other measurable quantities • Relevant metrics may be: ★ Packet loss rate ★ End-to-end delay ★ Delay variation - jitter ★ Codec characteristics ★ ... • Aimed at real-time and continuous evaluation of speech quality
  • 15. Signal-based Models Objective Assessment of Speech Quality • Recent approaches are based on emulating 1. Human speech production mechanism 2. Psycho-acoustic processing of human hearing • ITU-T P.563 is the current recommendation
  • 16. Introduction Genetic Programming (GP) • GP is a machine learning technique inspired by biological evolution. • Aimed at evolving program expressions/computer code • Each individual encodes a symbolic expression • Solution representation ★ Tree structure - popular ★ Graphs, linear structures (arrays) etc • Primary application area is modeling
  • 17. Applications Genetic Programming (GP) 1. Circuit design 2. Controllers 3. Antennas 4. Artificial chemistry 5. Computer hardware design 6. Network coding 7. Digital filter design 8. Computer aided diagnosis 9. Signal processing applications 10. ...
  • 18. A Simplified GP Breeding Cycle Genetic Programming (GP) 1. Generate an initial population of random compositions of the functions and terminals of the problem (computer programs) ★ Functions: +, -, *, /, sin, cos, log, power etc ★ Terminals: Can be variables (network traffic parameters) and constants 2. Execute each program in the population and assign it a fitness value 3. Copy the best existing programs (selection). 4. Create new computer programs by mutation and crossover ✴ Repeat steps 2-4 till the desired solution is found
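The breeding cycle above can be sketched as a small tree-based symbolic regression loop. This is an illustrative sketch only, not the configuration used in the talk: the single input variable `x`, the reduced function set, protected division, truncation selection, the size limit of 50 nodes, and all rates and population sizes are assumptions chosen to keep the example short.

```python
import math
import operator
import random


def pdiv(a, b):
    """Protected division (assumed): avoids division by (near) zero."""
    return a / b if abs(b) > 1e-9 else 1.0


FUNCTIONS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2),
             (pdiv, 2), (math.sin, 1), (math.cos, 1)]
TERMINALS = ["x", 1.0, 2.0]  # one variable and two constants (assumed)


def random_tree(depth):
    """Step 1: random composition of functions and terminals."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    func, arity = random.choice(FUNCTIONS)
    return (func,) + tuple(random_tree(depth - 1) for _ in range(arity))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        func, *args = tree
        return func(*(evaluate(arg, x) for arg in args))
    return x if tree == "x" else tree


def fitness(tree, data):
    """Step 2: mean squared error on the data (lower is better)."""
    try:
        err = sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)
    except (ValueError, OverflowError):
        return math.inf
    return err if math.isfinite(err) else math.inf


def paths(tree, prefix=()):
    """Yield the index path of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))


def size(tree):
    return sum(1 for _ in paths(tree))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def crossover(a, b):
    """Step 4: swap a random subtree of b into a random position of a."""
    pa = random.choice(list(paths(a)))
    pb = random.choice(list(paths(b)))
    return replace(a, pa, get(b, pb))


def mutate(tree):
    """Step 4: replace a random subtree with a fresh random one."""
    return replace(tree, random.choice(list(paths(tree))), random_tree(2))


def evolve(data, pop_size=200, generations=30, seed=0):
    random.seed(seed)
    population = [random_tree(3) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: fitness(t, data))
        survivors = ranked[: pop_size // 4]  # Step 3: keep the best programs
        children = []
        while len(survivors) + len(children) < pop_size:
            child = crossover(random.choice(survivors), random.choice(survivors))
            if random.random() < 0.2:
                child = mutate(child)
            if size(child) <= 50:  # crude guard against tree bloat
                children.append(child)
        population = survivors + children
    return min(population, key=lambda t: fitness(t, data))
```

A call such as `evolve([(x / 10, (x / 10) ** 2 + x / 10) for x in range(-10, 11)])` returns the best tree found for a toy target. In the setting of the talk the terminals would be network traffic parameters rather than a single `x`.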
  • 19. A Simplified GP Breeding Cycle: A Symbolic Representation Genetic Programming (GP)
  • 20. Real-time, Non-intrusive Evaluation of VoIP • Derivation of non-intrusive parametric models for speech quality estimation
      No  Parameter Name               Abbreviation
      1   bit-rate (kbps)              br
      2   mean loss rate               mlr
      3   mean burst length            mbl
      4   packetization interval (ms)  PI
      5   frame duration (ms)          fd
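Two of the listed parameters, mlr and mbl, can be measured directly from a packet arrival trace. The sketch below assumes common definitions that the slide does not spell out: mean loss rate as the fraction of packets lost, and mean burst length as the average length of runs of consecutive lost packets. The function name `loss_statistics` and the boolean trace format are hypothetical.

```python
from itertools import groupby


def loss_statistics(received):
    """Return (mean loss rate, mean burst length) for a packet trace.

    `received` is a sequence of booleans, True if the packet arrived.
    """
    total = len(received)
    if total == 0:
        raise ValueError("empty trace")
    # Length of every run of consecutive lost packets.
    bursts = [sum(1 for _ in run)
              for arrived, run in groupby(received) if not arrived]
    lost = sum(bursts)
    mlr = lost / total
    mbl = lost / len(bursts) if bursts else 0.0
    return mlr, mbl
```

For the trace `[True, False, False, True, True, False, True, True, True, True]`, three of ten packets are lost in two bursts of lengths 2 and 1, so the function returns a loss rate of 0.3 and a mean burst length of 1.5.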
  • 22. The Models Real-time, Non-intrusive Evaluation of VoIP
      Equation 1: $MOS\text{-}LQO_{GP} = -2.46 \times \log(\cos(\log(br)) + mlr_{VAD} \times (br + fd/10)) + 3.17$
      Equation 2: $MOS\text{-}LQO_{GP} = -2.99 \times \cos(0.91 \times \sin(mlr_{VAD}) + mlr_{VAD} + 8) + 4.20$
      Data        Equation 1          Equation 2
                  MSE      σ          MSE      σ
      Training    0.037    0.9634     0.052    0.9481
      Testing     0.0387   0.9646     0.0541   0.9501
      Validation  0.0382   0.9688     0.0541   0.9531
  • 24. Comparison With ITU-T P.563 Real-time, Non-intrusive Evaluation of VoIP
  • 25. A Methodology for Deriving VoIP Equipment Impairment Factors for a Mixed NB/WB Context • VoIP is rapidly evolving towards wideband (WB) transmission • WB offers more natural sounding speech • However, a period of coexistence between NB and WB codecs would prevail • Research focus was at evolving impairment factors for ITU-T G.107, the E-model, for a mixed NB/WB context
  • 26. ITU-T G.107 The E-Model Equipment Impairment Factors for a Mixed Context • A parametric model • Based on an impairment factor principle • Effect of impairments on quality is additive • R scale: R=R0 - Is - Id - Ie,eff + A • Initially designed for NB Telephony • Extension to WB (or NB/WB)
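The additive R-scale formula above can be sketched in a few lines. The conversion from R to an estimated MOS is the mapping given in ITU-T G.107 and is not shown on the slide; the function names and the input values in the usage note are illustrative assumptions.

```python
def r_factor(r0, i_s, i_d, ie_eff, a=0.0):
    """Transmission rating: R = R0 - Is - Id - Ie,eff + A."""
    return r0 - i_s - i_d - ie_eff + a


def r_to_mos(r):
    """Map an R value to an estimated MOS using the G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

For instance, `r_to_mos(r_factor(93.2, 0.0, 0.0, 10.0))` estimates the MOS for a hypothetical connection whose only impairment is an effective equipment impairment of 10. Because the impairments are additive, a larger Ie,eff lowers R directly and therefore lowers the estimated MOS.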
  • 27. Equipment Impairment Factors for a Mixed Context • Effective equipment impairments: • Codec related • loss rate (mean) • Burstiness • Payload size (PI)
      [Figure: I_e,WB,eff versus mean loss rate (mlr) % for G.729 (8), G.723.1 (6.3), AMR-NB (7.4), G.722.2 (19.85) and G.722.1 (32)]
  • 28. The Models Equipment Impairment Factors for a Mixed Context
      (1) $I_{e,WB,eff} = \{11 - mbl + \ln(grad) + grad \times mlr + I_{e,WB} - 2 \log_2(PI)\} \times 0.8619 + 9$
      (2) $I_{e,WB,eff} = \left\{\ln\left(\frac{9 \times (I_{e,WB} + mlr \times grad^2)}{mbl^5 - mlr}\right) + mlr + I_{e,WB} + grad \times mlr\right\} \times 0.8303 + 8.9977$
      (3) $I_{e,WB,eff} = \log_{10}(\log_{10}(\log_2(I_{e,WB} - 2 \times mbl) + mlr)) \times 321.7017 + 95.3708$
  • 29. Test Results Equipment Impairment Factors for a Mixed Context
      [Figure: two panels with axes I_e,WB,eff-WB-PESQ and I_e,WB,eff-GP, both ranging from 0 to 140]
      Model   Training            Testing
              RMSE     σ          RMSE     σ
      1       8.3941   0.9236     8.5057   0.9240
      2       8.3552   0.9243     8.4605   0.9248
      3       9.1745   0.908      9.3145   0.9080
  • 30. The Proposed Model A Signal-Based Model • ITU-T P.563 has been chosen for feature extraction • Reasons: i. P.563 is the current state-of-the-art algorithm for speech quality estimation ii. It computes the most numerous and varied features relevant to speech quality • A new mapping is derived by employing symbolic regression
  • 31. GP Experiments A Signal-Based Model • Three GP experiments were performed with various configurations • Leaf coefficients of GP trees were tuned with a GA - hybrid optimization
  • 32. Distortion Conditions A Signal-Based Model • Signal correlated noise • Frame erasures • Bit errors • Transcoding • Front-end clipping • Low bitrate coding • Speech level variation
  • 33. Results - Comparison With ITU-T P.563 A Signal-Based Model
                  ITU-T P.563   GP based model   Percentage Enhancement
      Training    0.3937        0.3415           9.89
      Testing     0.3674        0.3071           16.41
  • 34. Conclusions • Research goal: derivation of real-time non-intrusive models for speech quality estimation • GP has been employed to achieve this • Disparate parametric and signal-based models have been derived • Models outperform ITU-T’s standard recommendations