Voice Quality Impairments Detection
Introduction
The purpose of this document is to define a vocabulary for discussing the symptoms of voice quality
problems and their detection.
This document is intended to be a living resource: the symptoms and detection methods listed here are
expected to be revised as new problems arise and additional information becomes available.

Signal-to-noise ratio
Signal-to-noise ratio (often abbreviated SNR or S/N) is a measure used in science and engineering that compares
the level of a desired signal to the level of background noise. It is defined as the ratio of signal power to the
noise power. A ratio higher than 1:1 indicates more signal than noise. While SNR is commonly quoted for
electrical signals, it can be applied to any form of signal (such as isotope levels in an ice core or biochemical
signaling between cells).
The signal-to-noise ratio, the bandwidth, and the channel capacity of a communication channel are connected by
the Shannon–Hartley theorem.
Signal-to-noise ratio is sometimes used informally to refer to the ratio of useful information to false or irrelevant
data in a conversation or exchange. For example, in online discussion forums and other online communities,
off-topic posts and spam are regarded as "noise" that interferes with the "signal" of appropriate discussion.
Telecommunication systems strive to increase the ratio of signal level to noise level in order to effectively
transmit data. In practice, if the transmitted signal falls below the level of the noise (often designated as the
noise floor) in the system, data can no longer be decoded at the receiver. Noise in telecommunication systems
comes from sources both internal and external to the system.
We recommend that the SNR of a speech signal be no lower than 25 dB.
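As a minimal sketch of the definition above, the following Python snippet computes SNR in decibels as 10 * log10 of the power ratio and compares it with the 25 dB recommendation. The function names, and the assumption that noise-only samples are available separately (for example, taken from pauses in speech), are illustrative and not part of the original document.

```python
import math

RECOMMENDED_MIN_SNR_DB = 25.0  # recommendation stated in the text


def mean_power(samples):
    """Mean power of a sequence of samples (average squared amplitude)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """SNR in dB: 10 * log10(signal power / noise power)."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


def snr_is_acceptable(signal, noise, threshold_db=RECOMMENDED_MIN_SNR_DB):
    """True if the SNR meets the (assumed configurable) threshold."""
    return snr_db(signal, noise) >= threshold_db
```

A power ratio of 1:1 corresponds to 0 dB, so any positive value in dB means more signal than noise.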

Absolute silence
This impairment refers to silence between speech fragments during which one cannot tell whether the
other party is still there, because there is no sound on the line at all.
A common cause of this problem is Voice Activity Detection (VAD) without comfort noise. The symptom is
usually noticeable when the background noise is loud enough for the inserted silence to stand out, but
soft enough for VAD to engage.
Silence appearing during a phone call is perceived as an artifact associated with connection loss.
One can therefore set an energy threshold per frame and, when the frame energy falls below the
threshold, start measuring the duration of the silent fragment of the signal. If a “loud” frame is
received, the silence counter is reset to zero. If the counter exceeds, for example, 1 second, we can
report an absolute silence impairment and a quality loss due to silent fragments.

Copyright © Sevana, 2013
Sevana Oy
Agricolankatu 11
00530 Helsinki
Finland
Phone: +358 9 2316 4165

Sevana Oü
Rohtlaane 12
76911 Huuru kula
Estonia (Harjumaa)
Phone: +372 53485178
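The silence counter described in the Absolute silence section can be sketched in Python as follows. This is only an illustration under stated assumptions: the 20 ms frame length, the energy threshold value, and the function name are hypothetical; only the 1 second limit comes from the text.

```python
def detect_absolute_silence(frame_energies, frame_ms=20,
                            energy_threshold=1e-4, max_silence_ms=1000):
    """Return the indices of frames at which a silent run first exceeds
    max_silence_ms. One alert is reported per silent run."""
    alerts = []
    silent_ms = 0
    alerted = False
    for i, frame_energy in enumerate(frame_energies):
        if frame_energy < energy_threshold:
            # Silent frame: extend the current silent run.
            silent_ms += frame_ms
            if silent_ms > max_silence_ms and not alerted:
                alerts.append(i)
                alerted = True
        else:
            # "Loud" frame: reset the silence counter to zero.
            silent_ms = 0
            alerted = False
    return alerts
```

Reporting one alert per run, rather than one per frame, keeps a long dropout from flooding the monitoring output.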

Loudness
This impairment relates to calls that are too loud or too quiet. If the signal energy changes significantly,
one may conclude that the usual call quality has degraded. The most recent version of the library works
together with VAD to detect call fragments that are too loud: it calculates the average energy over active
signal fragments (frames), and an average above a predefined threshold indicates that the
signal is too loud.
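A minimal sketch of this check, assuming per-frame energies and per-frame VAD decisions are already available, might look like this. The function name and the threshold parameter are hypothetical; the document does not state the library's actual threshold or API.

```python
def is_too_loud(frame_energies, vad_flags, max_avg_energy):
    """Average the energy over VAD-active frames only and compare the
    result with a predefined threshold."""
    active = [e for e, is_speech in zip(frame_energies, vad_flags) if is_speech]
    if not active:
        return False  # no active speech, nothing to judge
    return sum(active) / len(active) > max_avg_energy
```

Restricting the average to active frames keeps long pauses from masking a loud talker.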

Amplitude clipping
Amplitude clipping, also called “buzziness”, occurs when the signal amplitude is too high at some point
along the analog voice path, so that the signal is clipped when it is converted to digital
form. Users report that speech may seem excessively loud and potentially
"buzzy" or "fuzzy". A sample of amplitude-clipped audio is available at:
http://www.voiptroubleshooter.com/sound_files/amplitude_clipping.wav
If more than 2% of the samples are clipped, the audio quality drops considerably:
1) For an integral result over a frame, check dClpLevel and dClpLevelWide. Single clipped frames may be
treated as non-critical for overall quality, because they may result from energy
normalization alone. However, if clipped sequences of samples are present at the same time, this is a
real clipping impairment. Raise an alert if dClpLevel > 2% and dClpLevelWide > 0. One can also set a
threshold for dClpLevelWide, for example 5% of dClpLevel.
2) For real-time monitoring, check dFrameClpLevel and dFrameClpLevelWide, and
dFlyClpLevel and dFlyClpLevelWide. If clipping occurs in a single frame and there are no clipped samples
to the left or right of it, the impairment can be identified as a click. Temporary quality
degradation is characterized by clipping over longer parts of the signal, which can be
interpreted as a significant increase in input signal loudness.
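The alert rule in item 1 can be illustrated with the Python sketch below. It computes two quantities that are only analogous to dClpLevel and dClpLevelWide: the percentage of clipped samples, and the percentage of samples that belong to runs of consecutive clipped samples. How the library actually computes its values is not described in this document, so the 16-bit full-scale value, the minimum run length, and the function names are assumptions.

```python
def clipping_levels(samples, full_scale=32767, min_run=3):
    """Return (clipped_pct, wide_pct) for integer PCM samples.

    clipped_pct: percentage of samples at or beyond full scale.
    wide_pct:    percentage of samples inside runs of at least
                 min_run consecutive clipped samples.
    """
    clipped = 0
    wide = 0
    run = 0
    for s in samples:
        if abs(s) >= full_scale:
            clipped += 1
            run += 1
        else:
            if run >= min_run:
                wide += run
            run = 0
    if run >= min_run:  # close a run that reaches the end of the data
        wide += run
    n = len(samples)
    return 100.0 * clipped / n, 100.0 * wide / n


def clipping_alert(clipped_pct, wide_pct, max_clipped_pct=2.0):
    """Alert when more than 2% of samples are clipped and at least one
    clipped sequence is present."""
    return clipped_pct > max_clipped_pct and wide_pct > 0.0
```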

Clicking
A clicking impairment is a short burst of increased energy (a click). If clicks occur more often
than once every 3-5 seconds, the audio quality is degraded, and a clicking alert should be raised.

Stuck
Stuck means that the signal stays at a relatively constant amplitude level. A stuck signal is
perceived as absolute silence, which is not typical for speech. Depending on the energy change, it may also
be perceived as a click. We recommend the same threshold for Stuck as for Clicking: no more
than one stuck impairment per 3-5 seconds. In addition, if the total stuck duration exceeds 10% of the whole
audio, this also indicates significant quality degradation.
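The rate thresholds for Clicking and Stuck can be sketched as two small Python checks. The sketch assumes that event timestamps and stuck durations (in seconds) are already reported by a detector, and it takes 3 seconds, the lower end of the 3-5 second range, as the minimum allowed interval. Both function names are hypothetical.

```python
def too_frequent(event_times_s, min_interval_s=3.0):
    """True if any two consecutive events (clicks or stuck impairments)
    are closer together than min_interval_s."""
    times = sorted(event_times_s)
    return any(b - a < min_interval_s for a, b in zip(times, times[1:]))


def stuck_share_exceeded(stuck_durations_s, total_duration_s, max_share=0.10):
    """True if the total stuck time is more than 10% of the whole audio."""
    return sum(stuck_durations_s) > max_share * total_duration_s
```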

VAD clipping
This impairment indicates that the Voice Activity Detector (VAD) is working incorrectly. The detector examines
the edges of active and inactive fragments of the signal to determine whether VAD engaged too late (at the
beginning of speech) or too early (at the end of speech).
Let X be the number of VAD state changes (i.e. transitions between voice and no voice). Then the
following formula
100 * dNumClpFrames/X
gives a metric that should not exceed 10% for acceptable speech quality.
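A worked sketch of this metric in Python is given below. It assumes a per-frame list of VAD decisions from which X is counted, and takes the number of clipped frames (dNumClpFrames in the text) as an input. The function names and the handling of X = 0 are assumptions.

```python
def count_vad_changes(vad_flags):
    """Number of voice/no-voice transitions in a per-frame VAD sequence."""
    return sum(1 for a, b in zip(vad_flags, vad_flags[1:]) if a != b)


def vad_clipping_percent(num_clipped_frames, vad_flags):
    """100 * dNumClpFrames / X, where X is the number of VAD changes."""
    x = count_vad_changes(vad_flags)
    if x == 0:
        return 0.0  # assumption: no transitions means nothing to clip
    return 100.0 * num_clipped_frames / x


def vad_clipping_acceptable(num_clipped_frames, vad_flags, max_percent=10.0):
    return vad_clipping_percent(num_clipped_frames, vad_flags) <= max_percent
```

For example, one clipped frame over four VAD transitions gives 25%, which is above the 10% limit.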

Echo
Signal reflection (echo) occurs when a signal is transmitted along a transmission medium, such as a
copper cable or an optical fiber. Some of the signal power may be reflected back to its origin rather
than being carried all the way along the cable to the far end. This happens because imperfections in
the cable cause impedance mismatches and non-linear changes in the cable characteristics. These
abrupt changes in characteristics cause some of the transmitted signal to be reflected. The ratio of
energy bounced back depends on the impedance mismatch. Mathematically, it is defined using the
reflection coefficient.
In telecommunications, the reflection coefficient is the ratio of the amplitude of the reflected wave to
the amplitude of the incident wave. In particular, at a discontinuity in a transmission line, it is the
complex ratio of the electric field strength of the reflected wave ($E^-$) to that of the incident wave
($E^+$). This is typically represented with a $\Gamma$ (capital gamma) and can be written as:

$$\Gamma = \frac{E^-}{E^+}$$

The reflection coefficient may also be established using other field or circuit quantities.
The reflection coefficient can be given by the equation below, where $Z_S$ is the impedance toward
the source and $Z_L$ is the impedance toward the load:

$$\Gamma = \frac{Z_L - Z_S}{Z_L + Z_S}$$

Notice that a negative reflection coefficient means that the reflected wave receives a 180°, or
$\pi$, phase shift.

The absolute magnitude (designated by vertical bars) of the reflection coefficient can be calculated
from the standing wave ratio, SWR:

$$|\Gamma| = \frac{SWR - 1}{SWR + 1}$$

For real-valued impedances, the reflection coefficient ranges from -1 to +1.
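As a small worked example, the Python sketch below evaluates the standard transmission-line relations Γ = (Z_L - Z_S)/(Z_L + Z_S) and SWR = (1 + |Γ|)/(1 - |Γ|). The function names and the 75 ohm and 50 ohm values are illustrative assumptions and are not taken from the library.

```python
def reflection_coefficient(z_load, z_source):
    """Gamma = (Z_L - Z_S) / (Z_L + Z_S); also works for complex impedances."""
    return (z_load - z_source) / (z_load + z_source)


def swr_from_gamma(gamma):
    """Standing wave ratio from the magnitude of Gamma (requires |Gamma| < 1)."""
    magnitude = abs(gamma)
    return (1 + magnitude) / (1 - magnitude)


# A 75 ohm load on a 50 ohm source: Gamma = 25 / 125
gamma = reflection_coefficient(75.0, 50.0)
swr = swr_from_gamma(gamma)
```

A matched load (Z_L equal to Z_S) gives Γ = 0, meaning no reflection, while a short circuit (Z_L = 0) gives Γ = -1.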
Two echo detection algorithms are implemented: correlation-based and echo-compensation-based.
One of them is selected when the library is compiled.
With the echo-compensation-based algorithm, compare the echo energy with the signal energy:
if the echo energy is more than 20% of the signal energy, echo is detected in the
speech signal. One can also check VAD and compare the energy values only while VAD is active.
With the correlation-based algorithm, similarity is the criterion. Echo is considered present if the
correlation is higher than 0.7; at the same time, check the signal energy level: if it is low,
echo is not present (a false positive). One can also use VAD instead of energy.
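Both decision rules for echo detection can be sketched in Python as follows. This is not the library's implementation: the sketch assumes the echo energy estimate is supplied by an echo canceller, that the sent and received frames are already time-aligned and of equal length, and that a zero-lag normalized correlation is used as the similarity measure. The function names and the minimum-energy value are hypothetical.

```python
import math


def energy(frame):
    """Sum of squared samples."""
    return sum(s * s for s in frame)


def echo_by_energy(echo_energy, signal_energy, max_ratio=0.20):
    """Echo-compensation variant: echo is present if the echo energy
    is more than 20% of the signal energy."""
    return signal_energy > 0 and echo_energy > max_ratio * signal_energy


def normalized_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length frames."""
    denominator = math.sqrt(energy(x) * energy(y))
    if denominator == 0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / denominator


def echo_by_correlation(sent, received, min_corr=0.7, min_energy=1e-3):
    """Correlation variant with an energy gate against false positives."""
    if energy(received) < min_energy:
        return False  # low energy: a high correlation is a false positive
    return normalized_correlation(sent, received) > min_corr
```

The energy gate could be replaced by a VAD decision, as the text suggests.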

Appendix 1: Audio compatibility
We consider the audio quality acceptable if the analyzed audio meets the following conditions:
● Average loudness is between -30 dB and 0 dB
● The number of clipped audio samples does not exceed 2%
● SNR is at least 20 dB for regular calls and at least 24 dB when a loudspeaker is used
● Non-speech signals are strictly prohibited
● Speech tempo is at most 250%
● Noise reduction and echo compensation algorithms are allowed only if the result complies with the
  loudness, clipping and SNR requirements

Appendix 2: Call quality metrics table (recommended)

Metric                                  | Units | Max   | Min | Critical    | Major       | Minor      | Warning      | Excellent
----------------------------------------+-------+-------+-----+-------------+-------------+------------+--------------+----------
Mean Opinion Score (audio, Sevana AQuA) | -     | 5     | 1   | <2.5        | <3          | <3.5       | <4           | >4
Mean Opinion Score (E-Model)            | -     | 5     | 1   | <3          | <3.5        | <3.7       | <4           | <4.3
R Factor                                | %     | 100   | 30  | <50         | <60         | <70        | <80          | >=80
Speech distortion                       | %     | 0     | 10  | >9          | >7          | >6.5       | >5.5         | <4.5
Speech power                            | dBm   | -17.5 | -50 | <-50 or >15 | <-40 or >10 | <-35 or >5 | <-30.5 or >0 | <-30
DTMF detection ratio                    | %     | 100   | 0   | <70         | <80         | <90        | <100         | 100
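To show how the thresholds in the metrics table might be applied, the Python sketch below maps a measured value to a severity level for two rows: MOS (Sevana AQuA) and R Factor. The function and constant names are hypothetical. The table does not define a level for a MOS of exactly 4, so the sketch treats it as Excellent, which is an assumption.

```python
# (upper limit, label) pairs, checked from worst to best
AQUA_MOS_LEVELS = [(2.5, "Critical"), (3.0, "Major"), (3.5, "Minor"), (4.0, "Warning")]
R_FACTOR_LEVELS = [(50, "Critical"), (60, "Major"), (70, "Minor"), (80, "Warning")]


def classify(value, levels, best="Excellent"):
    """Return the first label whose upper limit the value falls below,
    or the best label if the value is below none of the limits."""
    for limit, label in levels:
        if value < limit:
            return label
    return best
```

For example, `classify(3.2, AQUA_MOS_LEVELS)` returns `"Minor"` and `classify(80, R_FACTOR_LEVELS)` returns `"Excellent"`.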
