Automatic Sound Signals Quality
              Estimation

Sevana Oy


Endre Domiczi
Mobile: +372 53485178
E-mail: ceo@sevana.fi
Automatic Sound Signals Quality Estimation

•   General outline and the basic components of the system: signal,
    synchronization and analytical components

•   Test signal (including the statistical speech model) and sound perception models

•   Adequacy of the analytical estimates, judged by comparing them with
    subjective MOS estimates

•   Further development of the acoustic model

•   Available software

•   System applications
Review of existing quality estimation methods
•   AI (Articulation Index). The whole frequency range of the speech signal
    is divided into 20 bands whose widths are chosen so that every band
    contributes equally to speech perception. The signal-to-noise ratio is
    calculated within every band, and the articulation index is taken to be
    the weighted sum of the band values. Although the articulation index is
    oriented toward speech signals, it does not take the properties of
    hearing and speech production into account.

•   SII (Speech Intelligibility Index) is an evolution of the AI method and
    is defined in the American standard ANSI S3.5-1997. The standard provides
    4 measuring procedures that use different band groups: 21 critical
    bands, 18 one-third-octave bands, 17 equally contributing critical
    bands and 6 octave bands. The signal-to-noise ratio is calculated within
    every band and the total SII coefficient, ranging from 0 to 1, is
    computed. The speech intelligibility index takes into account only the
    properties of hearing, not speech production.
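
The band-based calculation shared by AI and SII can be sketched as follows. This is a minimal illustration, not the procedure of either standard: the function name `band_snr_index`, the equal default weights and the clipping range of -15 to +15 dB are assumptions made for the example.

```python
import math

def band_snr_index(signal_power, noise_power, weights=None,
                   snr_floor_db=-15.0, snr_ceil_db=15.0):
    """Weighted sum of per-band SNR values, each mapped to the range [0, 1].

    signal_power, noise_power: per-band powers (same length, linear units).
    weights: per-band importance; equal weights are assumed when omitted.
    The clipping range is an illustrative assumption.
    """
    n_bands = len(signal_power)
    if weights is None:
        weights = [1.0 / n_bands] * n_bands
    index = 0.0
    for s, n, w in zip(signal_power, noise_power, weights):
        snr_db = 10.0 * math.log10(s / n)
        clipped = min(max(snr_db, snr_floor_db), snr_ceil_db)
        # Map the clipped SNR linearly onto [0, 1] and weight it.
        index += w * (clipped - snr_floor_db) / (snr_ceil_db - snr_floor_db)
    return index

# 20 equally weighted bands with the signal 1000 times stronger than the noise.
print(band_snr_index([1000.0] * 20, [1.0] * 20))
```

Passing a non-uniform `weights` list corresponds to bands that do not contribute equally, as in the SII band groups.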
Review of existing quality estimation methods
•   STI (Speech Transmission Index). A speech signal can be approximated as a
    broadband signal modulated by a low-frequency signal, where the articulation rate
    determines the modulation frequency. When the modulation depth decreases, the speech
    signal becomes more noise-like and its intelligibility drops, so the loss of
    intelligibility can be estimated from the loss of modulation depth. The whole
    speech range is divided into 7 octave bands. The input is octave-band noise whose
    intensity distribution matches that of speech. The modulation frequencies range
    from 0.63 to 12.5 Hz in one-third-octave steps (14 frequencies in all). The STI
    measuring method is specified in the international standard IEC 60268-16
    (formerly IEC 268-16).

•   RASTI/STIPA (Rapid Speech Transmission Index). The full STI method needs many
    measurements and calculations. A simplified method was developed that measures
    in only 2 bands with 5 modulation frequencies, which reduces the number of
    measurements and calculations. For good intelligibility the RASTI value must be
    at least 0.6.

    Both the speech transmission index and the rapid speech transmission index imitate
    the speech production process by means of a noise model, but accounting for the
    properties of speech production and hearing in this way is far from optimal.
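
The step from measured modulation depth to an index value can be sketched as below. The conversion of a modulation reduction factor to an apparent signal-to-noise ratio, clipped to ±15 dB, follows the commonly published STI scheme; the equal band weights, the function names and the list of nominal one-third-octave modulation frequencies are assumptions of this sketch, and the corrections defined in the standard are omitted.

```python
import math

# Nominal one-third-octave modulation frequencies (assumed for illustration).
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]

def transmission_index(m):
    """Map a modulation reduction factor m (0..1) to a value in [0, 1]."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)          # avoid log of 0 or division by 0
    snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent signal-to-noise ratio
    snr_db = min(max(snr_db, -15.0), 15.0)
    return (snr_db + 15.0) / 30.0

def sti(mtf, band_weights=None):
    """mtf[k][f] is the modulation reduction factor measured in octave band k
    at modulation frequency f. Equal band weights are assumed when omitted."""
    n_bands = len(mtf)
    if band_weights is None:
        band_weights = [1.0 / n_bands] * n_bands
    total = 0.0
    for weight, row in zip(band_weights, mtf):
        band_mean = sum(transmission_index(m) for m in row) / len(row)
        total += weight * band_mean
    return total

# 7 octave bands x 14 modulation frequencies, modulation depth halved everywhere.
print(sti([[0.5] * len(MOD_FREQS_HZ) for _ in range(7)]))
```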
Review of existing quality estimation methods
•   C50 (clarity index) characterizes the clearness of sound. It is
    computed as the ratio of early to late echo energy, based on the fact
    that echo reduces signal intelligibility. The ratio is calculated in
    several frequency bands. Early echo (arriving within 50 ms) is treated
    as useful signal and late echo (arriving after 50 ms) as disturbing
    signal. The clarity index accounts for only one kind of possible
    distortion, so it is worth applying only as one of several speech
    quality estimates.
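
The early-to-late energy ratio can be sketched as follows for a single band. The function name `clarity_db` is hypothetical, the impulse response is assumed to start at the arrival of the direct sound, and the boundary defaults to 50 ms, the value that the name C50 refers to.

```python
import math

def clarity_db(impulse_response, sample_rate_hz, boundary_ms=50.0):
    """Ratio of early to late energy of a room impulse response, in dB."""
    split = int(round(sample_rate_hz * boundary_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])
    late = sum(x * x for x in impulse_response[split:])
    if late == 0.0:
        return math.inf  # no late energy at all
    return 10.0 * math.log10(early / late)

# Synthetic response sampled at 1 kHz: strong first 50 ms, weak tail.
response = [1.0] * 50 + [0.1] * 50
print(clarity_db(response, 1000))
```

For a multi-band estimate the same function would be applied to the impulse response after band-pass filtering, once per band.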

The need to develop new methods and to improve existing ones is driven by the
desire to bring objective and subjective quality estimates closer together and
to make explicit use of our knowledge about hearing and speech production in
such systems.

Whether an arbitrary or a specialized signal is used as the source signal
depends on the purpose of the estimation (speech intelligibility evaluation,
sound reproduction quality, quality of speech transmitted through
communication channels, etc.); choosing the signal to suit the purpose makes
the estimate more objective.
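Introduced System General Concept

[Figure: block diagram. The block of signals (test signal generator and bank of signals) supplies the test signal to the device under test and to the estimation block. The estimation block consists of the synchronizer and the analytical module, which outputs the result.]
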
The test signal generator forms a sound signal according to one of the sound flow
models. The signal can be either a specialized set of sound signals or the output
of the statistical speech model. The generator's signal can either be saved for later
use or passed on directly for processing and estimation. The bank of signals stores
sound data produced by the signal generator or obtained from external sources.

Accordingly, the input of the estimation block is either the generator signal or a signal
from the bank of signals. The test signal is fed to the synchronizer and to the device
under test, which can be, for example, a vocoder or a communication channel. The output
signal of the device under test is also fed to the synchronizer.

The synchronizer aligns the initial signal and the processed signal in time. The
synchronized signals are passed in chunks to the analytical module, which determines
their degree of similarity and reports the quality estimate as a measure of similarity
between the initial and the processed signals.
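
The synchronizer and analytical module can be illustrated with the sketch below. It is not the proprietary method: it assumes a pure delay found by cross-correlation and uses normalized correlation per chunk as a stand-in similarity measure. All function names are hypothetical.

```python
import math

def estimate_delay(reference, processed, max_lag):
    """Delay (in samples) of `processed` relative to `reference`, chosen as the
    lag in 0..max_lag that maximizes the cross-correlation."""
    best_lag, best_score = 0, -math.inf
    for lag in range(max_lag + 1):
        score = sum(r * p for r, p in zip(reference, processed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def chunk_similarity(a, b):
    """Normalized correlation of two equal-length chunks, in [-1, 1]."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0.0 else 0.0

def compare(reference, processed, max_lag, chunk_len):
    """Align the signals, then average the similarity over whole chunks."""
    lag = estimate_delay(reference, processed, max_lag)
    aligned = processed[lag:]
    usable = min(len(reference), len(aligned))
    scores = [chunk_similarity(reference[i:i + chunk_len], aligned[i:i + chunk_len])
              for i in range(0, usable - chunk_len + 1, chunk_len)]
    if not scores:
        raise ValueError("signals are shorter than one chunk")
    return lag, sum(scores) / len(scores)
```

Calling `compare(reference, processed, max_lag=200, chunk_len=160)` would search delays of up to 25 ms at 8 kHz and score 20 ms chunks; both values are illustrative choices.
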
Implementation
The algorithms described are implemented in Sevana Audio Codecs Analyzer for
vocoder quality estimation and for comparing external initial signals with
signals under test.

Arbitrary signals recorded at a sampling frequency of 8 kHz with 16-bit samples
can be used as external signals. The signal under test is assumed to be obtained
from an initial signal by some transformation (for example, compression and
decompression, transmission through communication channels, or filtering).
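
A recording can be checked against these format requirements before it is submitted. The sketch below assumes the signal is stored as a WAV file, which the slides do not state; the function name is hypothetical and only Python's standard `wave` module is used.

```python
import wave

def check_input_format(path, expected_rate_hz=8000, expected_sample_bytes=2):
    """Return True if the WAV file is sampled at 8 kHz with 16-bit samples."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()  # bytes per sample; 2 bytes = 16 bits
    return rate == expected_rate_hz and width == expected_sample_bytes
```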

The internal initial signals (i.e. signals that the user of the program has no
access to) are signals generated according to the proprietary noise model and
signals generated on the basis of the statistical model.

The processing applied to internal input signals is implemented as DLLs. One
can use either the DLLs provided with Sevana Audio Codecs Analyzer or DLLs
developed by third parties. The signal processed by the methods contained in a
DLL is treated as the signal under test and is subjected to the proprietary
quality estimation procedure.
Advantages

• It is a universal tool, since it allows judging the quality of signals from
  various sources that have been processed in different ways;

• The quality estimation can be optimized for the purpose at hand:
   – for speed (for example, a rough estimate can be obtained
     quickly);
   – for signal type (using different bands for speech signals and for
     sound signals in general);

• The resulting estimates correlate well with MOS;

• Quality estimates obtained for speech signals can be translated into
  values of various kinds of intelligibility.
Test Results

                    ---------- Noise model -----------  Statistical
                    Minimal     Reduced     Complete    model       PhRT
Codec        MOS     -     Vc    -     Vc    -     Vc    -     Vc    -     Vc
A-Law        4.10   4.79  4.73  4.78  4.78  4.78  4.78  4.79  4.80  4.80  4.84
Mu-Law       4.10   4.79  4.84  4.77  4.77  4.77  4.78  4.78  4.79  4.79  4.82
G.723.6.3    3.90   4.25  4.48  4.21  4.29  4.22  4.33  4.15  4.04  4.08  3.95
GSM.6.10     3.70   3.20  1.99  3.01  1.65  3.04  1.78  4.22  3.66  4.01  3.21
G.723.5.3    3.65   4.23  4.44  4.18  4.27  4.19  4.32  4.14  4.04  4.06  3.93

The table above shows quality estimates for several standard vocoders, obtained on various test signals
using the proprietary method and Sevana Audio Codecs Analyzer. MOS values are included for
comparison.
Estimates computed under the assumption that all bands are equally probable are in the columns marked «-», and
estimates that take the band importance coefficients into account are in the columns marked «Vc».
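
One way to quantify how closely a column of the table follows MOS is the Pearson correlation coefficient. The sketch below uses the MOS column and the PhRT «Vc» column as listed in the table; the choice of this measure and the function name are assumptions, and five codecs are too few for a firm conclusion.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Rows: A-Law, Mu-Law, G.723.6.3, GSM.6.10, G.723.5.3
mos = [4.10, 4.10, 3.90, 3.70, 3.65]
phrt_vc = [4.84, 4.82, 3.95, 3.21, 3.93]

print(round(pearson(mos, phrt_vc), 3))
```

The same call with another column of the table shows how the choice of test signal affects agreement with MOS.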
Applications: quality estimation of sound transmission through the public telephone network

[Figure: the sound quality estimation server sends the initial signal to a telephone or modem, which transmits it through the public telephone network; a second telephone or modem receives it and returns the signal under test to the server.]

The figure shows the method described above applied to quality estimation of sound transmission through the public telephone
network. The scheme is applicable to both local and long-distance connections.

The sound quality estimation server generates an initial signal (or chooses one of the previously prepared signals) and transfers it
to one of the telephone subscribers taking part in the test. The subscriber who received the signal establishes a standard connection
with the second subscriber and plays back the initial signal. The second subscriber records the received sound signal and transfers
it to the sound quality estimation server.

The sound quality estimation server compares the initial and the test signals according to the proposed method and produces a
quality estimate for the signal transferred through the telephone network. The estimate can be used for improving subscriber
service, for deciding whether equipment needs to be replaced or adjusted (both on the subscriber's side and at the exchange),
for advertising, and for other purposes.
Applications: quality estimation of sound transferred through an IP network

[Figure: the sound quality estimation server sends the initial signal to VoIP-Server 1, which transmits it through the IP network; VoIP-Server 2 receives the sound signal and returns the signal under test to the server.]

Quality estimation of sound transferred through an IP network is performed in the same way
as for the telephone network. It differs from the previous application in how the initial signal
and the signal under test are transferred between the sound quality estimation server and the
subscribers, and in how data is transferred between the subscribers.

The resulting quality estimates can be used for choosing the codecs used in a VoIP connection
and for choosing operators that provide IP telephony services.
Applications: sound quality estimation of cellular and satellite connections

[Figure: the sound quality estimation server sends the initial signal to a smartphone or mobile phone, which transmits it over a cellular network or satellite link; a second smartphone or mobile phone receives it and returns the signal under test to the server.]

The introduced method and software can be used effectively for quality
estimation of cellular and satellite connections. Subscribers can use the
resulting estimates for choosing operators and telephone models, and operators
can use them for optimizing base station locations.
Applications: quality estimation of systems and algorithms (methods) for sound data compression

[Figure: workstations of developers and testers of sound data compression systems exchange initial and test signals and quality estimates with a sound database; the sound database exchanges initial signals, signals under test and quality estimates with the sound quality estimation server.]

Every codec version (or codec with a particular set of parameters) has to be evaluated and compared with its
counterparts. Every developer can access the sound sample database, compress and restore a signal, and receive
an objective quality estimate of the codec's performance.

Such a system makes it possible to manage codec development and parameter optimization much more
effectively. The end user receives an algorithm that is not merely functional but optimal.
Applications: sound quality estimation of rooms

[Figure: room measurement setup. Labels: speaker, microphone, distribution system, sound systems, initial signal, signals under test, sound quality estimation server.]

In this case the initial signal comes from a microphone located directly in front of the speaker,
and the signals under test come from microphones located in different parts of the room, where
listeners and sound reproducing equipment are located.

The resulting estimates can be used for optimizing the placement of sound reproducing
equipment, furniture and audience seating.
Further Development
•   Integration with existing Quality of Service and Quality of Experience systems to extend their
    functionality and increase the value of their tests.

•   Test signal model improvement. The noise model can be supplemented with a set of multiband
    modulated noise signals; the data and algorithms of the statistical speech model can be extended;
    the number of prepared test signals (such as PhRT recordings) can be increased;

•   Development of more advanced synchronization algorithms based, for example, on the
    coincidence of maxima in the signal energy spectra;

•   Modernization of the acoustic model to take into account masking effects and the fact that pure
    tones and band noise are perceived differently;

•   Modernization of the signal comparison scheme. The current distance measure could be made more
    accurate for strongly differing signals. To make the system more universal, it is desirable to use
    correlation analysis methods for the comparison;

•   Solving a number of practical problems requires the system to work with
    multichannel (stereo, quadraphonic, etc.) signals and to deliver immediate quality estimates;

•   A fully correct translation of the objective estimates into MOS values requires
    further experimental research.
THANK YOU!

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Kürzlich hochgeladen (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Automatic Sound Signals Quality Estimation

  • 1. Automatic Sound Signals Quality Estimation. Sevana Oy. Endre Domiczi. Mobile: +372 53485178. E-mail: ceo@sevana.fi
  • 2. Automatic Sound Signals Quality Estimation
      • General outline and the basic components of the system: signal, synchronization and analytical components
      • Test signal (including the statistical speech model) and sound perception models
      • Adequacy of the analytical estimations, based on a comparison of the analytical and subjective MOS estimations obtained
      • Further development of the acoustic model
      • Available software
      • System applications
  • 3. Review of existing quality estimation methods
      • AI (Articulation Index). The whole frequency range of the speech signal is divided into 20 bands. The band widths are chosen so that every band contributes equally to speech perception. The signal-to-noise ratio is calculated within every band, and the articulation index is taken to be the weighted sum of the band values. The articulation index does not take the properties of hearing and speech production into account, although it is oriented toward speech signals.
      • SII (Speech Intelligibility Index) is an evolution of the AI method. It is specified in the American standard ANSI S3.5-1997, which provides four measuring procedures on different band groups: 21 critical bands, 18 one-third-octave bands, 17 equally contributing critical bands, and 6 octave bands. The signal-to-noise ratio is calculated within every band, and a total SII coefficient in the range from 0 to 1 is computed. The speech intelligibility index takes into account only the properties of hearing, not those of speech production.
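The band-wise computation described for AI can be sketched as follows. This is a minimal illustration, not the procedure of any standard: the function name, the equal default weights, and the clipping range of -12 dB to +18 dB are assumptions made here for the example.

```python
def articulation_index(band_snr_db, weights=None, lo_db=-12.0, hi_db=18.0):
    """Weighted sum of per-band SNR contributions, each mapped to [0, 1].

    band_snr_db: one signal-to-noise ratio (dB) per band.
    weights:     per-band importance; defaults to equal weights, matching
                 bands that contribute equally to speech perception.
    lo_db/hi_db: assumed SNR range over which a band goes from contributing
                 nothing to contributing fully (illustrative values).
    """
    n = len(band_snr_db)
    if n == 0:
        raise ValueError("at least one band is required")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("one weight per band is required")

    total = 0.0
    for snr, w in zip(band_snr_db, weights):
        clipped = min(max(snr, lo_db), hi_db)          # limit to the useful range
        total += w * (clipped - lo_db) / (hi_db - lo_db)
    return total


if __name__ == "__main__":
    # 20 bands, all with an SNR in the middle of the assumed range.
    print(articulation_index([3.0] * 20))
```

With equal weights the result is simply the mean of the per-band contributions; unequal weights would model bands of differing importance.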
  • 4. Review of existing quality estimation methods
      • STI (Speech Transmission Index). A speech signal can be approximated as a broadband signal modulated by a low-frequency signal, where the articulation rate determines the modulation frequency. When the modulation depth decreases, the speech signal becomes more noise-like and its intelligibility decreases. The loss of intelligibility can therefore be estimated from the loss of modulation depth. The whole speech range is divided into 7 octave bands, and an octave-band noise signal is used as the input. The intensity distribution of the test signal agrees with the intensity distribution of speech. The modulation frequencies vary from 0.5 to 12.5 Hz in one-third-octave steps (14 frequencies in all). The STI measuring method is specified in the international standard IEC 268-16.
      • RASTI/STIPA (Rapid Speech Transmission Index). The STI method needs many measurements and calculations. A simplified method was therefore developed that measures in only 2 bands with 5 modulation frequencies, which reduces the number of measurements and calculations. For good intelligibility, RASTI values must not be less than 0.6. Both the speech transmission index and the rapid speech transmission index imitate the speech production process by means of a noise model, but this way of accounting for the properties of speech production and hearing is far from optimal.
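The link between modulation depth and the index can be sketched as below. The conversion used (apparent signal-to-noise ratio 10·log10(m / (1 − m)), clipped to ±15 dB and rescaled to 0..1) is the commonly cited STI mapping; the equal band weights and the function names are assumptions for illustration, since the standard defines its own band weighting.

```python
import math


def transmission_index(m, eps=1e-9):
    """Map one modulation transfer value m (0..1) to a transmission index (0..1)."""
    m = min(max(m, eps), 1.0 - eps)                 # avoid log(0) and division by zero
    snr_db = 10.0 * math.log10(m / (1.0 - m))       # apparent signal-to-noise ratio
    snr_db = min(max(snr_db, -15.0), 15.0)          # clip to the +/-15 dB range
    return (snr_db + 15.0) / 30.0


def speech_transmission_index(mod_depths_by_band, band_weights=None):
    """Combine per-band modulation transfer values into one index.

    mod_depths_by_band: one list per octave band, each holding the measured
                        modulation transfer values for all modulation frequencies.
    band_weights:       assumed equal here; illustrative only.
    """
    n = len(mod_depths_by_band)
    if n == 0:
        raise ValueError("at least one band is required")
    if band_weights is None:
        band_weights = [1.0 / n] * n
    sti = 0.0
    for depths, w in zip(mod_depths_by_band, band_weights):
        band_index = sum(transmission_index(m) for m in depths) / len(depths)
        sti += w * band_index
    return sti


if __name__ == "__main__":
    # 7 octave bands x 14 modulation frequencies, all with half the modulation preserved.
    print(speech_transmission_index([[0.5] * 14 for _ in range(7)]))
```

A fully preserved modulation (m close to 1) yields an index near 1, and a fully lost modulation yields an index near 0, which matches the qualitative behaviour described above.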
  • 5. Review of existing quality estimation methods
      • C50 (clarity factor) characterizes the clearness and clarity of sound. It is computed as the ratio of near echo to far echo. The method is based on the fact that echo reduces signal intelligibility. The near echo/far echo ratios are calculated in several frequency bands; near echo (less than 33 ms) is treated as useful signal and far echo (more than 33 ms) as disturbing signal. The clarity factor accounts for only one kind of possible distortion, so it is worth applying only as one of several speech quality estimations.
    The need to develop new methods and to improve existing ones comes from the desire to bring objective and subjective quality estimations closer together and to make explicit use of our knowledge about hearing and speech production. Whether an arbitrary or a specialized signal is used as the source signal depends on the purpose of the estimation (speech intelligibility evaluation, sound reproduction quality, quality of speech transmitted through communication channels, etc.); choosing the source signal accordingly increases the objectivity of the estimation.
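The near echo/far echo ratio can be illustrated with a short sketch that splits an impulse response at a time boundary and compares the energies on either side. The function name and the single broadband computation are assumptions for the example; the boundary is a parameter and defaults to the 33 ms quoted in the text.

```python
import math


def clarity_db(impulse_response, sample_rate_hz, split_ms=33.0):
    """Early-to-late energy ratio of an impulse response, in dB.

    impulse_response: samples starting at the arrival of the direct sound.
    split_ms:         boundary between "near" (useful) and "far" (disturbing) echo.
    """
    split = int(round(sample_rate_hz * split_ms / 1000.0))
    early = sum(x * x for x in impulse_response[:split])   # useful energy
    late = sum(x * x for x in impulse_response[split:])    # disturbing energy
    if late == 0.0:
        return math.inf      # no late energy at all
    if early == 0.0:
        return -math.inf     # no early energy at all
    return 10.0 * math.log10(early / late)


if __name__ == "__main__":
    # Direct sound of amplitude 1.0 followed by one late reflection of amplitude 0.1.
    ir = [1.0] + [0.0] * 999 + [0.1]
    print(clarity_db(ir, sample_rate_hz=8000))
```

To obtain per-band values as described above, the impulse response would first be filtered into frequency bands and the same ratio computed for each band.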
  • 6. Introduced System General Concept
    [Block diagram labels: test signal generator, bank of signals, block of signals, device under test, synchronizer, analytical module, estimation block, result]
    The test signal generator forms a sound signal according to one of the sound flow models. This can be either a specialized set of sound signals or a signal obtained at the output of the statistical speech model. The generator's signal can either be saved for later use or be passed on directly for processing and estimation. The bank of signals stores sound data produced by the signal generator or obtained from external sources. Accordingly, the input of the estimation block is either a signal taken directly from the generator or one taken from the bank of signals.
    The test signal is fed both to the synchronizer and to the device under test, which can be, for example, a vocoder or a communication channel. The output signal of the device under test is also fed to the synchronizer. The synchronizer aligns the initial signal and the processed signal in time. The synchronized signals are passed in chunks to the analytical module, which determines the degree of similarity between them and issues the quality estimation as a measure of similarity between the initial and the processed signals.
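The slides do not say how the synchronizer works internally. As one possible illustration of the time-matching step, the sketch below estimates the delay of the processed signal by maximizing the cross-correlation with the initial signal, then cuts both signals into aligned chunks. All names are hypothetical, and the assumption that the processed signal is only delayed (never advanced) is made here for simplicity.

```python
def estimate_delay(reference, processed, max_lag):
    """Return the lag (in samples, 0..max_lag) that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(
            r * processed[i + lag]
            for i, r in enumerate(reference)
            if i + lag < len(processed)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(reference, processed, max_lag):
    """Drop the estimated delay from `processed` and trim both to equal length."""
    lag = estimate_delay(reference, processed, max_lag)
    shifted = processed[lag:]
    n = min(len(reference), len(shifted))
    return reference[:n], shifted[:n]


def aligned_chunks(reference, processed, max_lag, chunk_size):
    """Yield (reference_chunk, processed_chunk) pairs for a later comparison stage."""
    ref, proc = align(reference, processed, max_lag)
    for start in range(0, len(ref), chunk_size):
        yield ref[start:start + chunk_size], proc[start:start + chunk_size]


if __name__ == "__main__":
    initial = [0, 1, 3, -2, 4, 0, 1, -1]
    delayed = [0, 0, 0] + initial            # device under test adds 3 samples of delay
    print(estimate_delay(initial, delayed, max_lag=5))
    print(list(aligned_chunks(initial, delayed, max_lag=5, chunk_size=4)))
```

The brute-force search costs on the order of `max_lag` times the signal length; an FFT-based correlation would be the usual choice for long recordings.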
  • 7. Implementation
    The algorithms described are implemented in Sevana Audio Codecs Analyzer, which estimates vocoder quality and compares external initial signals with signals under test. Arbitrary signals recorded at a sampling frequency of 8 kHz with 16-bit samples can be used as external signals. The signal under test is assumed to be obtained from an initial signal by some transformation (for example, compression/restoration, transmission through communication channels, or filtering). The internal initial signals (i.e. signals to which the user of the program has no access) are signals generated according to the proprietary noise model and signals generated on the basis of the statistical model. Internal inputs of sound data to the system are implemented as DLLs. One can use either the DLLs provided with Sevana Audio Codecs Analyzer or DLLs developed by others. A signal processed by the methods contained in a DLL is considered the signal under test and is subjected to the proprietary quality estimation procedure.
  • 8. Advantages
      • It is a universal tool, since it allows judging the quality of signals that come from various sources and are processed in different ways;
      • The quality estimation can be optimized for the purpose at hand:
          – in speed (for example, a rough estimation can be obtained quickly);
          – in signal type (using different bands for speech signals and for sound signals in general);
      • The resulting estimations correlate well with MOS estimations;
      • Quality estimations obtained for speech signals can be translated into values of various kinds of intelligibility.
  • 9. Test Results

    Codec       MOS    Noise model   Stat. minimal  Stat. reduced  Stat. complete  PhRT
                       -     Vc      -     Vc       -     Vc       -     Vc        -     Vc
    A-Law       4.10   4.79  4.73    4.78  4.78     4.78  4.78     4.79  4.80      4.80  4.84
    Mu-Law      4.10   4.79  4.84    4.77  4.77     4.77  4.78     4.78  4.79      4.79  4.82
    G.723.6.3   3.90   4.25  4.48    4.21  4.29     4.22  4.33     4.15  4.04      4.08  3.95
    GSM.6.10    3.70   3.20  1.99    3.01  1.65     3.04  1.78     4.22  3.66      4.01  3.21
    G.723.5.3   3.65   4.23  4.44    4.18  4.27     4.19  4.32     4.14  4.04      4.06  3.93

    The table above shows quality estimations of several standard vocoders, obtained on various test signals (the noise model; the minimal, reduced and complete variants of the statistical model; PhRT) using the proprietary method and Sevana Audio Codecs Analyzer. MOS estimations are included for comparison. Estimations made under the assumption that all bands are equally probable are in the columns marked «-»; estimations that take the coefficients of importance into account are in the columns marked «Vc».
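One way to check how closely each column of the test results follows the MOS column is to compute a Pearson correlation coefficient per column. The sketch below does this with the figures from the table; the column assignment follows the header order (noise model, the three statistical-model variants, PhRT), and with only five codecs the coefficients are indicative at best.

```python
import math

# MOS and objective estimations for A-Law, Mu-Law, G.723.6.3, GSM.6.10, G.723.5.3.
MOS = [4.10, 4.10, 3.90, 3.70, 3.65]
ESTIMATES = {
    "noise model, -":        [4.79, 4.79, 4.25, 3.20, 4.23],
    "noise model, Vc":       [4.73, 4.84, 4.48, 1.99, 4.44],
    "stat. minimal, -":      [4.78, 4.77, 4.21, 3.01, 4.18],
    "stat. minimal, Vc":     [4.78, 4.77, 4.29, 1.65, 4.27],
    "stat. reduced, -":      [4.78, 4.77, 4.22, 3.04, 4.19],
    "stat. reduced, Vc":     [4.78, 4.78, 4.33, 1.78, 4.32],
    "stat. complete, -":     [4.79, 4.78, 4.15, 4.22, 4.14],
    "stat. complete, Vc":    [4.80, 4.79, 4.04, 3.66, 4.04],
    "PhRT, -":               [4.80, 4.79, 4.08, 4.01, 4.06],
    "PhRT, Vc":              [4.84, 4.82, 3.95, 3.21, 3.93],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


CORRELATIONS = {name: pearson(MOS, values) for name, values in ESTIMATES.items()}

if __name__ == "__main__":
    for name, r in CORRELATIONS.items():
        print(f"{name:22s} r = {r:+.3f}")
```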
  • 10. Applications: quality estimation of sound transmission through the public telephone network
    [Diagram labels: public telephone network; transmission; reception; telephone or modem (two subscribers); initial signal; signal under test; sound quality estimation server]
    The picture shows how the method described above is applied to estimate the quality of sound transmission through the public telephone network. The scheme is applicable to both local and long-distance connections. The sound quality estimation server generates an initial signal (or chooses one of the signals prepared beforehand) and transfers it to one of the telephone subscribers taking part in the test. The subscriber that received the signal establishes a standard connection with the second subscriber and plays back the initial signal. The second subscriber records the received sound signal and transfers it to the sound quality estimation server. The server compares the initial signal and the signal under test according to the suggested method and gives a quality estimation of the signal transferred through the telephone network. The resulting estimation can be used to improve subscriber service, to decide whether equipment needs to be replaced or adjusted (on the subscriber side as well as at the exchange), for advertising, and for other purposes.
  • 11. Applications: quality estimation of sound transferred through an IP network
    [Diagram labels: IP network; initial signal transmission; sound signal reception; VoIP-Server 1; VoIP-Server 2; initial signal; signal under test; sound quality estimation server]
    Quality estimation of sound transferred through an IP network is performed in the same way as for the telephone network. It differs from the previous application in how the initial signal and the signal under test are transferred between the sound quality estimation server and the subscribers, and in how data is transferred between the subscribers. The resulting quality estimations can be used to choose the codecs for a VoIP connection and to choose among operators that provide IP telephony services.
  • 12. Applications: sound quality estimation of cellular and satellite connections
    [Diagram labels: satellite communication; cellular network; initial signal transmission; initial signal reception; smartphone or mobile phone (two subscribers); initial signal; signal under test; sound quality estimation server]
    The introduced method and software can be used effectively for quality estimation of cellular and satellite connections. Subscribers can use the resulting estimations to choose operators and telephone models, and operators can use them to optimize base station locations.
  • 13. Applications: quality estimation of systems and algorithms (methods) for sound data compression
    [Diagram labels: workstations of developers and testers of sound data compression systems; sound database; initial signals; signals under test; quality estimations; sound quality estimation server]
    Every codec version (or codec with a given set of parameters) needs to be estimated and compared with its analogs. Every developer can access the sound sample base, compress and restore a signal, and receive an objective quality estimation of the codec's work. Such a system makes it possible to manage the codec development process and the optimization of codec parameters much more effectively. The end user will then receive an algorithm that is not merely functioning but optimal.
  • 14. Applications: sound quality estimation of rooms
    [Diagram labels: distribution system; microphone; speaker; initial signal; sound systems; signals under test; sound quality estimation server]
    In this case the initial signal is the signal from a microphone located opposite the speaker, and the signals under test are those from microphones located in different parts of the room, in the places where listeners and sound reproducing equipment are located. The resulting estimations can be used to optimize the placement of sound reproducing equipment, furniture and audience seats.
  • 15. Further Development
      • Integration with existing Quality of Service and Quality of Experience systems to increase their functionality and enrich test impact.
      • Improvement of the test signal model: supply the noise model with a set of multiband modulated noise signals, enrich the data set and algorithms of the statistical speech model, and increase the number of prepared test signals (such as PhRT recordings);
      • Development of more advanced synchronization algorithms, based, for example, on the coincidence of maxima in the signal energy spectra;
      • Modernization of the acoustic model to take into account masking effects and the fact that pure tones and band noise are perceived differently;
      • Modernization of the signal comparison scheme. The current distance measure could be made more accurate for strongly differing signals. To make the system more universal, it is desirable to use correlation analysis methods for the comparison;
      • To solve a number of practical problems, the system needs to be able to work with multichannel signals (stereo, quadro, etc.) and to deliver immediate quality estimations;
      • An absolutely correct translation of the objective estimations into MOS values requires further experimental research.