1
Audio Compression
Techniques
Prepared by
Razia Nisar Noorani
Lecture 8
2
Introduction
 Digital Audio Compression
 Removal of redundant or otherwise irrelevant
information from audio signal
 Audio compression algorithms are often referred to as
“audio encoders”
 Applications
 Reduces required storage space
 Reduces required transmission bandwidth
3
Audio Compression
 Audio signal – overview
 Sampling rate (# of samples per second)
 Bit rate (# of bits per second). Typically, an
uncompressed stereo 16-bit 44.1 kHz signal has a
bit rate of about 1.4 Mbps (megabits per second)
 Number of channels (mono / stereo / multichannel)
 Reduction by lowering those values or by data
compression / encoding
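As a quick check of the figure above, the uncompressed bit rate is the product of sampling rate, bits per sample, and channel count. The sketch below is only an illustration; the function name `pcm_bit_rate` is made up for this example.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Return the uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# CD-quality audio: 44.1 kHz, 16 bits per sample, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_rate)             # 1411200 bits per second, about 1.4 Mbps
print(cd_rate / 8 / 1000)  # 176.4 kilobytes per second
```

Halving any one of the three factors halves the bit rate, which is the "lowering those values" option; data compression aims to reduce the rate without doing that.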
4
Audio Data Compression
 Redundant information
 Implicit in the remaining information
 Ex. oversampled audio signal
 oversampling is the process of sampling a signal with a
sampling frequency significantly higher than twice the
bandwidth or highest frequency of the signal being sampled
 Irrelevant information
 Perceptually insignificant
 Cannot be recovered from remaining information
5
Audio Data Compression
 Lossless Audio Compression
Removes redundant data
Resulting signal is identical to the original – perfect
reconstruction
 Lossy Audio Encoding
Removes irrelevant data
Resulting signal is similar to the original, but not identical
6
Audio Data Compression
 Audio vs. Speech Compression
Techniques
Speech Compression uses a human vocal
tract model to compress signals
Audio Compression does not use this
technique due to larger variety of possible
signal variations
7
Generic Audio Encoder
 Psychoacoustic Model
Psychoacoustics – study of how sounds are
perceived by humans
Uses perceptual coding
 eliminate information from audio signal that is
inaudible to the ear
Detects conditions under which different audio
signal components mask each other
8
Psychoacoustic Model
 Signal Masking
Threshold cut-off
Spectral (Frequency / Simultaneous) Masking
Temporal Masking
 Threshold cut-off and spectral masking
occur in frequency domain, temporal
masking occurs in time domain
9
Signal Masking
 Threshold cut-off
 Hearing threshold level
– a function of
frequency
 Any frequency
components below the
threshold will not be
perceived by human
ear
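The slides do not give a formula for the hearing threshold. One widely quoted approximation of the threshold in quiet (attributed to Terhardt) can be sketched as below; treating it as the model an encoder actually uses is an assumption, and the function names are made up for this example. Levels are in dB SPL.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in quiet (dB SPL) at a given frequency."""
    f = freq_hz / 1000.0  # the approximation is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def audible(freq_hz, level_db_spl):
    """True if a tone at this level lies above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)


for freq in (100, 1000, 3500, 15000):
    print(freq, round(threshold_in_quiet_db(freq), 1))
```

The curve is lowest around 3 to 4 kHz, where the ear is most sensitive, and rises steeply toward very low and very high frequencies.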
10
Signal Masking
 Spectral Masking
 A frequency
component can be
partly or fully masked
by another component
that is close to it in
frequency
 This shifts the hearing
threshold
11
Signal Masking
 Temporal Masking
 A quieter sound can
be masked by a louder
sound if they are
temporally close
 Quieter sounds that occur
shortly before or shortly
after the louder sound
can both be
masked
12
Spectral Analysis
 Spectral analyzer: a device or algorithm that derives a frequency
domain representation of a time domain signal
 Tasks of Spectral Analysis
 To derive masking thresholds to determine which
signal components can be eliminated
 To generate a representation of the signal to which
masking thresholds can be applied
 Spectral Analysis is done through transforms or
filter banks
13
Spectral Analysis
 Transforms
Fast Fourier Transform (FFT)
Discrete Cosine Transform (DCT) - similar to
the FFT but uses only cosine basis functions
Modified Discrete Cosine Transform (MDCT)
[used by MPEG-1 Layer-III, MPEG-2 AAC,
Dolby AC-3] – overlapped and windowed
version of DCT
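To make "overlapped and windowed" concrete, here is a minimal, unoptimized MDCT sketch in plain Python. It assumes a sine window and 50% overlap, which is one common choice rather than what any particular codec mandates, and the function names are invented for this example. Each block of 2N samples yields only N coefficients, yet overlap-adding the inverse transforms recovers the signal.

```python
import math


def sine_window(length):
    """Sine window; satisfies the condition needed for perfect reconstruction."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Map 2N time samples to N coefficients (direct O(N^2) form)."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n))
            for k in range(n)]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for t in range(2 * n)]


def mdct_roundtrip(signal, n):
    """Analyze and resynthesize; len(signal) must be a multiple of n."""
    window = sine_window(2 * n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):  # hop of N = 50% overlap
        block = [padded[start + t] * window[t] for t in range(2 * n)]
        rec = imdct(mdct(block))
        for t in range(2 * n):
            out[start + t] += rec[t] * window[t]  # overlap-add cancels the aliasing
    return out[n:n + len(signal)]


signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
restored = mdct_roundtrip(signal, 8)
print(max(abs(a - b) for a, b in zip(signal, restored)))  # tiny rounding error only
```

In a real encoder the coefficients would be quantized between `mdct` and `imdct`; that step is where the loss is introduced.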
14
Spectral Analysis
 Filter Banks
 a filter bank is an array of band-pass filters that
separates the input signal into multiple
components, each one carrying a single
frequency subband of the original signal
 Time sample blocks are passed through a set of
bandpass filters
 Masking thresholds are applied to resulting frequency
subband signals
 Polyphase and wavelet filter banks are the most popular filter
structures
15
Filter Bank Structures
 Polyphase Filter Bank
[used in all of the MPEG-1 encoders]
Signal is separated into subbands, the widths
of which are equal over the entire frequency
range
The resulting subband signals are
downsampled to create shorter signals (which
are upsampled and recombined during the decoding
process)
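The split, downsample, and reconstruct cycle can be illustrated with the smallest possible case: two equal-width bands built from Haar filters. This is a simplification chosen for brevity and not the 32-band polyphase filter used in MPEG-1; the function names are invented for this example.

```python
import math

SQRT2 = math.sqrt(2.0)


def analyze(samples):
    """Split an even-length signal into low and high bands, each downsampled by 2."""
    low = [(samples[i] + samples[i + 1]) / SQRT2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / SQRT2 for i in range(0, len(samples), 2)]
    return low, high


def synthesize(low, high):
    """Recombine the two downsampled bands into the full-rate signal."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)
        out.append((l - h) / SQRT2)
    return out


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
low, high = analyze(signal)
print(len(low), len(high))    # each band holds half as many samples
print(synthesize(low, high))  # matches the input up to rounding
```

The total number of samples is unchanged by the split, so the filter bank alone saves nothing; the saving comes from quantizing each band according to its masking threshold.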
16
Filter Bank Structures
 Wavelet Filter Bank
[used by Enhanced Perceptual Audio
Coder (EPAC) by Lucent]
Unlike the polyphase filter bank, the subbands
are not of equal width (narrower at lower
frequencies, wider at higher frequencies)
This allows for better time resolution at high
frequencies (ex. short attacks), but at the expense
of frequency resolution
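A wavelet filter bank is commonly built as a tree: the signal is split into a low and a high band, and only the low band is split again. The sketch below uses Haar filters as a stand-in (EPAC's actual filters are not described in these slides, so this is an assumption) to show how the unequal subband widths arise. The function names are invented for this example.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_split(samples):
    """One two-band split with downsampling by 2 (even-length input)."""
    low = [(samples[i] + samples[i + 1]) / SQRT2 for i in range(0, len(samples), 2)]
    high = [(samples[i] - samples[i + 1]) / SQRT2 for i in range(0, len(samples), 2)]
    return low, high


def wavelet_bands(samples, levels):
    """Return bands ordered from the highest (widest) to the lowest (narrowest)."""
    bands = []
    low = list(samples)
    for _ in range(levels):
        low, high = haar_split(low)  # only the low band is split further
        bands.append(high)
    bands.append(low)
    return bands


bands = wavelet_bands([float(i) for i in range(16)], levels=3)
print([len(band) for band in bands])  # [8, 4, 2, 2]
```

The top band covers half the spectrum and keeps 8 of the 16 time positions, so it localizes short attacks well; the lowest bands are narrow in frequency but keep only 2 samples each.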
17
Noise Allocation
 System Task: derive and apply shifted hearing
threshold to the input signal
 Anything below the threshold doesn’t need to be
transmitted
 Any noise below the threshold is irrelevant
 Frequency component quantization
 Tradeoff between space and noise
 Encoder saves on space by using just enough bits for
each frequency component to keep noise under the
threshold - this is known as noise allocation
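A rough sketch of this idea, assuming the common rule of thumb that each extra bit of a uniform quantizer lowers the noise by about 6 dB: give each band just enough bits to push the noise below its masking threshold, and none if the band itself is below the threshold. Real encoders iterate under a bit budget; the function name and the example levels are invented for this illustration.

```python
import math

DB_PER_BIT = 6.02  # approximate noise reduction per extra quantizer bit


def allocate_bits(signal_db, threshold_db):
    """Return the number of bits per band that keeps noise under the threshold."""
    bits = []
    for level, threshold in zip(signal_db, threshold_db):
        if level <= threshold:
            bits.append(0)  # band is masked entirely, nothing to transmit
        else:
            bits.append(math.ceil((level - threshold) / DB_PER_BIT))
    return bits


# Hypothetical band levels and masking thresholds in dB.
print(allocate_bits([60, 20, 50], [30, 25, 44]))  # [5, 0, 1]
```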
18
Noise Allocation
 Pre-echo
 In case a single audio block contains silence followed
by a loud attack, pre-echo error occurs - there will be
audible noise in the silent part of the block after
decoding
 This is reduced by monitoring audio data ahead of time at the
encoding stage and splitting audio into shorter
blocks where pre-echo is likely
 This does not completely eliminate pre-echo, but can
make it short enough to be masked by the attack
(temporal masking)
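One simple way to flag a potential pre-echo case is to compare the energy of consecutive sub-blocks and switch to short blocks when the energy jumps sharply. This is a sketch under assumed parameters (four sub-blocks, a 10x energy ratio); actual encoders use their own detection criteria, and the function names are invented for this example.

```python
def block_energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def needs_short_blocks(block, parts=4, ratio=10.0, floor=1e-9):
    """True if energy rises sharply between consecutive sub-blocks."""
    size = len(block) // parts
    energies = [block_energy(block[i * size:(i + 1) * size]) for i in range(parts)]
    for previous, current in zip(energies, energies[1:]):
        # floor avoids comparing against exact silence (zero energy)
        if current > ratio * max(previous, floor):
            return True
    return False


silence_then_attack = [0.0] * 48 + [1.0] * 16
steady_tone = [0.5] * 64
print(needs_short_blocks(silence_then_attack))  # True
print(needs_short_blocks(steady_tone))          # False
```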
19
Additional Encoding Techniques
 Other encoding techniques are
available (as alternatives or in combination)
Predictive Coding
Coupling / Delta Encoding
Huffman Encoding
20
Additional Encoding Techniques
 Predictive Coding
 Often used in speech and image compression
 Estimates the expected value for each sample based
on previous sample values
 Transmits/stores the difference between the predicted
and actual value
 The decoder generates the same estimate for each sample and then
adjusts it by the stored difference to recover the
sample
 Used for additional compression in MPEG-2 AAC
(Advanced Audio Coding)
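The simplest predictor assumes each sample equals the previous one, so only the change is stored. The sketch below shows that first-order case; it is far simpler than the prediction used in AAC and is meant only to show the encode and decode symmetry. The function names are invented for this example.

```python
def encode(samples):
    """Store the difference between each sample and its prediction."""
    residuals = []
    previous = 0  # prediction for the first sample
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample  # the next prediction is the current sample
    return residuals


def decode(residuals):
    """Rebuild samples by adding each residual to the running prediction."""
    samples = []
    previous = 0
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples


samples = [100, 102, 105, 104, 104, 101]
residuals = encode(samples)
print(residuals)          # [100, 2, 3, -1, 0, -3]
print(decode(residuals))  # [100, 102, 105, 104, 104, 101]
```

After the first value the residuals are small, so they need fewer bits than the original samples.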
21
Additional Encoding Techniques
 Coupling / Delta encoding
 Used in cases where audio signal consists of two or
more channels (stereo or surround sound)
 Similarities between channels are used for
compression
 The sum and difference of two channels are
derived; the difference is usually close to zero
and therefore requires less space to encode
 This is a lossless encoding process
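With integer samples, the sum and difference form can be inverted exactly, which is why the step is lossless. A minimal sketch for a stereo pair follows; the function names and sample values are invented for this example.

```python
def to_sum_diff(left, right):
    """Convert left/right samples to sum and difference channels."""
    sums = [l + r for l, r in zip(left, right)]
    diffs = [l - r for l, r in zip(left, right)]
    return sums, diffs


def from_sum_diff(sums, diffs):
    """Recover left/right exactly: sum + diff = 2 * left, sum - diff = 2 * right."""
    left = [(s + d) // 2 for s, d in zip(sums, diffs)]
    right = [(s - d) // 2 for s, d in zip(sums, diffs)]
    return left, right


left = [1000, -200, 15]
right = [998, -203, 15]
sums, diffs = to_sum_diff(left, right)
print(diffs)                       # [2, 3, 0]
print(from_sum_diff(sums, diffs))  # ([1000, -200, 15], [998, -203, 15])
```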
22
Additional Encoding Techniques
 Huffman Coding
 Information-theory-based technique
 A value that occurs often in the
signal is represented by a shorter code, and the
mapping is stored in a look-up table
 Implemented using look-up tables in the encoder and in the
decoder
 Provides substantial lossless compression, at the cost of
some additional computation
 Used by MPEG-1 Layer III and MPEG-2 AAC
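A compact sketch of Huffman coding using Python's standard `heapq` module is shown below. It builds the code table from the data itself, whereas audio codecs typically ship predefined tables, so treat it as an illustration of the principle only. The function names are invented for this example.

```python
import heapq
from collections import Counter


def build_codes(symbols):
    """Return a {symbol: bit string} table; frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for that subtree).
    heap = [(count, index, {symbol: ""})
            for index, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count1, _, codes1 = heapq.heappop(heap)  # two least frequent subtrees
        count2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (count1 + count2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, codes):
    return "".join(codes[s] for s in symbols)


def decode(bits, codes):
    lookup = {code: symbol for symbol, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # codes are prefix-free, so the first match is right
            out.append(lookup[current])
            current = ""
    return out


data = [0, 0, 0, 0, 0, 1, 1, -1, 2]
codes = build_codes(data)
bits = encode(data, codes)
print(len(bits))                    # 15 bits for 9 symbols
print(decode(bits, codes) == data)  # True
```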
23
Encoding - Final Stages
 Audio data packed into frames
 Frames stored or transmitted
24
Questions


Editor's Notes

  1. Hello. Today I will talk about the techniques commonly used for digital audio compression across various audio file formats.
  2. I will discuss the difference between redundant and irrelevant information later in the presentation. Whether the goal is storage or transmission, compression optimizes the size of the audio data.