SlideShare ist ein Scribd-Unternehmen logo
1 von 15
Mathematical 
Preliminaries 
1
 The development of data compression algorithms 
for a variety of data can be divided into two phases. 
 Modeling 
 Coding 
 In modeling phase we try to extract information 
about any redundancy that exists in the data and 
describe the redundancy in the form of a model. 
2
 A description of the model and a “description” of 
how the data differ from the model are coded, 
generally using a binary alphabet. 
 The difference between the data and the model is 
often referred to as the residual. 
 In the following three examples we will look at three 
different ways that data can be modeled. We will 
then use the model to obtain compression. 
3
 If binary representations of these numbers is to be 
transmitted or stored, we would need to use 5 bits per 
sample. 
 By exploiting the structure in the data, we can represent 
 the sequence using fewer bits. 
 If we plot this data as shown in following Figure 
4
 We see that the data 
seem to fall on a straight 
line. 
 A model for the data 
could therefore be a 
straight line given by the 
equation: 
5
 To examine the difference between the data and the 
model. The difference (or residual) is computed: 
 En =An − Ān = 0 1 0 −1 1 −1 0 1 −1 −1 1 1 
 The residual sequence consists of only three numbers −1 0 1. 
 If we assign a code of 00 to −1, a code of 01 to 0, and a code of 
10 to 1, we need to use 2 bits to represent each element of the 
residual sequence. 
 Therefore, we can obtain compression by transmitting or 
storing the parameters of the model and the residual 
sequence. 
 The encoding can be exact if the required compression is to be 
lossless, or approximate if the compression can be lossy. 
6
7
 The number of distinct values has been reduced. 
 Fewer bits are required to represent each number and 
compression is achieved. 
 The decoder adds each received value to the previous decoded 
value to obtain the reconstruction corresponding to the received 
value. 
8
 The sequence is made up of eight different symbols. 
 we need to use 3 bits per symbol. 
 Say we have assigned a codeword with only a single 
bit to the symbol that occurs most often, and 
correspondingly longer codewords to symbols that 
occur less often. 
9
10
 If we substitute the codes for each symbol, 
we will use 106 bits to encode the entire 
sequence. 
 As there are 41 symbols in the sequence, this 
works out to approximately 2.58 bits per 
symbol. 
 This means we have obtained a compression 
ratio of 1.16:1. 
11
 Information 
 Amount of Information 
 Entropy 
 Maximum Entropy 
 Condition for maximum entropy 
12
 Compression is achieved by removing data redundancy 
while preserving information content. 
 The information content of a group of bytes (a 
message). 
 Entropy is the measure of information content in a 
message. 
 Messages with higher entropy carry more information 
than messages with lower entropy. 
 Data with low entropy permit a larger compression ratio 
than data with high entropy. 
13
 How to determine the entropy 
 Find the probability p(x) of symbol x in the message 
 The entropy H(x) of the symbol x is: 
H(x) = - p(x) • log2p(x) 
 The average entropy over the entire message is the 
sum of the entropy of all n symbols in the message. 
14
15

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Data compression
Data compressionData compression
Data compression
 
Introduction Data Compression/ Data compression, modelling and coding,Image C...
Introduction Data Compression/ Data compression, modelling and coding,Image C...Introduction Data Compression/ Data compression, modelling and coding,Image C...
Introduction Data Compression/ Data compression, modelling and coding,Image C...
 
Data compression
Data compressionData compression
Data compression
 
Data compression
Data compressionData compression
Data compression
 
Data Compression - Text Compression - Run Length Encoding
Data Compression - Text Compression - Run Length EncodingData Compression - Text Compression - Run Length Encoding
Data Compression - Text Compression - Run Length Encoding
 
data compression.
data compression.data compression.
data compression.
 
Data compression
Data  compressionData  compression
Data compression
 
Data Compression
Data CompressionData Compression
Data Compression
 
Arithmetic Coding
Arithmetic CodingArithmetic Coding
Arithmetic Coding
 
4 data compression
4 data compression4 data compression
4 data compression
 
Introduction for Data Compression
Introduction for Data Compression Introduction for Data Compression
Introduction for Data Compression
 
lecture on data compression
lecture on data compressionlecture on data compression
lecture on data compression
 
Text compression
Text compressionText compression
Text compression
 
Compression techniques
Compression techniquesCompression techniques
Compression techniques
 
Chapter%202%20 %20 Text%20compression(2)
Chapter%202%20 %20 Text%20compression(2)Chapter%202%20 %20 Text%20compression(2)
Chapter%202%20 %20 Text%20compression(2)
 
Lossless
LosslessLossless
Lossless
 
Lzw coding technique for image compression
Lzw coding technique for image compressionLzw coding technique for image compression
Lzw coding technique for image compression
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
 
Dictionary Based Compression
Dictionary Based CompressionDictionary Based Compression
Dictionary Based Compression
 
Interpixel redundancy
Interpixel redundancyInterpixel redundancy
Interpixel redundancy
 

Ähnlich wie Data Compression Algorithms Explained

Lecture Notes: EEEC6440315 Communication Systems - Information Theory
Lecture Notes:  EEEC6440315 Communication Systems - Information TheoryLecture Notes:  EEEC6440315 Communication Systems - Information Theory
Lecture Notes: EEEC6440315 Communication Systems - Information TheoryAIMST University
 
Implementation of Lossless Compression Algorithms for Text Data
Implementation of Lossless Compression Algorithms for Text DataImplementation of Lossless Compression Algorithms for Text Data
Implementation of Lossless Compression Algorithms for Text DataBRNSSPublicationHubI
 
4_Datalink__Error_Detection_and Correction.pdf
4_Datalink__Error_Detection_and Correction.pdf4_Datalink__Error_Detection_and Correction.pdf
4_Datalink__Error_Detection_and Correction.pdfkenilpatel65
 
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...Helan4
 
Neuro genetic key based recursive modulo 2 substitution using mutated charact...
Neuro genetic key based recursive modulo 2 substitution using mutated charact...Neuro genetic key based recursive modulo 2 substitution using mutated charact...
Neuro genetic key based recursive modulo 2 substitution using mutated charact...ijcsity
 
Huffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptHuffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptPrincessSaro
 
Introduction to data compression.pptx
Introduction to data compression.pptxIntroduction to data compression.pptx
Introduction to data compression.pptxjesna40
 
Advanced compression technique for random data
Advanced compression technique for random dataAdvanced compression technique for random data
Advanced compression technique for random dataijitjournal
 
Data representation computer architecture
Data representation  computer architectureData representation  computer architecture
Data representation computer architecturestudy cse
 
arithmetic coding for signal and multimedia processing
arithmetic coding for signal and multimedia processingarithmetic coding for signal and multimedia processing
arithmetic coding for signal and multimedia processingKaransingh415757
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding09lavee
 
Huffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisHuffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisRamakant Soni
 
chapter one && two.pdf
chapter one && two.pdfchapter one && two.pdf
chapter one && two.pdfmiftah88
 
07 Data Link LayerError Control.pdf
07 Data Link LayerError Control.pdf07 Data Link LayerError Control.pdf
07 Data Link LayerError Control.pdfbaysahcmjames2kblax
 

Ähnlich wie Data Compression Algorithms Explained (20)

Lecture Notes: EEEC6440315 Communication Systems - Information Theory
Lecture Notes:  EEEC6440315 Communication Systems - Information TheoryLecture Notes:  EEEC6440315 Communication Systems - Information Theory
Lecture Notes: EEEC6440315 Communication Systems - Information Theory
 
Lecft3data
Lecft3dataLecft3data
Lecft3data
 
Implementation of Lossless Compression Algorithms for Text Data
Implementation of Lossless Compression Algorithms for Text DataImplementation of Lossless Compression Algorithms for Text Data
Implementation of Lossless Compression Algorithms for Text Data
 
4_Datalink__Error_Detection_and Correction.pdf
4_Datalink__Error_Detection_and Correction.pdf4_Datalink__Error_Detection_and Correction.pdf
4_Datalink__Error_Detection_and Correction.pdf
 
B041306015
B041306015B041306015
B041306015
 
Compression ii
Compression iiCompression ii
Compression ii
 
Compression Ii
Compression IiCompression Ii
Compression Ii
 
Compression Ii
Compression IiCompression Ii
Compression Ii
 
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...
2.3 unit-ii-text-compression-a-outline-compression-techniques-run-length-codi...
 
Neuro genetic key based recursive modulo 2 substitution using mutated charact...
Neuro genetic key based recursive modulo 2 substitution using mutated charact...Neuro genetic key based recursive modulo 2 substitution using mutated charact...
Neuro genetic key based recursive modulo 2 substitution using mutated charact...
 
Huffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.pptHuffman&Shannon-multimedia algorithms.ppt
Huffman&Shannon-multimedia algorithms.ppt
 
Introduction to data compression.pptx
Introduction to data compression.pptxIntroduction to data compression.pptx
Introduction to data compression.pptx
 
Advanced compression technique for random data
Advanced compression technique for random dataAdvanced compression technique for random data
Advanced compression technique for random data
 
Data representation computer architecture
Data representation  computer architectureData representation  computer architecture
Data representation computer architecture
 
arithmetic coding for signal and multimedia processing
arithmetic coding for signal and multimedia processingarithmetic coding for signal and multimedia processing
arithmetic coding for signal and multimedia processing
 
Arithmetic coding
Arithmetic codingArithmetic coding
Arithmetic coding
 
Cs8591 Computer Networks
Cs8591 Computer NetworksCs8591 Computer Networks
Cs8591 Computer Networks
 
Huffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysisHuffman and Arithmetic coding - Performance analysis
Huffman and Arithmetic coding - Performance analysis
 
chapter one && two.pdf
chapter one && two.pdfchapter one && two.pdf
chapter one && two.pdf
 
07 Data Link LayerError Control.pdf
07 Data Link LayerError Control.pdf07 Data Link LayerError Control.pdf
07 Data Link LayerError Control.pdf
 

Kürzlich hochgeladen

An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvLewisJB
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfAsst.prof M.Gokilavani
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsSachinPawar510423
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 

Kürzlich hochgeladen (20)

An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdfCCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
CCS355 Neural Network & Deep Learning Unit II Notes with Question bank .pdf
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Vishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documentsVishratwadi & Ghorpadi Bridge Tender documents
Vishratwadi & Ghorpadi Bridge Tender documents
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 

Data Compression Algorithms Explained

  • 2.  The development of data compression algorithms for a variety of data can be divided into two phases.  Modeling  Coding  In modeling phase we try to extract information about any redundancy that exists in the data and describe the redundancy in the form of a model. 2
  • 3.  A description of the model and a “description” of how the data differ from the model are coded, generally using a binary alphabet.  The difference between the data and the model is often referred to as the residual.  In the following three examples we will look at three different ways that data can be modeled. We will then use the model to obtain compression. 3
  • 4.  If binary representations of these numbers is to be transmitted or stored, we would need to use 5 bits per sample.  By exploiting the structure in the data, we can represent  the sequence using fewer bits.  If we plot this data as shown in following Figure 4
  • 5.  We see that the data seem to fall on a straight line.  A model for the data could therefore be a straight line given by the equation: 5
  • 6.  To examine the difference between the data and the model. The difference (or residual) is computed:  En =An − Ān = 0 1 0 −1 1 −1 0 1 −1 −1 1 1  The residual sequence consists of only three numbers −1 0 1.  If we assign a code of 00 to −1, a code of 01 to 0, and a code of 10 to 1, we need to use 2 bits to represent each element of the residual sequence.  Therefore, we can obtain compression by transmitting or storing the parameters of the model and the residual sequence.  The encoding can be exact if the required compression is to be lossless, or approximate if the compression can be lossy. 6
  • 7. 7
  • 8.  The number of distinct values has been reduced.  Fewer bits are required to represent each number and compression is achieved.  The decoder adds each received value to the previous decoded value to obtain the reconstruction corresponding to the received value. 8
  • 9.  The sequence is made up of eight different symbols.  we need to use 3 bits per symbol.  Say we have assigned a codeword with only a single bit to the symbol that occurs most often, and correspondingly longer codewords to symbols that occur less often. 9
  • 10. 10
  • 11.  If we substitute the codes for each symbol, we will use 106 bits to encode the entire sequence.  As there are 41 symbols in the sequence, this works out to approximately 2.58 bits per symbol.  This means we have obtained a compression ratio of 1.16:1. 11
  • 12.  Information  Amount of Information  Entropy  Maximum Entropy  Condition for maximum entropy 12
  • 13.  Compression is achieved by removing data redundancy while preserving information content.  The information content of a group of bytes (a message).  Entropy is the measure of information content in a message.  Messages with higher entropy carry more information than messages with lower entropy.  Data with low entropy permit a larger compression ratio than data with high entropy. 13
  • 14.  How to determine the entropy  Find the probability p(x) of symbol x in the message  The entropy H(x) of the symbol x is: H(x) = - p(x) • log2p(x)  The average entropy over the entire message is the sum of the entropy of all n symbols in the message. 14
  • 15. 15