Data Compression
Muhammad Raza Master (B12101085)
Muhammad Ali Mehmood (B12101065)
Syed Faraz Naqvi (B12101123)
-Department of Computer Science, University of Karachi
Reduction in size of data
Save storage when saving information
Save time when communicating information
Types of compression: Lossless and Lossy
• Image Compression
• Audio Compression
• Video Compression
• All Sorts of Data Compression
TREE
Internal node:
• Sum of children’s frequency
• Reference to child in the binary tree (0/1)
Leaf node:
• Char variable
• Frequency
• Reference in the binary tree (0/1)
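The node layout above could be modelled roughly as follows. This is a minimal sketch: the class name `Node`, its field names, and the helper `join` are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a Huffman tree (hypothetical layout)."""
    freq: int                       # leaf: character count; internal: sum of children
    char: Optional[str] = None      # set only on leaf nodes
    left: Optional["Node"] = None   # child reached by bit 0
    right: Optional["Node"] = None  # child reached by bit 1

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def join(a: Node, b: Node) -> Node:
    """Build an internal node whose frequency is the sum of its children."""
    return Node(freq=a.freq + b.freq, left=a, right=b)


parent = join(Node(1, "H"), Node(1, "u"))
print(parent.freq, parent.is_leaf())  # 2 False
```

Keeping the character optional lets one class serve both node kinds; a stricter design might use two separate classes.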
APPLICATION
• Find an object with a certain property in a collection of objects of a certain type
• Store the items of a list so that an item can be easily located
• Efficient encoding of a set of characters by bit strings
TRAVERSING IN TREE
• IN-ORDER TRAVERSAL
• PREORDER TRAVERSAL
• POSTORDER TRAVERSAL
Example binary tree (one level per line, root first):
25
15 50
10 22 35 70
4 12 18 24 31 44 66 90
Step  Pre-Order                   In-Order                    Post-Order
1.    Visit the root              Traverse the left subtree   Traverse the left subtree
2.    Traverse the left subtree   Visit the root              Traverse the right subtree
3.    Traverse the right subtree  Traverse the right subtree  Visit the root
Pre-Order: 25, 15, 10, 4, 12, 22, 18, 24, 50, 35, 31, 44, 70, 66, 90
In-Order: 4, 10, 12, 15, 18, 22, 24, 25, 31, 35, 44, 50, 66, 70, 90
Post-Order: 4, 12, 10, 18, 24, 22, 15, 31, 44, 35, 66, 90, 70, 50, 25
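The three traversal orders can be reproduced with a few lines of Python. This is only a sketch: the nested-tuple representation `(value, left, right)` and the function names are assumptions chosen for brevity.

```python
def leaf(value):
    """A node with no children."""
    return (value, None, None)


# The example tree: 25 at the root, 15 and 50 below it, and so on.
tree = (25,
        (15, (10, leaf(4), leaf(12)), (22, leaf(18), leaf(24))),
        (50, (35, leaf(31), leaf(44)), (70, leaf(66), leaf(90))))


def pre_order(node):
    if node is None:
        return []
    value, left, right = node
    return [value] + pre_order(left) + pre_order(right)


def in_order(node):
    if node is None:
        return []
    value, left, right = node
    return in_order(left) + [value] + in_order(right)


def post_order(node):
    if node is None:
        return []
    value, left, right = node
    return post_order(left) + post_order(right) + [value]


print(pre_order(tree))
print(in_order(tree))
print(post_order(tree))
```

The only difference between the three functions is where the root's value is placed relative to the two recursive calls.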
Introduction
• By Dr. David Huffman (1952)
• One of the earliest data compression algorithms (Shannon-Fano coding came before it)
• An example of ‘LOSSLESS DATA COMPRESSION’
• A binary tree is used to construct the Huffman encoding
Basic Idea
The most frequently occurring character gets the shortest code.
Save bits by encoding frequently used characters with fewer bits than rarely used characters
HUFFMAN(X)
• Compute frequency f(c) for each character c in X
• Let Q be an empty priority queue
• Insert every character c into Q as a singleton tree with key f(c)
• while Q.SIZE() > 1 do
  – f1 ← Q.MIN-KEY()
  – T1 ← Q.REMOVE-MIN()
  – f2 ← Q.MIN-KEY()
  – T2 ← Q.REMOVE-MIN()
  – Let T be a new tree with left subtree T1 and right subtree T2
  – Q.INSERT(T, f1 + f2)
• Return Q.REMOVE-MIN()
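The pseudocode above could be written in Python along these lines. This is a sketch under stated assumptions: the function name `huffman_codes`, the use of `heapq` as the priority queue, and the tuple-based tree are choices made here, not part of the slides. Ties between equal frequencies may be broken differently from the slides, so individual codes can differ while the total encoded length stays the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return {symbol: bit string} for the symbols of `text`."""
    freq = Counter(text)
    if not freq:
        return {}
    tie = count()  # unique tie-breaker so the heap never compares trees
    # A tree is either a symbol (leaf) or a (left, right) pair.
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # two lowest-frequency trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))
    _, _, root = heap[0]

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left edge is labelled 0
            walk(tree[1], prefix + "1")  # right edge is labelled 1
        else:
            codes[tree] = prefix or "0"  # input with a single distinct symbol
    walk(root, "")
    return codes


message = "HumeraTariq"
codes = huffman_codes(message)
print(sum(len(codes[c]) for c in message))  # 35
```

Because every Huffman tree for a given input has the same minimal total length, the printed bit count does not depend on how ties are resolved.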
Example:
it was the best of times it was the worst of times.
(the full stop stands for LF)
Symbol Count
LF 1
b 1
r 1
f 2
h 2
m 2
a 2
w 3
o 3
i 4
e 5
s 6
t 8
space 11
Symbol Bits
LF 101010
b 101011
r 10100
f 11000
h 11001
m 11010
a 11011
w 0010
o 0011
i 1011
e 000
s 100
t 111
space 01
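The code table above can be checked mechanically. The sketch below assumes the final symbol of the text is LF (written as `"\n"`), as the example states; the helper name `is_prefix_free` is an illustrative assumption.

```python
text = "it was the best of times it was the worst of times\n"

codes = {
    "\n": "101010", "b": "101011", "r": "10100", "f": "11000",
    "h": "11001", "m": "11010", "a": "11011", "w": "0010",
    "o": "0011", "i": "1011", "e": "000", "s": "100",
    "t": "111", " ": "01",
}


def is_prefix_free(table):
    """True if no code word is a prefix of another code word."""
    words = sorted(table.values())
    # After sorting, any prefix sits directly before a word it prefixes.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


encoded = "".join(codes[ch] for ch in text)

# A fixed-length code for 14 distinct symbols needs 4 bits per symbol.
fixed_bits = len(text) * (len(codes) - 1).bit_length()

print(is_prefix_free(codes))  # True
print(len(encoded))           # 176
print(fixed_bits)             # 204
```

The comparison shows the saving: 176 bits with the variable-length codes against 204 bits with a 4-bit fixed-length code.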
Example 1:
HumeraTariq
Symbol Count
H 1
u 1
m 1
e 1
r 2
a 2
T 1
i 1
q 1
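Huffman tree for HumeraTariq (numbers are node frequencies; each edge is labelled 0 for left, 1 for right):
11
  0: 7
    0: 4
      0: 2
        0: H
        1: u
      1: 2
        0: m
        1: e
    1: 3
      0: 2
        0: T
        1: i
      1: q
  1: 4
    0: r
    1: a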
m = HumeraTariq
Symbol Bits
H 0000
u 0001
m 0010
e 0011
r 10
a 11
T 0100
i 0101
q 011
Compressed Bit-stream
C(m) = 00000001001000111011010011100101011
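Encoding and decoding with such a table can be sketched as below. One assumption is made explicit: 'q' is given the 3-bit code `011`, which is what its depth of 3 in the tree implies. The function names `encode` and `decode` are illustrative.

```python
codes = {
    "H": "0000", "u": "0001", "m": "0010", "e": "0011",
    "r": "10", "a": "11", "T": "0100", "i": "0101",
    "q": "011",  # assumption: 3 bits, matching q's depth in the tree
}


def encode(message, table):
    """Concatenate the code word of every symbol."""
    return "".join(table[ch] for ch in message)


def decode(bits, table):
    """Read bits left to right, emitting a symbol whenever a code word completes."""
    by_code = {code: sym for sym, code in table.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in by_code:
            out.append(by_code[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a code word")
    return "".join(out)


bits = encode("HumeraTariq", codes)
print(len(bits))            # 35
print(decode(bits, codes))  # HumeraTariq
```

Greedy decoding works because the table is prefix-free: once the buffer matches a code word, no longer code word can start with the same bits.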
The length of the encoded bit-stream is the sum, over all letters, of the number of occurrences times the number of bits per occurrence
Length of compressed bit-stream = sum of (frequency × distance), where distance is the depth of the letter’s leaf in the tree
E.g.: m = HumeraTariq
• At distance:
– 4: six leaves (‘H’, ‘u’, ‘m’, ‘e’, ‘T’, ‘i’, with total frequency 6)
– 3: one leaf (‘q’, with frequency 1)
– 2: two leaves (‘r’ and ‘a’, with total frequency 4)
• Length of compressed bit-stream = sum of (frequency × distance)
• total = 4·6 + 3·1 + 2·4 = 35 bits is the length of the compressed bit-stream, as expected
Proved!
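The same arithmetic can be checked in a few lines. This sketch takes the leaf depths from the bullet list above; the variable names are illustrative.

```python
from collections import Counter

message = "HumeraTariq"

# Depth (distance from the root) of each leaf, as listed above.
depth = {"H": 4, "u": 4, "m": 4, "e": 4, "T": 4, "i": 4,
         "q": 3, "r": 2, "a": 2}

freq = Counter(message)

# Sum of frequency * distance over all distinct symbols.
total_bits = sum(freq[sym] * depth[sym] for sym in freq)
print(total_bits)  # 35
```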
Let d be the number of distinct symbols and n be the length of the input
Huffman’s algorithm runs in O(n + d log d) time
We can apply it to any byte stream
Milestone of LZW compression
REFERENCES
• Robert Sedgewick and Kevin Wayne, Algorithms (4th edition)
• https://blog.itu.dk/BADS-F2009/files/2009/04/46-huffman.pdf
• Kenneth H. Rosen, Discrete Mathematics and Its Applications (7th edition)
Data Compression (Huffman)
