Python & Perl
Lecture 06

Department of Computer Science
Utah State University
Outline
● Data Abstraction: Building Huffman Trees with Lists and Tuples
● List Comprehension
Data Abstraction
Building Huffman Trees with Lists and Tuples
Background
● In information theory, coding refers to methods that represent data in terms of bit sequences (sequences of 0's and 1's)
● Encoding is a method of taking data structures and mapping them to bit sequences
● Decoding is a method of taking bit sequences and outputting the corresponding data structure
Example: Standard ASCII & Unicode
● Standard ASCII encodes each character as a 7-bit sequence
● Using 7 bits allows us to encode 2^7 = 128 possible characters
● Unicode has three standard encoding forms: UTF-8 (built from 8-bit units), UTF-16 (built from 16-bit units), and UTF-32 (built from 32-bit units)
● UTF stands for Unicode Transformation Format
● Python 2.X's Unicode support: “Python represents Unicode strings as either 16- or 32-bit integers, depending on how the Python interpreter was compiled.”
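The 7-bit arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch in Python 3 syntax; the helper name `to_ascii_bits` is made up for this example and is not part of the lecture code.

```python
# Number of distinct 7-bit sequences
print(2 ** 7)  # 128

def to_ascii_bits(ch):
    """Return the 7-bit pattern of a standard ASCII character as a string."""
    code = ord(ch)
    if code >= 128:
        raise ValueError("not a standard ASCII character: %r" % ch)
    # '07b' formats the code as binary, zero-padded to 7 digits
    return format(code, '07b')

for ch in 'AaZ':
    print(ch, to_ascii_bits(ch))
```

Every character comes out as exactly 7 bits, which is what makes standard ASCII a fixed-length code.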
Two Types of Codes
● There are two types of codes: fixed-length and variable-length
● Fixed-length codes (e.g., ASCII, UTF-32) encode every character with the same number of bits
● Variable-length codes (e.g., Morse, Huffman) encode characters with varying numbers of bits: more frequent symbols are encoded with fewer bits
Example: Fixed-Length Code
● A – 000   C – 010   E – 100   G – 110
● B – 001   D – 011   F – 101   H – 111
● AADF = 000000011101
● The encoding of AADF is 12 bits
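The fixed-length table above can be tried out with a small sketch. It is written in Python 3 syntax, and the names `FIXED_CODE`, `encode_fixed`, and `decode_fixed` are assumptions made for this illustration, not part of the lecture code.

```python
# Fixed-length table from the slide: every symbol gets exactly 3 bits
FIXED_CODE = {'A': '000', 'B': '001', 'C': '010', 'D': '011',
              'E': '100', 'F': '101', 'G': '110', 'H': '111'}

def encode_fixed(message, code=FIXED_CODE):
    """Concatenate the bit pattern of every symbol in message."""
    return ''.join(code[symbol] for symbol in message)

def decode_fixed(bits, code=FIXED_CODE, width=3):
    """Cut bits into width-sized chunks and look each chunk up."""
    reverse = dict((pattern, symbol) for symbol, pattern in code.items())
    chunks = [bits[i:i + width] for i in range(0, len(bits), width)]
    return ''.join(reverse[chunk] for chunk in chunks)

bits = encode_fixed('AADF')
print(bits, len(bits))     # 000000011101 12
print(decode_fixed(bits))  # AADF
```

Decoding is trivial here because every code word has the same width, so the boundaries between characters are known in advance.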
Example: Variable-Length Code
● A – 0     C – 1010   E – 1100   G – 1110
● B – 100   D – 1011   F – 1101   H – 1111
● AADF = 0010111101
● The encoding of AADF is 10 bits
End of Character in Variable-Length Code
● One of the challenges in variable-length codes is knowing where one character ends and the next one begins
● Morse uses a special character (separator code)
● Prefix coding is another solution: no character's code is a prefix of another character's code
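To see why the prefix property removes the need for a separator, here is a minimal sketch in Python 3 syntax that uses the variable-length table from the earlier example. The names `VAR_CODE`, `is_prefix_free`, and `decode_prefix` are hypothetical helpers introduced only for this illustration.

```python
# Variable-length table from the earlier slide
VAR_CODE = {'A': '0', 'B': '100', 'C': '1010', 'D': '1011',
            'E': '1100', 'F': '1101', 'G': '1110', 'H': '1111'}

def is_prefix_free(code):
    """True if no code word is a prefix of a different code word."""
    patterns = list(code.values())
    for p in patterns:
        for q in patterns:
            if p != q and q.startswith(p):
                return False
    return True

def decode_prefix(bits, code):
    """Read bits left to right and emit a symbol as soon as the bits
    collected so far form a complete code word."""
    reverse = dict((pattern, symbol) for symbol, pattern in code.items())
    decoded = []
    pending = ''
    for bit in bits:
        pending += bit
        if pending in reverse:
            decoded.append(reverse[pending])
            pending = ''
    if pending:
        raise ValueError('leftover bits: ' + pending)
    return ''.join(decoded)

print(is_prefix_free(VAR_CODE))               # True
print(decode_prefix('0010111101', VAR_CODE))  # AADF
```

Because no code word starts another one, the first match while reading left to right is always the right one.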
Huffman Code
● Huffman code is a variable-length code that takes advantage of relative frequencies of characters
● Huffman code is named after David Huffman, the researcher who discovered it
● Huffman code is represented as a binary tree where leaves are individual characters and their frequencies
● Each non-leaf node is a set of characters in all of its subnodes and the sum of their relative frequencies
Huffman Tree Example
{A, B, C, D, E, F, G, H}: 17
  0: A: 8
  1: {B, C, D, E, F, G, H}: 9
    0: {B, C, D}: 5
      0: B: 3
      1: {C, D}: 2
        0: C: 1
        1: D: 1
    1: {E, F, G, H}: 4
      0: {E, F}: 2
        0: E: 1
        1: F: 1
      1: {G, H}: 2
        0: G: 1
        1: H: 1
Using Huffman Tree to Encode/Decode Characters
● With the tree on the previous slide, the encodings are:
  A is encoded as 0
  B is encoded as 100
  C is encoded as 1010
  D is encoded as 1011
  E is encoded as 1100
  F is encoded as 1101
  G is encoded as 1110
  H is encoded as 1111
Building The Huffman Tree
Simple Huffman Tree
{A, B, D, C}: 8
  A: 4
  {B, D, C}: 4
    B: 2
    {D, C}: 2
      D: 1
      C: 1
Constructing Leaves
### a leaf is a tuple whose first element is the symbol
### represented as a string and whose second element is
### the symbol's frequency
def make_leaf(symbol, freq):
    return (symbol, freq)

def is_leaf(x):
    return (isinstance(x, tuple) and
            len(x) == 2 and
            isinstance(x[0], str) and
            isinstance(x[1], int))
Constructing Leaves
### return the character (symbol) of the leaf
def get_leaf_symbol(leaf):
    return leaf[0]

### return the frequency of the leaf's character
def get_leaf_freq(leaf):
    return leaf[1]
Constructing Huffman Trees
### A non-leaf node (internal node) is represented as
### a list of four elements:
### 1. left branch
### 2. right branch
### 3. list of symbols
### 4. combined frequency of symbols
[left_branch, right_branch, symbols, frequency]
Accessing Huffman Trees
def get_leaf_symbol(leaf):
    return leaf[0]

def get_leaf_freq(leaf):
    return leaf[1]

def get_left_branch(huff_tree):
    return huff_tree[0]

def get_right_branch(huff_tree):
    return huff_tree[1]
Accessing Huffman Trees
def get_symbols(huff_tree):
    if is_leaf(huff_tree):
        return [get_leaf_symbol(huff_tree)]
    else:
        return huff_tree[2]

def get_freq(huff_tree):
    if is_leaf(huff_tree):
        return get_leaf_freq(huff_tree)
    else:
        return huff_tree[3]
Constructing Huffman Trees
### A Huffman tree is constructed from its left branch, which can
### be a Huffman tree or a leaf, and its right branch, another
### Huffman tree or a leaf. The new tree has the symbols of both
### branches and the sum of the frequencies of both branches
def make_huffman_tree(left_branch, right_branch):
    return [left_branch,
            right_branch,
            get_symbols(left_branch) + get_symbols(right_branch),
            get_freq(left_branch) + get_freq(right_branch)]
MAKE_HUFFMAN_TREE Example
ht01 = make_huffman_tree(make_leaf('A', 4),
                         make_huffman_tree(make_leaf('B', 2),
                                           make_huffman_tree(make_leaf('D', 1),
                                                             make_leaf('C', 1))))

{A, B, D, C}: 8
  A: 4
  {B, D, C}: 4
    B: 2
    {D, C}: 2
      D: 1
      C: 1
MAKE_HUFFMAN_TREE Example
Python data structure that represents the Huffman tree below:
[('A', 4),
 [('B', 2), [('D', 1), ('C', 1), ['D', 'C'], 2], ['B', 'D', 'C'], 4],
 ['A', 'B', 'D', 'C'],
 8]

{A, B, D, C}: 8
  A: 4
  {B, D, C}: 4
    B: 2
    {D, C}: 2
      D: 1
      C: 1
Customizing sort()
def leaf_freq_comp(leaf1, leaf2):
    return cmp(get_leaf_freq(leaf1),
               get_leaf_freq(leaf2))

huff_leaves = [make_leaf('A', 8), make_leaf('C', 1), make_leaf('B', 3),
               make_leaf('D', 1), make_leaf('F', 1), make_leaf('E', 1),
               make_leaf('H', 1), make_leaf('G', 1)]

print huff_leaves
huff_leaves.sort(leaf_freq_comp)
print huff_leaves
OUTPUT:
[('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)]
[('C', 1), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1), ('B', 3), ('A', 8)]
Customizing sort()
def leaf_symbol_comp(leaf1, leaf2):
    return cmp(get_leaf_symbol(leaf1),
               get_leaf_symbol(leaf2))

huff_leaves2 = [make_leaf('A', 8), make_leaf('C', 1), make_leaf('B', 3),
                make_leaf('D', 1), make_leaf('F', 1), make_leaf('E', 1),
                make_leaf('H', 1), make_leaf('G', 1)]

print huff_leaves2
huff_leaves2.sort(leaf_symbol_comp)
print huff_leaves2
OUTPUT:
[('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)]
[('A', 8), ('B', 3), ('C', 1), ('D', 1), ('E', 1), ('F', 1), ('G', 1), ('H', 1)]
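The two sorting slides rely on Python 2.X features: the built-in `cmp` and a comparison function passed to `sort()`. Neither exists in Python 3. The sketch below shows one possible Python 3 equivalent using `key=` and `functools.cmp_to_key`; the leaf helpers are repeated so the block runs on its own, and the variable names `by_freq`, `by_symbol`, and `by_freq_cmp` are made up for this example.

```python
from functools import cmp_to_key

def make_leaf(symbol, freq):
    return (symbol, freq)

def get_leaf_symbol(leaf):
    return leaf[0]

def get_leaf_freq(leaf):
    return leaf[1]

huff_leaves = [make_leaf('A', 8), make_leaf('C', 1), make_leaf('B', 3),
               make_leaf('D', 1), make_leaf('F', 1), make_leaf('E', 1),
               make_leaf('H', 1), make_leaf('G', 1)]

# key= calls the accessor once per element and sorts by its result
by_freq = sorted(huff_leaves, key=get_leaf_freq)
by_symbol = sorted(huff_leaves, key=get_leaf_symbol)

# An old-style comparison function can still be used via cmp_to_key
def leaf_freq_comp(leaf1, leaf2):
    f1, f2 = get_leaf_freq(leaf1), get_leaf_freq(leaf2)
    return (f1 > f2) - (f1 < f2)  # same sign as Python 2's cmp(f1, f2)

by_freq_cmp = sorted(huff_leaves, key=cmp_to_key(leaf_freq_comp))

print(by_freq)
print(by_symbol)
print(by_freq == by_freq_cmp)  # True
```

Python's sort is stable, so leaves with equal frequency keep their original relative order, which matches the output shown on the slide.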
Encoding & Decoding Messages with Huffman Trees
Sample Huffman Tree
{A, B, C, D, E, F, G, H}: 17
  0: A: 8
  1: {B, C, D, E, F, G, H}: 9
    0: {B, C, D}: 5
      0: B: 3
      1: {C, D}: 2
        0: C: 1
        1: D: 1
    1: {E, F, G, H}: 4
      0: {E, F}: 2
        0: E: 1
        1: F: 1
      1: {G, H}: 2
        0: G: 1
        1: H: 1
Symbol Encoding
1. Given a symbol s and a Huffman tree ht, set current_node to the root node and encoding to an empty list (you can also check if s is in the root node's symbol list and, if not, signal error)
2. If current_node is a leaf, return encoding
3. Check if s is in current_node's left branch or right branch
4. If in the left, add 0 to encoding, set current_node to the root of the left branch, and go to step 2
5. If in the right, add 1 to encoding, set current_node to the root of the right branch, and go to step 2
6. If in neither branch, signal error
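The six steps above can be sketched as follows. This is one possible implementation in Python 3 syntax, not necessarily the one used in the course: `encode_symbol` and `build_sample_tree` are names assumed for this illustration, `is_leaf` is simplified to a tuple check, and the tree helpers are repeated so the block runs on its own.

```python
def make_leaf(symbol, freq):
    return (symbol, freq)

def is_leaf(x):
    # simplified check: leaves are 2-tuples, internal nodes are lists
    return isinstance(x, tuple) and len(x) == 2

def get_symbols(tree):
    return [tree[0]] if is_leaf(tree) else tree[2]

def get_freq(tree):
    return tree[1] if is_leaf(tree) else tree[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

def build_sample_tree():
    """Rebuild the sample tree (A: 8, B: 3, C..H: 1 each) by hand."""
    cd = make_huffman_tree(make_leaf('C', 1), make_leaf('D', 1))
    bcd = make_huffman_tree(make_leaf('B', 3), cd)
    ef = make_huffman_tree(make_leaf('E', 1), make_leaf('F', 1))
    gh = make_huffman_tree(make_leaf('G', 1), make_leaf('H', 1))
    efgh = make_huffman_tree(ef, gh)
    return make_huffman_tree(make_leaf('A', 8),
                             make_huffman_tree(bcd, efgh))

def encode_symbol(s, ht):
    # Step 1: start at the root with an empty encoding
    if s not in get_symbols(ht):
        raise ValueError('symbol not in tree: %r' % (s,))
    encoding = []
    current_node = ht
    # Step 2: stop as soon as a leaf is reached
    while not is_leaf(current_node):
        left, right = current_node[0], current_node[1]
        if s in get_symbols(left):       # Step 4: go left, emit 0
            encoding.append(0)
            current_node = left
        elif s in get_symbols(right):    # Step 5: go right, emit 1
            encoding.append(1)
            current_node = right
        else:                            # Step 6: in neither branch
            raise ValueError('symbol not in either branch: %r' % (s,))
    return encoding

sample = build_sample_tree()
print(encode_symbol('B', sample))  # [1, 0, 0]
print(encode_symbol('A', sample))  # [0]
```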
Example
● Encode B with the sample Huffman tree
● Set current_node to the root node
● B is in current_node's right branch, so add 1 to encoding & recurse into the right branch (current_node is {B, C, D, E, F, G, H}: 9)
● B is in current_node's left branch, so add 0 to encoding & recurse into the left branch (current_node is {B, C, D}: 5)
● B is in current_node's left branch, so add 0 to encoding & recurse into the left branch (current_node is B: 3)
● current_node is a leaf, so return 100 (value of encoding)
Message Encoding
● Given a sequence of symbols message and a Huffman tree ht
● Concatenate the encoding of each symbol in message from left to right
● Return the concatenation of encodings
Example
● Encode ABBA with the sample Huffman tree
● Encoding for A is 0
● Encoding for B is 100
● Encoding for B is 100
● Encoding for A is 0
● Concatenation of encodings is 01001000
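Message encoding is just symbol encoding applied to each symbol in turn. The following sketch, in Python 3 syntax, writes the sample tree as a literal in the lecture's list-and-tuple representation. The names `SAMPLE_TREE`, `symbols_of`, `encode_symbol`, and `encode_message` are assumptions for this illustration, and bits are collected in a string rather than a list for brevity.

```python
# Leaves are (symbol, freq) tuples; internal nodes are
# [left, right, symbols, freq] lists, as in the lecture's representation.
SAMPLE_TREE = [
    ('A', 8),
    [[('B', 3),
      [('C', 1), ('D', 1), ['C', 'D'], 2],
      ['B', 'C', 'D'], 5],
     [[('E', 1), ('F', 1), ['E', 'F'], 2],
      [('G', 1), ('H', 1), ['G', 'H'], 2],
      ['E', 'F', 'G', 'H'], 4],
     ['B', 'C', 'D', 'E', 'F', 'G', 'H'], 9],
    ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'], 17]

def is_leaf(node):
    return isinstance(node, tuple)

def symbols_of(node):
    return [node[0]] if is_leaf(node) else node[2]

def encode_symbol(s, ht):
    """Walk from the root to the leaf for s, collecting 0/1 as a string."""
    bits = ''
    node = ht
    while not is_leaf(node):
        if s in symbols_of(node[0]):
            bits += '0'
            node = node[0]
        elif s in symbols_of(node[1]):
            bits += '1'
            node = node[1]
        else:
            raise ValueError('symbol not in tree: %r' % (s,))
    return bits

def encode_message(message, ht):
    """Concatenate the encodings of the symbols from left to right."""
    return ''.join(encode_symbol(s, ht) for s in message)

print(encode_message('ABBA', SAMPLE_TREE))  # 01001000
print(encode_message('AADF', SAMPLE_TREE))  # 0010111101
```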
Message Decoding
1. Given a sequence of bits message and a Huffman tree ht, set current_node to the root and decoding to an empty list
2. If current_node is a leaf, add its symbol to decoding and set current_node to ht's root
3. If current_node is ht's root and message has no more bits, return decoding
4. If no more bits in message & current_node is not a leaf, signal error
5. If message's current bit is 0, set current_node to its left child, read the bit, & go to step 2
6. If message's current bit is 1, set current_node to its right child, read the bit, & go to step 2
Example
● Decode 0100 with the sample Huffman tree
● Read 0, go left to A: 8, add A to decoding, & reset current_node to the root
● Read 1, go right to {B, C, D, E, F, G, H}: 9
● Read 0, go left to {B, C, D}: 5
● Read 0, go left to B: 3
● Add B to decoding & reset current_node to the root
● No more bits & current_node is the root, so return AB
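The decoding procedure can be sketched in the same style. This is a minimal Python 3 version written under the assumption that the message is a string of '0' and '1' characters; `SAMPLE_TREE` and `decode_message` are names chosen for this illustration only.

```python
# Leaves are (symbol, freq) tuples; internal nodes are
# [left, right, symbols, freq] lists, as in the lecture's representation.
SAMPLE_TREE = [
    ('A', 8),
    [[('B', 3),
      [('C', 1), ('D', 1), ['C', 'D'], 2],
      ['B', 'C', 'D'], 5],
     [[('E', 1), ('F', 1), ['E', 'F'], 2],
      [('G', 1), ('H', 1), ['G', 'H'], 2],
      ['E', 'F', 'G', 'H'], 4],
     ['B', 'C', 'D', 'E', 'F', 'G', 'H'], 9],
    ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'], 17]

def is_leaf(node):
    return isinstance(node, tuple)

def decode_message(bits, ht):
    decoding = []
    current_node = ht
    for bit in bits:
        # 0 moves to the left child, 1 to the right child
        if bit == '0':
            current_node = current_node[0]
        elif bit == '1':
            current_node = current_node[1]
        else:
            raise ValueError('not a bit: %r' % (bit,))
        # on reaching a leaf, record its symbol and restart at the root
        if is_leaf(current_node):
            decoding.append(current_node[0])
            current_node = ht
    # running out of bits anywhere but the root means a truncated code word
    if current_node is not ht:
        raise ValueError('message ends in the middle of a code word')
    return ''.join(decoding)

print(decode_message('0100', SAMPLE_TREE))      # AB
print(decode_message('01001000', SAMPLE_TREE))  # ABBA
```

The sketch assumes the tree has at least two symbols, so the root is never itself a leaf.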
List Comprehension
List Comprehension
● List comprehension is a syntactic construct in some programming languages for building lists from list specifications
● List comprehension derives its conceptual roots from the set-former (set-builder) notation in mathematics
[Y for X in LIST]
● List comprehension is available in other programming languages such as Common Lisp, Haskell, and OCaml
Set-Former Notation Example
$\{4x \mid x \in \mathbb{N},\ x^2 < 100\}$
● $4x$ is the output function
● $x$ is the variable
● $\mathbb{N}$ is the input set
● $x^2 < 100$ is the predicate
Set-Former Notation Examples
$\{x \in \{a, b\}^* \mid |x| \le 3\}$ is the set of all strings over $\{a, b\}$ whose length is 0, 1, 2, or 3.
$\{a^n b^n \mid n \ge 1\}$ is the set of non-empty strings over $\{a, b\}$ such that a's precede b's and the number of a's is equal to the number of b's.
$\{xy \mid x \in \{a, b\},\ y \in \{aa, cc\}\}$ is the set of strings where a or b is followed by aa or cc.
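Each of these three sets has a direct list-comprehension counterpart. The sketch below is in Python 3 syntax; the variable names are made up for this example, and the second set is cut off at n = 3 because the full set is infinite.

```python
from itertools import product

alphabet = ['a', 'b']

# {x in {a, b}* | |x| <= 3}: all strings of length 0, 1, 2, or 3
short_strings = [''.join(chars)
                 for n in range(4)
                 for chars in product(alphabet, repeat=n)]

# {a^n b^n | n >= 1}, limited here to n = 1, 2, 3
anbn = ['a' * n + 'b' * n for n in range(1, 4)]

# {xy | x in {a, b}, y in {aa, cc}}
xy = [x + y for x in ['a', 'b'] for y in ['aa', 'cc']]

print(len(short_strings))  # 15
print(anbn)                # ['ab', 'aabb', 'aaabbb']
print(xy)                  # ['aaa', 'acc', 'baa', 'bcc']
```

In each case the expression before the first `for` plays the role of the output function, and the `for` clauses name the variables and their input sets.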
For-Loop Implementation
### building the list of the set-former example with for-loop
>>> rslt = []
>>> for x in xrange(201):
...     if x ** 2 < 100:
...         rslt.append(4 * x)
...
>>> rslt
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
List Comprehension Equivalent
### building the same list with list comprehension
>>> s = [ 4 * x for x in xrange(201) if x ** 2 < 100]
>>> s
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
For-Loop
### building list of squares of even numbers in [0, 10]
### with for-loop
>>> rslt = []
>>> for x in xrange(11):
...     if x % 2 == 0:
...         rslt.append(x**2)
...
>>> rslt
[0, 4, 16, 36, 64, 100]
List Comprehension Equivalent
### building the same list with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 == 0]
[0, 4, 16, 36, 64, 100]
For-Loop
## building list of squares of odd numbers in [0, 10]
>>> rslt = []
>>> for x in xrange(11):
...     if x % 2 != 0:
...         rslt.append(x**2)
...
>>> rslt
[1, 9, 25, 49, 81]
List Comprehension Equivalent
## building list of squares of odd numbers in [0, 10]
## with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 != 0]
[1, 9, 25, 49, 81]
List Comprehension with For-Loops
For-Loop
>>> rslt = []
>>> for x in xrange(6):
...     if x % 2 == 0:
...         for y in xrange(6):
...             if y % 2 != 0:
...                 rslt.append((x, y))
...
>>> rslt
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
List Comprehension Equivalent
>>> [(x, y) for x in xrange(6) if x % 2 == 0
...         for y in xrange(6) if y % 2 != 0]
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
List Comprehension with Matrices
List Comprehension with Matrices
● List comprehension can be used to scan rows and columns in matrices
>>> matrix = [
...     [10, 20, 30],
...     [40, 50, 60],
...     [70, 80, 90]
... ]
### extract all rows
>>> [r for r in matrix]
[[10, 20, 30], [40, 50, 60], [70, 80, 90]]
List Comprehension with Matrices
>>> matrix = [
...     [10, 20, 30],
...     [40, 50, 60],
...     [70, 80, 90]
... ]
### extract column 0
>>> [r[0] for r in matrix]
[10, 40, 70]
List Comprehension with Matrices
>>> matrix = [
...     [10, 20, 30],
...     [40, 50, 60],
...     [70, 80, 90]
... ]
### extract column 1
>>> [r[1] for r in matrix]
[20, 50, 80]
List Comprehension with Matrices
>>> matrix = [
...     [10, 20, 30],
...     [40, 50, 60],
...     [70, 80, 90]
... ]
### extract column 2
>>> [r[2] for r in matrix]
[30, 60, 90]
List Comprehension with Matrices
### turn matrix columns into rows
>>> rslt = []
>>> for c in xrange(len(matrix)):
...     rslt.append([matrix[r][c] for r in xrange(len(matrix))])
...
>>> rslt
[[10, 40, 70], [20, 50, 80], [30, 60, 90]]
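The column-to-row conversion above mixes a for-loop with a list comprehension. As an illustrative alternative in Python 3 syntax, the same result can be written as a single nested comprehension, or with the built-in `zip`; the names `transposed` and `zipped` are made up for this sketch.

```python
matrix = [[10, 20, 30],
          [40, 50, 60],
          [70, 80, 90]]

# outer comprehension walks the column indices,
# inner comprehension collects that column from every row
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]
print(transposed)  # [[10, 40, 70], [20, 50, 80], [30, 60, 90]]

# zip(*matrix) pairs up the i-th elements of all rows
zipped = [list(col) for col in zip(*matrix)]
print(zipped == transposed)  # True
```

Using `len(matrix[0])` for the column count also makes the comprehension work for non-square matrices.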
List Comprehension with Matrices
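● List comprehension can work with iterables (e.g., dictionaries)
### d is used instead of dict to avoid shadowing the built-in type;
### the order of items in a Python 2.X dictionary is arbitrary
>>> d = {'a' : 'A', 'bb' : 'BB', 'ccc' : 'CCC'}
>>> [(item[0], item[1], len(item[0]+item[1]))
...  for item in d.items()]
[('a', 'A', 2), ('ccc', 'CCC', 6), ('bb', 'BB', 4)]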
List Comprehension
● If the expression inside [ ] is a tuple, parentheses are a must
>>> cubes = [(x, x**3) for x in xrange(5)]
>>> cubes
[(0, 0), (1, 1), (2, 8), (3, 27), (4, 64)]
● Sequences can be unpacked in list comprehension
>>> sums = [x + y for x, y in cubes]
>>> sums
[0, 2, 10, 30, 68]
List Comprehension
● for-clauses in list comprehensions can iterate over any sequences:
>>> rslt = [c * n for c in 'math' for n in (1, 2, 3)]
>>> rslt
['m', 'mm', 'mmm', 'a', 'aa', 'aaa', 't', 'tt', 'ttt', 'h', 'hh', 'hhh']
List Comprehension & Loop Variables
● In Python 2.X, the loop variables used in list comprehensions (and in regular for-loops) remain bound after execution
>>> for i in [1, 2, 3]: print i
1
2
3
>>> i + 4
7
>>> [j for j in xrange(10) if j % 2 == 0]
[0, 2, 4, 6, 8]
>>> j * 2
18
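The session above reflects Python 2.X. In Python 3 a list comprehension runs in its own scope, so its loop variable is no longer visible afterwards, while the variable of a regular for-loop still is. The following sketch, in Python 3 syntax, checks both cases; the flag name `leaked` is made up for this example.

```python
# the variable of a regular for-loop stays bound after the loop
for i in [1, 2, 3]:
    pass
print(i + 4)  # 7

evens = [j for j in range(10) if j % 2 == 0]
print(evens)  # [0, 2, 4, 6, 8]

# probe whether the comprehension variable j survived
try:
    j
    leaked = True
except NameError:
    leaked = False

print(leaked)  # False on Python 3, True on Python 2
```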
When To Use List Comprehension
● For-loops are easier to understand and debug
● List comprehensions may be harder to understand
● List comprehensions are usually faster than equivalent for-loops in the interpreter
● List comprehensions are worth using to speed up simpler tasks
● For-loops are worth using when logic gets complex
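The speed claim can be checked on any given machine with the standard `timeit` module. The sketch below is in Python 3 syntax; the function names, input size, and repetition count are arbitrary choices for this illustration, and actual timings depend on the interpreter version and hardware.

```python
import timeit

def squares_loop(n):
    """Squares of even numbers below n, built with a for-loop."""
    rslt = []
    for x in range(n):
        if x % 2 == 0:
            rslt.append(x ** 2)
    return rslt

def squares_comp(n):
    """The same list, built with a list comprehension."""
    return [x ** 2 for x in range(n) if x % 2 == 0]

# both versions must produce the same list before timing them
assert squares_loop(11) == squares_comp(11) == [0, 4, 16, 36, 64, 100]

loop_time = timeit.timeit(lambda: squares_loop(1000), number=200)
comp_time = timeit.timeit(lambda: squares_comp(1000), number=200)

print('for-loop:      %.4f s' % loop_time)
print('comprehension: %.4f s' % comp_time)
```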
Reading & References
● www.python.org
● http://docs.python.org/library/stdtypes.html#typesseq
● doc.python.org/howto/unicode.html
● Ch 02, M. L. Hetland. Beginning Python From Novice to Professional, 2nd Ed., APRESS
● Ch 02, H. Abelson and G. Sussman. Structure and Interpretation of Computer Programs, MIT Press
● S. Roman, Coding and Information Theory, Springer-Verlag

Python lecture 06

  • 1. Python & Perl Lecture 06 Department of Computer Science Utah State University
  • 2. Outline ● ● Data Abstraction: Building Huffman Trees with Lists and Tuples List Comprehension
  • 3. Data Abstraction Building Huffman Trees with Lists and Tuples
  • 4. Background ● ● ● In information theory, coding refers to methods that represent data in terms of bit sequences (sequences of 0's and 1's) Encoding is a method of taking data structures and mapping them to bit sequences Decoding is a method of taking bit sequences and outputting the corresponding data structure
  • 5. Example: Standard ASCII & Unicode ● Standard ASCII encodes each character as a 7-bit sequence ● Using 7 bits allows us to encode 27 possible characters ● ● ● Unicode has three standards: UTF-8 (uses 8-bit sequences), UTF-16 (uses 16-bit sequences), and UTF-32 (uses 32-bit sequences) UTF stands for Unicode Transformation Format Python 2.X's Unicode support: “Python represents Unicode strings as either 16- or 32-bit integers), depending on how the Python interpreter was compiled.”
  • 6. Two Types of Codes ● ● ● There are two types of codes: fixed-length and variable-length Fixed-length (e.g., ASCII, Unicode) codes encode every character in terms of the same number of bits Variable-length codes (e.g., Morse, Huffman) encode characters in terms of variable numbers of bits: more frequent symbols are encoded with fewer bits
  • 7. Example: Fixed-Length Code ● A – 000 C – 010 E – 100 G – 110 ● B – 001 D – 011 F – 101 H – 111 ● AADF = 000000011101 ● The encoding of AADF is 12 bits
  • 8. Example: Variable-Length Code ● A–0 C – 1010 ● B – 100 ● AADF = 0010111101 ● The encoding of AADF is 10 bits D – 1011 E – 1100 F – 1101 G – 1110 H – 1111
  • 9. End of Character in Variable-Length Code ● ● ● One of the challenges in variable-length codes is knowing where one character ends and the one begins Morse uses a special character (separator code) Prefix coding is another solution: the prefix of every character is unique – no code of any character starts another character
  • 10. Huffman Code ● ● ● ● Huffman code is a variable-length code that takes advantage of relative frequencies of characters Huffman code is named after David Huffman, the researcher who discovered it Huffman code is represented as a binary tree where leaves are individual characters and their frequencies Each non-leaf node is a set of characters in all of its subnodes and the sum of their relative frequencies
  • 11. Huffman Tree Example {A, B, C, D, E, F, G, H}: 17 1 0 A: 8 {B, C, D, E, F, G, H}: 9 1 0 {E, F, G, H}: 4 {B, C, D}: 5 1 0 {C, D}: 2 B: 3 0 C: 1 1 0 1 D: 1 {G, H}: 2 {E, F}: 2 0 E: 1 1 F: 1 0 G: 1 1 H: 1
  • 12. Using Huffman Tree to Encode/Decode Characters ● The tree on the previous slide, these are the encodings:  A is encoded as 0  B is encoded as 100  C is encoded as 1010  D is encoded as 1011  E is encoded as 1100  F is encoded as 1101  G is encoded as 1110  H is encoded as 1111
  • 14. Simple Huffman Tree {A, B, D, C}: 8 {B, D, C}: 4 A: 4 {D, C}: 2 B: 2 D: 1 C: 1
  • 15. Constructing Leaves ### a leaf is a tuple whose first element is symbol ### represented as a string and whose second element is ### the symbol's frequency def make_leaf(symbol, freq): return (symbol, freq) def is_leaf(x): return isinstance(x, tuple) and len(x) == 2 and isinstance(x[0], str) and isinstance(x[1], int)
  • 16. Constructing Leaves ### return the character (symbol) of the leaf def get_leaf_symbol(leaf): return leaf[0] ### return the frequency of the leaf's character def get_leaf_freq(leaf): return leaf[1]
  • 17. Constructing Huffman Trees ### A Non-Leaf node (internal node) is represented as ### a list of four elements: ### 1. left brach ### 2. right branch ### 3. list of symbols ### 4. combined frequency of symbols [left_branch, right_branch, symbols, frequency]
  • 18. Accessing Huffman Trees def get_leaf_symbol(leaf): return leaf[0] def get_leaf_freq(leaf): return leaf[1] def get_left_branch(huff_tree): return huff_tree[0] def get_right_branch(huff_tree): return huff_tree[1]
  • 19. Accessing Huffman Trees def get_symbols(huff_tree): if is_leaf(huff_tree): return [get_leaf_symbol(huff_tree)] else: return huff_tree[2] def get_freq(huff_tree): if is_leaf(huff_tree): return get_leaf_freq(huff_tree) else: return huff_tree[3]
  • 20. Constructing Huffman Trees ### A Huffman tree is constructed from its left branch, which can ### be a huffman tree or a leaf, and its right branch, another ### huffman tree or a leaf. The new tree has the symbols of the ### left branch and the right branch and the frequency of the left ### branch and the right branch def make_huffman_tree(left_branch, right_branch): return [left_branch, right_branch, get_symbols(left_branch) + get_symbols(right_branch), get_freq(left_branch) + get_freq(right_branch)]
  • 21. MAKE_HUFFMAN_TREE Example ht01 = make_huffman_tree(make_leaf('A', 4), make_huffman_tree(make_leaf('B', 2), make_huffman_tree(make_leaf('D', 1), make_leaf('C', 1)))) {A, B, D, C}: 8 {B, D, C}: 4 A: 4 {D, C}: 2 B: 2 D: 1 C: 1
  • 22. MAKE_HUFFMAN_TREE Example Python data structure that represents the Huffman tree below: [('A', 4), [('B', 2), [('D', 1), ('C', 1), ['D', 'C'], 2], ['B', 'D', 'C'], 4], ['A', 'B', 'D', 'C'], 8] {A, B, D, C}: 8 {B, D, C}: 4 A: 4 {D, C}: 2 B: 2 D: 1 C: 1
  • 23. Customizing sort() def leaf_freq_comp(leaf1, leaf2): return cmp(get_leaf_freq(leaf1), get_leaf_freq(leaf2)) huff_leaves = [make_leaf('A', 8), make_leaf('C', 1), make_leaf('B', 3), make_leaf('D', 1), make_leaf('F', 1), make_leaf('E', 1), make_leaf('H', 1), make_leaf('G', 1)] print huff_leaves huff_leaves.sort(leaf_freq_comp) OUTPUT: [('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)] [('C', 1), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1), ('B', 3), ('A', 8)]
  • 24. Customizing sort() def leaf_symbol_comp(leaf1, leaf2): return cmp(get_leaf_symbol(leaf1), get_leaf_symbol(leaf2)) huff_leaves2 = [make_leaf('A', 8), make_leaf('C', 1), make_leaf('B', 3), make_leaf('D', 1), make_leaf('F', 1), make_leaf('E', 1), make_leaf('H', 1), make_leaf('G', 1)] print huff_leaves2 huff_leaves2.sort(leaf_symbol_comp) print huff_leaves2 OUTPUT: [('A', 8), ('C', 1), ('B', 3), ('D', 1), ('F', 1), ('E', 1), ('H', 1), ('G', 1)] [('A', 8), ('B', 3), ('C', 1), ('D', 1), ('E', 1), ('F', 1), ('G', 1), ('H', 1)]
  • 25. Encoding & Decoding Messages with Huffman Trees
  • 26. Sample Huffman Tree
    Left branches are labeled 0 and right branches are labeled 1.
    {A, B, C, D, E, F, G, H}: 17
        0: A: 8
        1: {B, C, D, E, F, G, H}: 9
            0: {B, C, D}: 5
                0: B: 3
                1: {C, D}: 2
                    0: C: 1
                    1: D: 1
            1: {E, F, G, H}: 4
                0: {E, F}: 2
                    0: E: 1
                    1: F: 1
                1: {G, H}: 2
                    0: G: 1
                    1: H: 1
  • 27. Symbol Encoding
    1. Given a symbol s and a Huffman tree ht, set current_node to the root node and encoding to an empty list (you can also check if s is in the root node's symbol list and, if not, signal error)
    2. If current_node is a leaf, return encoding
    3. Check if s is in current_node's left branch or right branch
    4. If in the left, add 0 to encoding, set current_node to the root of the left branch, and go to step 2
    5. If in the right, add 1 to encoding, set current_node to the root of the right branch, and go to step 2
    6. If in neither branch, signal error
  • 28. Example
    ● Encode B with the sample Huffman tree
    ● Set current_node to the root node
    ● B is in current_node's right branch, so add 1 to encoding and recurse into the right branch (current_node is set to the root of the right branch, {B, C, D, E, F, G, H}: 9)
    ● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is {B, C, D}: 5)
    ● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is B: 3)
    ● current_node is a leaf, so return 100 (value of encoding)
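The symbol-encoding steps on slide 27 can be sketched roughly as follows in Python 3. The helper bodies (make_leaf, is_leaf, get_symbols, get_freq) are assumptions based on the tuple and list representation used earlier in the lecture, and encode_symbol is a hypothetical name chosen for illustration.

```python
def make_leaf(symbol, freq):
    return (symbol, freq)            # assumption: leaf = (symbol, frequency)

def is_leaf(node):
    return isinstance(node, tuple)   # hypothetical helper

def get_symbols(node):
    return [node[0]] if is_leaf(node) else node[2]

def get_freq(node):
    return node[1] if is_leaf(node) else node[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

# The sample tree from slide 26 (left branch = 0, right branch = 1).
T, L = make_huffman_tree, make_leaf
sample_tree = T(L('A', 8),
                T(T(L('B', 3), T(L('C', 1), L('D', 1))),
                  T(T(L('E', 1), L('F', 1)), T(L('G', 1), L('H', 1)))))

def encode_symbol(s, ht):
    """Return the list of bits for symbol s by walking down from the root."""
    if s not in get_symbols(ht):
        raise ValueError("symbol %r is not in the tree" % (s,))
    encoding = []
    current_node = ht
    while not is_leaf(current_node):
        left, right = current_node[0], current_node[1]
        if s in get_symbols(left):
            encoding.append(0)
            current_node = left
        elif s in get_symbols(right):
            encoding.append(1)
            current_node = right
        else:
            raise ValueError("symbol %r is in neither branch" % (s,))
    return encoding

print(encode_symbol('B', sample_tree))
```

The walk visits one node per bit, so the cost of encoding a symbol is proportional to the depth of its leaf plus the membership tests on the symbol lists.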
  • 29. Message Encoding
    ● Given a sequence of symbols message and a Huffman tree ht
    ● Concatenate the encoding of each symbol in message from left to right
    ● Return the concatenation of encodings
  • 30. Example
    ● Encode ABBA with the sample Huffman tree
    ● Encoding for A is 0
    ● Encoding for B is 100
    ● Encoding for B is 100
    ● Encoding for A is 0
    ● Concatenation of encodings is 01001000
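Message encoding is just concatenation of per-symbol encodings, which the short Python 3 sketch below illustrates. To stay self-contained it makes a simplifying assumption: a lookup table (CODE_TABLE, a hypothetical name) stands in for the tree walk and holds the variable-length code that the sample Huffman tree produces. The function name encode_message is also illustrative.

```python
# Assumption: this table replaces the tree walk; it lists the codes
# that the sample Huffman tree assigns to each symbol.
CODE_TABLE = {
    'A': '0',    'B': '100',
    'C': '1010', 'D': '1011',
    'E': '1100', 'F': '1101',
    'G': '1110', 'H': '1111',
}

def encode_message(message, code_table):
    """Concatenate the encoding of each symbol from left to right."""
    bits = []
    for symbol in message:
        if symbol not in code_table:
            raise ValueError("no code for symbol %r" % (symbol,))
        bits.append(code_table[symbol])
    return ''.join(bits)

print(encode_message('ABBA', CODE_TABLE))
print(encode_message('AADF', CODE_TABLE))
```

No separator is needed between code words because no code in the table is a prefix of another.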
  • 31. Message Decoding
    1. Given a sequence of bits message and a Huffman tree ht, set current_node to the root and decoding to an empty list
    2. If current_node is a leaf, add its symbol to decoding and set current_node to ht's root
    3. If current_node is ht's root and message has no more bits, return decoding
    4. If there are no more bits in message and current_node is not a leaf, signal error
    5. If message's current bit is 0, set current_node to its left child, read the bit, and go to step 2
    6. If message's current bit is 1, set current_node to its right child, read the bit, and go to step 2
  • 32. Example
    ● Decode 0100 with the sample Huffman tree
    ● Read 0, go left to A: 8, add A to decoding, and reset current_node to the root
    ● Read 1, go right to {B, C, D, E, F, G, H}: 9
    ● Read 0, go left to {B, C, D}: 5
    ● Read 0, go left to B: 3
    ● Add B to decoding and reset current_node to the root
    ● No more bits and current_node is the root, so return AB
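The decoding steps on slide 31 can be sketched in Python 3 along the following lines. The helper bodies are assumptions based on the tuple and list representation used earlier in the lecture, decode_message is a hypothetical name, and the sketch assumes the root of the tree is an internal node rather than a single leaf.

```python
def make_leaf(symbol, freq):
    return (symbol, freq)            # assumption: leaf = (symbol, frequency)

def is_leaf(node):
    return isinstance(node, tuple)   # hypothetical helper

def get_symbols(node):
    return [node[0]] if is_leaf(node) else node[2]

def get_freq(node):
    return node[1] if is_leaf(node) else node[3]

def make_huffman_tree(left, right):
    return [left, right,
            get_symbols(left) + get_symbols(right),
            get_freq(left) + get_freq(right)]

# The sample tree from slide 26 (left branch = 0, right branch = 1).
T, L = make_huffman_tree, make_leaf
sample_tree = T(L('A', 8),
                T(T(L('B', 3), T(L('C', 1), L('D', 1))),
                  T(T(L('E', 1), L('F', 1)), T(L('G', 1), L('H', 1)))))

def decode_message(bits, ht):
    """Follow each bit down the tree; emit a symbol whenever a leaf is reached."""
    decoding = []
    current_node = ht
    for bit in bits:
        if bit == 0:
            current_node = current_node[0]   # left child
        elif bit == 1:
            current_node = current_node[1]   # right child
        else:
            raise ValueError("not a bit: %r" % (bit,))
        if is_leaf(current_node):
            decoding.append(current_node[0])
            current_node = ht                # restart at the root
    if current_node is not ht:
        # Bits ran out in the middle of a code word.
        raise ValueError("message ends inside a code word")
    return decoding

print(decode_message([0, 1, 0, 0], sample_tree))
```

Resetting to the root after every leaf is what removes the need for a separator between code words.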
  • 34. List Comprehension
    ● List comprehension is a syntactic construct in some programming languages for building lists from list specifications
    ● List comprehension derives its conceptual roots from the set-former (set-builder) notation in mathematics
    ● General form in Python: [Y for X in LIST]
    ● List comprehension is available in other programming languages such as Common Lisp, Haskell, and OCaml
  • 35. Set-Former Notation Example
    $\{\, 4 \cdot x \mid x \in \mathbb{N},\ x^2 < 100 \,\}$
    ● $4 \cdot x$ is the output function
    ● $x$ is the variable
    ● $\mathbb{N}$ is the input set
    ● $x^2 < 100$ is the predicate
  • 36. Set-Former Notation Examples
    ● $\{\, x \in \{a, b\}^* \mid |x| \le 3 \,\}$ is the set of all strings over $\{a, b\}$ whose length is 0, 1, 2, or 3.
    ● $\{\, a^n b^n \mid n \ge 1 \,\}$ is the set of non-empty strings over $\{a, b\}$ such that the a's precede the b's and the number of a's is equal to the number of b's.
    ● $\{\, xy \mid x \in \{a, b\},\ y \in \{aa, cc\} \,\}$ is the set of strings where a or b is followed by aa or cc.
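The three set-former examples on slide 36 translate almost directly into Python list comprehensions. The Python 3 sketch below is one possible rendering. The variable names are illustrative, and because the second set is infinite, only its members for n = 1, 2, 3 are built (that cutoff is an assumption made for the example).

```python
from itertools import product

alphabet = 'ab'

# All strings over {a, b} of length 0, 1, 2, or 3.
short_strings = [''.join(chars)
                 for n in range(4)
                 for chars in product(alphabet, repeat=n)]

# a^n b^n for n = 1..3 (a finite slice of an infinite set).
a_n_b_n = ['a' * n + 'b' * n for n in range(1, 4)]

# a or b followed by aa or cc.
xy = [x + y for x in ('a', 'b') for y in ('aa', 'cc')]

print(len(short_strings))
print(a_n_b_n)
print(xy)
```

In each comprehension the expression before the first for plays the role of the output function, the for-clauses name the variables and input sets, and an optional if-clause would carry the predicate.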
  • 37. For-Loop Implementation
    ### building the list of the set-former example with for-loop
    >>> rslt = []
    >>> for x in xrange(201):
    ...     if x ** 2 < 100:
    ...         rslt.append(4 * x)
    ...
    >>> rslt
    [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
  • 38. List Comprehension Equivalent
    ### building the same list with list comprehension
    >>> s = [4 * x for x in xrange(201) if x ** 2 < 100]
    >>> s
    [0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
  • 39. For-Loop
    ### building list of squares of even numbers in [0, 10]
    ### with for-loop
    >>> rslt = []
    >>> for x in xrange(11):
    ...     if x % 2 == 0:
    ...         rslt.append(x ** 2)
    ...
    >>> rslt
    [0, 4, 16, 36, 64, 100]
  • 40. List Comprehension Equivalent
    ### building the same list with list comprehension
    >>> [x ** 2 for x in xrange(11) if x % 2 == 0]
    [0, 4, 16, 36, 64, 100]
  • 41. For-Loop
    ### building list of squares of odd numbers in [0, 10]
    >>> rslt = []
    >>> for x in xrange(11):
    ...     if x % 2 != 0:
    ...         rslt.append(x ** 2)
    ...
    >>> rslt
    [1, 9, 25, 49, 81]
  • 42. List Comprehension Equivalent
    ### building list of squares of odd numbers in [0, 10]
    ### with list comprehension
    >>> [x ** 2 for x in xrange(11) if x % 2 != 0]
    [1, 9, 25, 49, 81]
  • 44. For-Loop
    ### building pairs (x, y) with x even and y odd, both in [0, 5]
    >>> rslt = []
    >>> for x in xrange(6):
    ...     if x % 2 == 0:
    ...         for y in xrange(6):
    ...             if y % 2 != 0:
    ...                 rslt.append((x, y))
    ...
    >>> rslt
    [(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
  • 45. List Comprehension Equivalent
    ### clauses are read left to right, in the same order as the nested for-loop
    >>> [(x, y) for x in xrange(6) if x % 2 == 0
    ...         for y in xrange(6) if y % 2 != 0]
    [(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
  • 47. List Comprehension with Matrices
    ● List comprehension can be used to scan rows and columns in matrices
    >>> matrix = [[10, 20, 30],
    ...           [40, 50, 60],
    ...           [70, 80, 90]]
    ### extract all rows
    >>> [r for r in matrix]
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
  • 48. List Comprehension with Matrices
    >>> matrix = [[10, 20, 30],
    ...           [40, 50, 60],
    ...           [70, 80, 90]]
    ### extract column 0
    >>> [r[0] for r in matrix]
    [10, 40, 70]
  • 49. List Comprehension with Matrices
    >>> matrix = [[10, 20, 30],
    ...           [40, 50, 60],
    ...           [70, 80, 90]]
    ### extract column 1
    >>> [r[1] for r in matrix]
    [20, 50, 80]
  • 50. List Comprehension with Matrices
    >>> matrix = [[10, 20, 30],
    ...           [40, 50, 60],
    ...           [70, 80, 90]]
    ### extract column 2
    >>> [r[2] for r in matrix]
    [30, 60, 90]
  • 51. List Comprehension with Matrices
    ### turn matrix columns into rows
    >>> rslt = []
    >>> for c in xrange(len(matrix)):
    ...     rslt.append([matrix[r][c] for r in xrange(len(matrix))])
    ...
    >>> rslt
    [[10, 40, 70], [20, 50, 80], [30, 60, 90]]
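The column-to-row conversion on slide 51 mixes a for-loop with a comprehension. As a rough sketch in Python 3, the same result can be obtained with a fully nested comprehension or with zip; the names transposed and zipped are illustrative only.

```python
matrix = [[10, 20, 30],
          [40, 50, 60],
          [70, 80, 90]]

# Outer comprehension walks the column indices, inner one walks the rows.
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]

# zip(*matrix) pairs up the i-th elements of every row, yielding columns as tuples.
zipped = [list(column) for column in zip(*matrix)]

print(transposed)
print(zipped)
```

Using len(matrix[0]) for the column count keeps the nested version correct for non-square matrices, whereas indexing both dimensions with len(matrix) only works when the matrix is square.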
  • 52. List Comprehension with Matrices
    ● List comprehension can work with iterables (e.g., dictionaries)
    ### the order of items in a Python 2 dictionary is arbitrary
    >>> dict = {'a' : 'A', 'bb' : 'BB', 'ccc' : 'CCC'}
    >>> [(item[0], item[1], len(item[0] + item[1])) for item in dict.items()]
    [('a', 'A', 2), ('ccc', 'CCC', 6), ('bb', 'BB', 4)]
  • 53. List Comprehension
    ● If the expression inside [ ] is a tuple, parentheses are a must
    >>> cubes = [(x, x**3) for x in xrange(5)]
    >>> cubes
    [(0, 0), (1, 1), (2, 8), (3, 27), (4, 64)]
    ● Sequences can be unpacked in list comprehension
    >>> sums = [x + y for x, y in cubes]
    >>> sums
    [0, 2, 10, 30, 68]
  • 54. List Comprehension
    ● for-clauses in list comprehensions can iterate over any sequences:
    >>> rslt = [c * n for c in 'math' for n in (1, 2, 3)]
    >>> rslt
    ['m', 'mm', 'mmm', 'a', 'aa', 'aaa', 't', 'tt', 'ttt', 'h', 'hh', 'hhh']
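  • 55. List Comprehension & Loop Variables
    ● In Python 2, the loop variables used in list comprehensions (and in regular for-loops) remain bound after execution. In Python 3, list-comprehension variables are local to the comprehension, while regular for-loop variables still remain bound.
    >>> for i in [1, 2, 3]:
    ...     print i
    ...
    1
    2
    3
    >>> i + 4
    7
    >>> [j for j in xrange(10) if j % 2 == 0]
    [0, 2, 4, 6, 8]
    >>> j * 2
    18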
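The session on slide 55 reflects Python 2. As a hedged illustration of how the same experiment behaves under Python 3, the sketch below checks whether the comprehension variable is still visible afterwards; the names after_loop, evens, and leaked are hypothetical.

```python
# A regular for-loop variable stays bound after the loop in Python 2 and 3.
for i in [1, 2, 3]:
    pass
after_loop = i + 4

# In Python 3 the comprehension variable j lives in its own scope.
evens = [j for j in range(10) if j % 2 == 0]

try:
    j  # would succeed in Python 2, where j leaks out with the value 9
    leaked = True
except NameError:
    leaked = False

print(after_loop, evens, leaked)
```

Code that relies on the leaked value, such as j * 2 on the slide, therefore needs to compute it explicitly (for example, evens[-1] or an ordinary loop) when ported to Python 3.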
  • 56. When To Use List Comprehension
    ● For-loops are easier to understand and debug
    ● List comprehensions may be harder to understand
    ● List comprehensions are usually faster in the interpreter than equivalent for-loops that build a list with append()
    ● List comprehensions are worth using to speed up simpler tasks
    ● For-loops are worth using when logic gets complex
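The speed claim on slide 56 is easy to check on a given machine rather than take on faith. The Python 3 sketch below times both styles with the standard timeit module. The function names, the input size N, and the repetition count are arbitrary choices made for illustration, and the measured times will vary between machines and interpreter versions.

```python
import timeit

def squares_loop(n):
    # Squares of even numbers, built with an explicit loop and append().
    rslt = []
    for x in range(n):
        if x % 2 == 0:
            rslt.append(x ** 2)
    return rslt

def squares_comp(n):
    # The same list, built with a list comprehension.
    return [x ** 2 for x in range(n) if x % 2 == 0]

N = 10000
loop_time = timeit.timeit(lambda: squares_loop(N), number=100)
comp_time = timeit.timeit(lambda: squares_comp(N), number=100)

print("for-loop:      %.4f s" % loop_time)
print("comprehension: %.4f s" % comp_time)
```

Both functions return the same list, so any difference in the printed times reflects only how the list is built.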
  • 57. Reading & References
    ● www.python.org
    ● http://docs.python.org/library/stdtypes.html#typesseq
    ● doc.python.org/howto/unicode.html
    ● M. L. Hetland, Beginning Python: From Novice to Professional, 2nd Ed., Apress, Ch. 02
    ● H. Abelson and G. Sussman, Structure and Interpretation of Computer Programs, MIT Press, Ch. 02
    ● S. Roman, Coding and Information Theory, Springer-Verlag