5. Symbol Encoding
1. Given a symbol s and a Huffman tree ht, set current_node to the root
node and encoding to an empty list (you can also check whether s is in the root
node's symbol list and, if not, signal an error)
2. If current_node is a leaf, return encoding
3. Check if s is in current_node's left branch or right branch
4. If in the left, add 0 to encoding, set current_node to the root of the left
branch, and go to step 2
5. If in the right, add 1 to encoding, set current_node to the root of the
right branch, and go to step 2
6. If in neither branch, signal error
6. Example
● Encode B with the sample Huffman tree
● Set current_node to the root node
● B is in current_node's right branch, so add 1 to encoding and recurse into the right branch (current_node is set to the root of the right branch, {B, C, D, E, F, G, H}: 9)
● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is {B, C, D}: 5)
● B is in current_node's left branch, so add 0 to encoding and recurse into the left branch (current_node is B: 3)
● current_node is a leaf, so return 100 (value of encoding)
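The symbol-encoding steps can be sketched in Python roughly as follows. The node layout (a tuple of symbol list, weight, left child, right child) and the helper names leaf, join, and sample_tree are assumptions made for this sketch and do not come from the slides. The slides show only part of the sample tree, so the shape below {B, C, D}: 5 and the weight 1 for C through H are assumed as well. The sketch is written for Python 3.

```python
# Assumed node layout: (symbols, weight, left, right); leaves have no children.
def leaf(symbol, weight):
    return ([symbol], weight, None, None)

def join(left, right):
    return (left[0] + right[0], left[1] + right[1], left, right)

def sample_tree():
    # A:8 and B:3 come from the slides; the remaining leaf weights are assumed.
    bcd = join(leaf('B', 3), join(leaf('C', 1), leaf('D', 1)))
    efgh = join(join(leaf('E', 1), leaf('F', 1)),
                join(leaf('G', 1), leaf('H', 1)))
    return join(leaf('A', 8), join(bcd, efgh))

def encode_symbol(s, ht):
    """Return the list of bits that encodes symbol s in the Huffman tree ht."""
    current_node, encoding = ht, []
    if s not in current_node[0]:
        raise ValueError('symbol %r is not in the tree' % (s,))
    while current_node[2] is not None:           # stop when a leaf is reached
        left, right = current_node[2], current_node[3]
        if s in left[0]:
            encoding.append(0)
            current_node = left
        elif s in right[0]:
            encoding.append(1)
            current_node = right
        else:
            raise ValueError('symbol %r is in neither branch' % (s,))
    return encoding

print(encode_symbol('B', sample_tree()))   # [1, 0, 0]
```

A loop replaces the "go to step 2" jumps; the bits appended are the same.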
7. Message Encoding
● Given a sequence of symbols message and a Huffman tree ht
● Concatenate the encoding of each symbol in message from left to right
● Return the concatenation of encodings
8. Example
● Encode ABBA with the sample Huffman tree
● Encoding for A is 0
● Encoding for B is 100
● Encoding for B is 100
● Encoding for A is 0
● Concatenation of encodings is 01001000
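Message encoding can be sketched in Python as below. As before, the tuple node layout, the helper names, and the parts of the sample tree that the slides do not show (the weights of C through H and the shape of their subtrees) are assumptions of this sketch; it targets Python 3.

```python
# Assumed node layout: (symbols, weight, left, right); leaves have no children.
def leaf(symbol, weight):
    return ([symbol], weight, None, None)

def join(left, right):
    return (left[0] + right[0], left[1] + right[1], left, right)

def sample_tree():
    # Only A:8, B:3 and the two top-level groupings are given in the slides.
    bcd = join(leaf('B', 3), join(leaf('C', 1), leaf('D', 1)))
    efgh = join(join(leaf('E', 1), leaf('F', 1)),
                join(leaf('G', 1), leaf('H', 1)))
    return join(leaf('A', 8), join(bcd, efgh))

def encode_symbol(s, ht):
    """Walk from the root to the leaf for s, collecting 0 (left) / 1 (right)."""
    if s not in ht[0]:
        raise ValueError('symbol %r is not in the tree' % (s,))
    current_node, encoding = ht, []
    while current_node[2] is not None:
        go_left = s in current_node[2][0]
        encoding.append(0 if go_left else 1)
        current_node = current_node[2] if go_left else current_node[3]
    return encoding

def encode_message(message, ht):
    """Concatenate the encodings of the symbols in message, left to right."""
    encoding = []
    for s in message:
        encoding.extend(encode_symbol(s, ht))
    return encoding

bits = encode_message('ABBA', sample_tree())
print(''.join(str(bit) for bit in bits))   # 01001000
```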
9. Message Decoding
1. Given a sequence of bits message and a Huffman tree ht, set current_node to
the root and decoding to an empty list
2. If current_node is a leaf, add its symbol to decoding and set current_node to
ht's root
3. If current_node is ht's root and message has no more bits, return decoding
4. If no more bits in message & current_node is not a leaf, signal error
5. If message's current bit is 0, set current_node to its left child, read the bit, & go
to step 2
6. If message's current bit is 1, set current_node to its right child, read the bit, &
go to step 2
10. Example
● Decode 0100 with the sample Huffman tree
● Read 0, go left to A:8, add A to decoding, and reset current_node to the root
● Read 1, go right to {B, C, D, E, F, G, H}: 9
● Read 0, go left to {B, C, D}:5
● Read 0, go left to B:3
● Add B to decoding & reset current_node to the root
● No more bits & current_node is the root, so return AB
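The decoding steps can be sketched in Python as follows. The node layout, helper names, and the unshown parts of the sample tree are assumptions of this sketch (the slides give only A:8, B:3, and the groupings on the path to B). The loop below is a restructuring of the numbered steps, not a literal transcription; it targets Python 3.

```python
# Assumed node layout: (symbols, weight, left, right); leaves have no children.
def leaf(symbol, weight):
    return ([symbol], weight, None, None)

def join(left, right):
    return (left[0] + right[0], left[1] + right[1], left, right)

def sample_tree():
    bcd = join(leaf('B', 3), join(leaf('C', 1), leaf('D', 1)))
    efgh = join(join(leaf('E', 1), leaf('F', 1)),
                join(leaf('G', 1), leaf('H', 1)))
    return join(leaf('A', 8), join(bcd, efgh))

def decode(bits, ht):
    """Return the list of symbols encoded by the sequence of bits."""
    current_node, decoding = ht, []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError('not a bit: %r' % (bit,))
        current_node = current_node[2] if bit == 0 else current_node[3]
        if current_node[2] is None:              # reached a leaf
            decoding.append(current_node[0][0])
            current_node = ht                    # restart at the root
    if current_node is not ht:
        raise ValueError('message ends in the middle of a code')
    return decoding

print(''.join(decode([0, 1, 0, 0], sample_tree())))   # AB
```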
12. Algorithm
● Basic idea: Build the tree bottom up so that symbols with the smallest frequencies are farthest from the root
● Given a sequence of nodes (initially single symbols and their frequencies), find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● Remove the two combined nodes from the sequence and add the newly constructed node back to the sequence (note that the length of the sequence is now reduced by 1)
● Keep combining pairs of nodes in the above fashion until there is only one node left in the sequence: this is the root of the Huffman tree
13. Example
● Initial sequence: [A:4, B:2, C:1, D:1]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are C:1 and D:1
● The new node is {C, D}:2
● After removing C:1 and D:1 and adding {C, D}:2, the sequence becomes [A:4, B:2, {C, D}:2]
15. Example
● Current sequence: [A:4, B:2, {C, D}:2]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are B:2 and {C, D}:2
● The new node is {B, C, D}:4
● After removing B:2 and {C, D}:2 and adding {B, C, D}:4, the sequence becomes [A:4, {B, C, D}:4]
17. Example
● Current sequence: [A:4, {B, C, D}:4]
● Find two nodes with the smallest frequencies and combine them into a new node whose symbol list contains the symbols of the two nodes and whose frequency is the sum of the frequencies of the two nodes
● The nodes are A:4 and {B, C, D}:4
● The new node is {A, B, C, D}:8
● After removing A:4 and {B, C, D}:4 and adding {A, B, C, D}:8, the sequence becomes [{A, B, C, D}:8]
● We are done, because the sequence has only one node
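The construction walked through in these examples can be sketched in Python as below. The tuple node layout and the names leaf, join, and build_huffman_tree are assumptions of this sketch. Ties are broken by taking the earliest nodes in the sequence, which is one arbitrary choice among several; the sketch targets Python 3.

```python
# Assumed node layout: (symbols, weight, left, right); leaves have no children.
def leaf(symbol, weight):
    return ([symbol], weight, None, None)

def join(left, right):
    return (left[0] + right[0], left[1] + right[1], left, right)

def build_huffman_tree(nodes):
    """Combine the two lightest nodes until a single root remains."""
    nodes = list(nodes)                     # do not modify the caller's list
    while len(nodes) > 1:
        # sorted() is stable, so ties keep their order in the sequence
        first, second = sorted(nodes, key=lambda n: n[1])[:2]
        nodes.remove(first)
        nodes.remove(second)
        nodes.append(join(first, second))   # the sequence shrinks by one
    return nodes[0]

root = build_huffman_tree([leaf('A', 4), leaf('B', 2),
                           leaf('C', 1), leaf('D', 1)])
print(root[0], root[1])   # ['A', 'B', 'C', 'D'] 8
```

With this tie-breaking rule the intermediate sequences match the three example slides: [A:4, B:2, {C, D}:2], then [A:4, {B, C, D}:4], then the single root.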
20. Remarks on the Algorithm
● The algorithm does not specify a unique Huffman tree, because there may be more than two nodes in the sequence with the same smallest frequencies
● How these nodes are combined at each step (e.g., two rightmost nodes, two leftmost nodes, two middle nodes) is arbitrary, and is left for the programmer to decide
● The algorithm does guarantee the same total encoded length (the frequency-weighted sum of the code lengths) regardless of which combination method is used, although the code lengths of individual symbols may differ
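The effect of tie-breaking can be checked with a small sketch. The frequencies below are hypothetical (they are not from the slides), as are the function name code_lengths and the two tie-breaking rules. For these particular frequencies the two rules give different code lengths for individual symbols, while the frequency-weighted total is the same. The sketch targets Python 3.

```python
def code_lengths(freqs, prefer_latest):
    """Return {symbol: code length} for a list of (symbol, frequency) pairs.

    Each working node is (weight, creation order, {symbol: depth}).
    prefer_latest chooses which nodes win when weights are tied.
    """
    nodes = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs)]
    counter = len(nodes)
    while len(nodes) > 1:
        nodes.sort(key=lambda n: (n[0], -n[1] if prefer_latest else n[1]))
        (w1, _, d1), (w2, _, d2) = nodes[0], nodes[1]
        # every symbol below the new node moves one level further from the root
        merged = {s: depth + 1 for d in (d1, d2) for s, depth in d.items()}
        nodes = nodes[2:] + [(w1 + w2, counter, merged)]
        counter += 1
    return nodes[0][2]

freqs = [('a', 1), ('b', 1), ('c', 2), ('d', 2)]   # hypothetical frequencies
for rule in (False, True):
    lengths = code_lengths(freqs, rule)
    total = sum(w * lengths[s] for s, w in freqs)
    print(sorted(lengths.items()), total)
# [('a', 2), ('b', 2), ('c', 2), ('d', 2)] 12
# [('a', 3), ('b', 3), ('c', 1), ('d', 2)] 12
```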
22. List Comprehension
● List comprehension is a syntactic construct in some programming languages for building lists from list specifications
● List comprehension derives its conceptual roots from the set-former (set-builder) notation in mathematics
[Y for X in LIST]
● List comprehension is available in other programming languages such as Common Lisp, Haskell, and OCaml
23. Set-Former Notation Example
$\{\, 4x \mid x \in \mathbb{N},\ x^2 < 100 \,\}$
$4x$ is the output function
$x$ is the variable
$\mathbb{N}$ is the input set
$x^2 < 100$ is the predicate
24. For-Loop Implementation
### building the list of the set-former example with a for-loop
>>> rslt = []
>>> for x in xrange(201):
        if x ** 2 < 100:
            rslt.append(4 * x)
>>> rslt
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
25. List Comprehension Equivalent
### building the same list with list comprehension
>>> s = [ 4 * x for x in xrange(201) if x ** 2 < 100]
>>> s
[0, 4, 8, 12, 16, 20, 24, 28, 32, 36]
26. For-Loop
### building list of squares of even numbers in [0, 10]
### with for-loop
>>> rslt = []
>>> for x in xrange(11):
        if x % 2 == 0:
            rslt.append(x**2)
>>> rslt
[0, 4, 16, 36, 64, 100]
27. List Comprehension Equivalent
### building the same list with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 == 0]
[0, 4, 16, 36, 64, 100]
28. For-Loop
## building list of squares of odd numbers in [0, 10]
## with for-loop
>>> rslt = []
>>> for x in xrange(11):
        if x % 2 != 0:
            rslt.append(x**2)
>>> rslt
[1, 9, 25, 49, 81]
29. List Comprehension Equivalent
## building list of squares of odd numbers in [0, 10]
## with list comprehension
>>> [x ** 2 for x in xrange(11) if x % 2 != 0]
[1, 9, 25, 49, 81]
31. For-Loop
>>> rslt = []
>>> for x in xrange(6):
        if x % 2 == 0:
            for y in xrange(6):
                if y % 2 != 0:
                    rslt.append((x, y))
>>> rslt
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
32. List Comprehension Equivalent
>>> [(x, y) for x in xrange(6) if x % 2 == 0
            for y in xrange(6) if y % 2 != 0]
[(0, 1), (0, 3), (0, 5), (2, 1), (2, 3), (2, 5), (4, 1), (4, 3), (4, 5)]
34. List Comprehension with Matrices
● List comprehension can be used to scan rows and columns in matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract all rows
>>> [r for r in matrix]
[[10, 20, 30], [40, 50, 60], [70, 80, 90]]
35. List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 0
>>> [r[0] for r in matrix]
[10, 40, 70]
36. List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 1
>>> [r[1] for r in matrix]
[20, 50, 80]
37. List Comprehension with Matrices
>>> matrix = [
[10, 20, 30],
[40, 50, 60],
[70, 80, 90]
]
### extract column 2
>>> [r[2] for r in matrix]
[30, 60, 90]
38. List Comprehension with Matrices
### turn matrix columns into rows
>>> rslt = []
>>> for c in xrange(len(matrix)):
        rslt.append([matrix[r][c] for r in xrange(len(matrix))])
>>> rslt
[[10, 40, 70], [20, 50, 80], [30, 60, 90]]
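The column-to-row conversion can also be written as a single nested list comprehension. This is a minimal sketch using the same matrix; the name transposed is chosen for illustration, and range is used so that the code runs under Python 3.

```python
matrix = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# outer comprehension walks the column indices, inner one collects that column
transposed = [[row[c] for row in matrix] for c in range(len(matrix[0]))]
print(transposed)   # [[10, 40, 70], [20, 50, 80], [30, 60, 90]]
```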
39. List Comprehension with Matrices
● List comprehension can work with iterables (e.g., dictionaries)
>>> dict = {'a' : 'A', 'bb' : 'BB', 'ccc' : 'CCC'}
>>> [(item[0], item[1], len(item[0]+item[1]))
for item in dict.items()]
[('a', 'A', 2), ('ccc', 'CCC', 6), ('bb', 'BB', 4)]
40. List Comprehension
● If the expression inside [ ] is a tuple, parentheses are a must
>>> cubes = [(x, x**3) for x in xrange(5)]
>>> cubes
[(0, 0), (1, 1), (2, 8), (3, 27), (4, 64)]
● Sequences can be unpacked in list comprehension
>>> sums = [x + y for x, y in cubes]
>>> sums
[0, 2, 10, 30, 68]
41. List Comprehension
● for-clauses in list comprehensions can iterate over any sequence:
>>> rslt = [c * n for c in 'math' for n in (1, 2, 3)]
>>> rslt
['m', 'mm', 'mmm', 'a', 'aa', 'aaa', 't', 'tt', 'ttt', 'h', 'hh', 'hhh']
42. List Comprehension & Loop Variables
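● The loop variables of regular for-loops stay bound after the loop finishes. In Python 2, which these examples use, the loop variables of list comprehensions stay bound as well (in Python 3 they are local to the comprehension).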
>>> for i in [1, 2, 3]: print i
1
2
3
>>> i + 4
7
>>> [j for j in xrange(10) if j % 2 == 0]
[0, 2, 4, 6, 8]
>>> j * 2
18
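The session above shows Python 2 behavior. Under Python 3 the comprehension variable is no longer visible afterwards, while the variable of a regular for-loop still is. A minimal sketch, assuming a Python 3 interpreter; the names k_demo, n_demo, and leaked are made up for this illustration.

```python
evens = [k_demo for k_demo in range(10) if k_demo % 2 == 0]

try:
    k_demo                  # Python 3: the comprehension variable is gone
    leaked = True
except NameError:
    leaked = False

for n_demo in [1, 2, 3]:
    pass                    # a regular loop variable stays bound

print(evens, leaked, n_demo + 4)   # [0, 2, 4, 6, 8] False 7
```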
43. When To Use List Comprehension
● For-loops are easier to understand and debug
● List comprehensions may be harder to understand
● List comprehensions are typically faster than equivalent for-loops in the interpreter
● List comprehensions are worth using to speed up simpler tasks
● For-loops are worth using when logic gets complex
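One way to check the speed claim on a given machine is to time both forms with the standard timeit module. This is a rough sketch; the function names and the problem size are arbitrary choices, and the measured times vary between machines and interpreter versions, so no particular numbers are implied.

```python
import timeit

def with_loop(n):
    rslt = []
    for x in range(n):
        if x % 2 == 0:
            rslt.append(x ** 2)
    return rslt

def with_comprehension(n):
    return [x ** 2 for x in range(n) if x % 2 == 0]

# both forms build the same list, so only the running time differs
loop_time = timeit.timeit(lambda: with_loop(1000), number=200)
comp_time = timeit.timeit(lambda: with_comprehension(1000), number=200)
print('for-loop: %.4f s, comprehension: %.4f s' % (loop_time, comp_time))
```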
45. Classes vs. Objects
● A class is a definition (blueprint, description) of the states and behaviors of the objects that belong to it
● An object is a member of its class that behaves according to its class blueprint
● Objects of a class are also called instances of that class
46. Older Python: Classes vs. Types
● In older versions of Python, there was a difference between classes and types
● The programmer could create classes but not types
● In newer versions of Python, the distinction between types and classes is disappearing
● The programmer can now make subclasses of built-in types and the types are behaving like classes
47. Older Python: Classes vs. Types
● In Python versions prior to Python 3.0, old-style classes are the default
● To get new-style classes, place __metaclass__ = type at the beginning of a script or a module
● There is no reason to use old-style classes any more (unless there is a serious backward compatibility issue)
● Python 3.0 and higher do not support old-style classes
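Subclassing a built-in type can be sketched as follows. The class name Stack and its methods push and peek are made up for this illustration. Under Python 3 every class is new-style, so no __metaclass__ line is needed.

```python
class Stack(list):
    """A list that also offers push and peek (names chosen for this sketch)."""

    def push(self, item):
        self.append(item)       # reuse the behavior inherited from list

    def peek(self):
        return self[-1]

s = Stack([1, 2])
s.push(3)
print(s.peek(), len(s), isinstance(s, list))   # 3 3 True
```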
50. Class Definition Evaluation
● When a class definition is evaluated, a new namespace is created and used as the local scope
● All assignments of local variables occur in that new namespace
● Function definitions bind function names in that new namespace
● When a class definition is exited, a class object is created
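The namespace created by a class definition can be inspected with the built-in vars function. A minimal sketch, assuming Python 3; the class A here is a small example defined just for this illustration.

```python
class A:
    x = 12                      # assignment lands in the class namespace

    def g(self):                # def binds the name g in the same namespace
        return 'Hello from A!'

# names that the class body bound, ignoring the automatically added __...__ ones
own_names = sorted(name for name in vars(A) if not name.startswith('__'))
print(own_names)                # ['g', 'x']
```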
51. class Statement
● A class statement defines a named class
● class statements can be placed inside functions
● Multiple classes can be defined in one .py file
● A class definition must have at least one statement in its body (pass can be used as a placeholder)
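These points can be illustrated with a short sketch. The names make_counter_class, Counter, and Placeholder are hypothetical and exist only for this example.

```python
def make_counter_class(start):
    # a class statement inside a function; the class can use start
    class Counter:
        """Counts up from a start value fixed by the enclosing function."""

        def __init__(self):
            self.value = start

        def increment(self):
            self.value += 1
            return self.value

    return Counter


class Placeholder:
    pass                        # a body needs at least one statement


Counter = make_counter_class(10)
c = Counter()
print(c.increment())            # 11
```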
52. Class Documentation
● To document a class, place a docstring immediately after the class statement
class <ClassName>:
    """
    Does nothing for the moment
    """
    pass
53. Creating Objects
● There is no new in Python
● Instances of a class are created by writing the class name followed by ()
● This object creation process is called class instantiation:
class SimplePrinter:
    """
    This is class SimplePrinter.
    """
    pass
>>> x = SimplePrinter()
54. Operations Supported by Class Objects
● Class objects support two types of operations: attribute reference and instantiation
__metaclass__ = type
class A:
    ''' this is class A. '''
    x = 12
    def g(self):
        return 'Hello from A!'
55. Class Objects
>>> A.x          ## attribute reference
>>> A.g          ## attribute reference
>>> A.__doc__    ## attribute reference
>>> a = A()      ## a is an instance of class A (instantiation)
56. Defining Class Methods
● In C++ terminology, all class members are public and all class methods are virtual
● All methods are defined with def and must take the instance as their first parameter, which is named self by convention
● One can think of self as this in Java and C++
class SimplePrinter:
    def println(self):
        print
    def print_obj(self, obj):
        print obj,
57. Calling Methods on Instances
● To call a method on an instance, use the dot operator
● Do not put self as the first argument
>>> sp = SimplePrinter()
>>> sp.print_obj([1, 2]); sp.println()
58. Calling Methods on Instances
● What happens to self in sp.println()?
● The definition inside the SimplePrinter class is
def println(self):
    print
● How come self is not the first argument?
59. Calling Methods on Instances
● The statement sp.println() is converted to SimplePrinter.println(sp), so self is bound to sp
● In general, suppose there is a class C with a method f(self, x1, ..., xn)
● Suppose we do:
>>> x = C()
>>> x.f(v1, ..., vn)
● Then x.f(v1, ..., vn) is converted to C.f(x, v1, ..., vn)
60. Example
class C:
    def f(self, x1, x2, x3):
        return [x1, x2, x3]
>>> x = C()
>>> x.f(1, 2, 3)
[1, 2, 3]
>>> C.f(x, 1, 2, 3)    ## equivalent to x.f(1, 2, 3)
[1, 2, 3]
61. Attributes and Attribute References
● The term attribute is used for any name that follows a dot
● For example, in the expression “a.x”, x is an attribute
class A:
    """ This is class A. """
    _x = 0
    _list = []
    def printX(self):
        print self._x,
● A.__doc__, A._x, A.printX, A._list are valid attribute references
62. Types of Attribute Names
● There are two types of attribute names: data attributes and method attributes
class A:
    """ This is class A. """
    _x = 0       ## data attribute _x
    _list = []   ## data attribute _list
    def printX(self):   ## method attribute
        print self._x,
● A.__doc__, A._x, A._list are data attributes
● A.printX is a method attribute
63. Data Attributes
● Data attributes loosely correspond to data members in C++
● A data attribute does not have to be explicitly declared in the class definition
● A data attribute begins to exist when it is first assigned to
● Of course, integrating data attributes into the class definition makes the code easier to read and debug
64. Data Attributes
● This code illustrates that attributes do not have to be declared and begin their existence when they are first assigned to
class B:
    """ This is class B. """
    def __init__(self):
        self._number = 10
        self._list = [1, 2, 3]
>>> b = B()
>>> b._number
10
>>> b._list
[1, 2, 3]
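The same rule applies to assignments made outside the class body. In the sketch below the attribute _label is hypothetical; it is created on one instance only, at the moment it is first assigned.

```python
class B:
    """ This is class B. """
    def __init__(self):
        self._number = 10

b = B()
print(hasattr(b, '_label'))     # False: nothing has been assigned yet
b._label = 'first'              # the data attribute begins to exist here
print(b._label)                 # first
print(hasattr(B(), '_label'))   # False: other instances are unaffected
```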
65. Method Attributes
● Method attributes loosely correspond to member functions in C++
● A method is a function that belongs to a class
● If a is an object of class A, then a.printX is a method object
● Like function objects, method objects can be used outside of their classes, e.g. assigned to variables and called at some later point
66. Method Attributes
● Method attributes loosely correspond to member functions in C++
class A:
    _x = 0
    def printX(self):
        print self._x,
>>> a = A()
>>> m = a.printX
>>> m()
0
>>> a._x = 20
>>> m()
20
67. Method Attributes
● Data attributes override method attributes with the same name
class C:
    def f(self):
        print "I am a C object."
>>> c = C()
>>> c.f()
I am a C object.
>>> c.f = 10
>>> c.f
10
>>> c.f() ### error
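The error in the last line is a TypeError, because c.f now refers to the integer 10 rather than to the method. A minimal sketch, written for Python 3 (the method returns its text instead of printing it); deleting the instance attribute makes the method reachable again.

```python
class C:
    def f(self):
        return 'I am a C object.'

c = C()
c.f = 10                        # instance data attribute hides the method f
try:
    c.f()
except TypeError as err:
    message = str(err)
print(message)                  # 'int' object is not callable

del c.f                         # remove the data attribute from the instance
print(c.f())                    # I am a C object.
```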
68. Method Attributes
● Consistent naming conventions help avoid clashes between data attributes and method attributes
● Choosing a naming convention and using it consistently makes reading and debugging code much easier
● Some naming conventions:
    ● First letter in data attributes is lower case; first letter in method attributes is upper case
    ● First letter in data attributes is underscore; first letter in method attributes is not underscore
    ● Nouns are used for data attributes; verbs are used for methods
69. Reading & References
● www.python.org
● H. Abelson and G. Sussman. Structure and Interpretation of Computer Programs, Ch. 2. MIT Press
● S. Roman. Coding and Information Theory. Springer-Verlag
● M. L. Hetland. Beginning Python: From Novice to Professional, 2nd Ed., Ch. 3. Apress
● M. L. Hetland. Beginning Python: From Novice to Professional, 2nd Ed., Ch. 4. Apress