3. Python Dictionary
Dictionary abstraction provides a lookup table.
Each entry in a dictionary is a <key, value> pair.
The key must be an immutable (hashable) object. The value can be anything.
dictionary[key] evaluates to the value associated with key.
Running time is approximately constant!
Recall: Class 9 on consistent hashing. Python dictionaries use (inconsistent) hashing to make lookups nearly constant time.
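The key restriction can be seen directly. The sketch below is written in Python 3 syntax (an assumption; the slides use Python 2), and the names table and try_list_key are made up for illustration. Strings and tuples work as keys, while a list is rejected with a TypeError.

```python
# Strings, numbers, and tuples of hashable values can all be dictionary keys.
table = {}
table["UVa"] = 1819
table[("Simula", 1962)] = "Nygaard and Dahl"

def try_list_key(d):
    """Return the error message raised when a (mutable) list is used as a key."""
    try:
        d[["Simula", 1962]] = "oops"
    except TypeError as err:
        return str(err)
    return None

message = try_list_key(table)
print(message)
```

The exact wording of the message varies between Python versions, so it is printed here rather than assumed.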
4. Dictionary Example
>>> d = {}                  # Create a new, empty dictionary
>>> d['UVa'] = 1818         # Add an entry: key 'UVa', value 1818
>>> d['UVa'] = 1819         # Update the value: key 'UVa', value 1819
>>> d['Cambridge'] = 1209
>>> d['UVa']
1819
>>> d['Oxford']
Traceback (most recent call last):
  File "<pyshell#93>", line 1, in <module>
    d['Oxford']
KeyError: 'Oxford'
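A lookup for a missing key raises KeyError, as the session above shows. The following sketch (Python 3 syntax, an assumption; variable names are illustrative) lists three common ways to handle a key that may be absent.

```python
d = {"UVa": 1819, "Cambridge": 1209}

# 1. Test membership first; `in` never raises KeyError.
has_oxford = "Oxford" in d

# 2. Use get, which returns a default (None unless one is given).
oxford_year = d.get("Oxford")
oxford_or_zero = d.get("Oxford", 0)

# 3. Attempt the lookup and catch the exception.
try:
    year = d["Oxford"]
except KeyError:
    year = None
```

Which one fits best depends on whether a missing key is an expected case or a real error.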
7. Histogramming
Define a procedure histogram that takes a text
string as its input, and returns a dictionary that
maps each word in the input text to the
number of occurrences in the text.
Useful string method: split([separator])
outputs a list of the words in the string
>>> 'here we go'.split()
['here', 'we', 'go']
>>> "Simula, Nygaard and Dahl, Norway, 1962".split(",")
['Simula', ' Nygaard and Dahl', ' Norway', ' 1962']
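The two calls above behave differently: with no argument, split breaks on any run of whitespace, while an explicit separator is matched literally and leaves the surrounding spaces in place. A small sketch (Python 3 syntax, an assumption; variable names are illustrative) of cleaning up the second case:

```python
line = "Simula, Nygaard and Dahl, Norway, 1962"

# Splitting on "," keeps the space that follows each comma.
raw_fields = line.split(",")

# strip removes leading and trailing whitespace from each field.
clean_fields = [field.strip() for field in raw_fields]

# With no separator, runs of spaces and newlines count as one break,
# and leading or trailing whitespace produces no empty strings.
words = "  here   we\ngo ".split()
```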
8. def histogram(text):
    d = {}
    words = text.split()
>>> histogram("""
"Mathematicians stand on each
others' shoulders and computer
scientists stand on each others' toes."
Richard Hamming""")
{'and': 1, 'on': 2, 'shoulders': 1, 'computer': 1, 'Richard': 1, 'scientists': 1, "others'": 2, 'stand': 2, 'Hamming': 1, 'each': 2, '"Mathematicians': 1, 'toes."': 1}
9. def histogram(text):
    d = {}
    words = text.split()
    for w in words:
        if w in d:
            d[w] = d[w] + 1
        else:
            d[w] = 1
    return d
>>> import urllib
>>> declaration = urllib.urlopen('http://www.cs.virginia.edu/cs1120/readings/declaration.html').read()
>>> histogram(declaration)
{'government,': 1, 'all': 11, 'forbidden': 1, '</title>': 1, '1776</b>': 1, 'hath': 1, 'Caesar': 1, 'invariably': 1, 'settlement': 1, 'Lee,': 2, 'causes': 1, 'whose': 2, 'hold': 3, 'duty,': 1, 'ages,': 2, 'Object': 1, 'suspending': 1, 'to': 66, 'present': 1, 'Providence,': 1, 'under': 1, '<dd>For': 9, 'should.': 1, 'sent': 1, 'Stone,': 1, 'paralleled': 1, …
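The output above shows punctuation and capitalization leaking into the keys ('government,', 'should.'). The sketch below, in Python 3 syntax (an assumption), gives a compact histogram using dict.get plus a hypothetical variant, normalized_histogram, that lower-cases each word and strips surrounding punctuation. The variant is not part of the original slides, and it does not remove HTML tags.

```python
import string
from collections import Counter

def histogram(text):
    """Map each whitespace-separated word in text to its number of occurrences."""
    d = {}
    for w in text.split():
        d[w] = d.get(w, 0) + 1  # a missing key counts as 0
    return d

def normalized_histogram(text):
    """Hypothetical variant: ignore case and surrounding punctuation."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return Counter(w for w in words if w)  # skip tokens that were all punctuation

quote = ('"Mathematicians stand on each others\' shoulders and computer '
         'scientists stand on each others\' toes." Richard Hamming')

plain = histogram(quote)
normalized = normalized_histogram(quote)
```

Counter is a dictionary subclass from the standard library, so the result can be used wherever the plain dictionary is used.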
10. Sorting the Histogram
Expression ::= lambda Parameters : Expression
Makes a procedure, just like Scheme’s lambda (instead of listing parameters in (), separate with :)
sorted(collection, cmp)
Returns a new sorted list of the elements in collection ordered by cmp.
cmp specifies a comparison function of two arguments which should return a negative, zero or positive number depending on whether the first argument is considered smaller than, equal to, or larger than the second argument.
>>> sorted([1,5,3,2,4],<)
SyntaxError: invalid syntax
>>> <
SyntaxError: invalid syntax
>>> sorted([1,5,3,2,4], lambda a, b: a > b)
[1, 5, 3, 2, 4]
>>> sorted([1,5,3,2,4], lambda a, b: a - b)
[1, 2, 3, 4, 5]
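The positional cmp argument used above belongs to Python 2. In Python 3 (assumed here), sorted takes a key function instead, and functools.cmp_to_key adapts an old-style comparison procedure. The sketch also shows why the a > b comparison leaves the list unsorted: it returns True or False, never a negative number, so no element is ever reported as smaller than another.

```python
from functools import cmp_to_key

numbers = [1, 5, 3, 2, 4]

def by_difference(a, b):
    """Old-style comparison: negative, zero, or positive."""
    return a - b

# Adapting a comparison procedure for Python 3's sorted.
ascending = sorted(numbers, key=cmp_to_key(by_difference))
descending = sorted(numbers, key=cmp_to_key(lambda a, b: b - a))

# A boolean result is never negative, so the (stable) sort changes nothing.
unchanged = sorted(numbers, key=cmp_to_key(lambda a, b: a > b))

# For plain numbers the default ordering needs no comparison procedure at all.
simplest = sorted(numbers)
```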
11. Showing the Histogram
def show_histogram(d):
    keys = d.keys()
    okeys = sorted(keys, ...)
    for k in okeys:
        print str(k) + ": " + str(d[k])
12. Showing the Histogram
def show_histogram(d):
    keys = d.keys()
    okeys = sorted(keys,
                   lambda k1, k2: d[k2] - d[k1])
    for k in okeys:
        print str(k) + ": " + str(d[k])
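A Python 3 rendering of the same idea might look like the sketch below (an assumption; the slides use Python 2). It sorts with a key function and reverse=True rather than a comparison procedure. The helper histogram_lines is a made-up name, split out so the ordering can be checked without capturing printed output.

```python
def histogram_lines(d):
    """Return 'word: count' strings ordered from most to least frequent."""
    ordered = sorted(d, key=lambda k: d[k], reverse=True)
    return [str(k) + ": " + str(d[k]) for k in ordered]

def show_histogram(d):
    """Print one 'word: count' line per entry, most frequent first."""
    for line in histogram_lines(d):
        print(line)

counts = {"and": 1, "on": 2, "stand": 2, "each": 2, "toes": 1}
show_histogram(counts)
```

Because sorted is stable, words with equal counts stay in the order in which the dictionary yields them.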
13. Author Fingerprinting
(aka Plagiarism Detection)
“The program identifies phrases of three words
or more in an author’s known work and searches
for them in unattributed plays. In tests where
authors are known to be different, there are up
to 20 matches because some phrases are in
common usage. When Edward III was tested
against Shakespeare’s works published before
1596 there were 200 matches.”
The Times, 12 October 2009
14. def histogram(text):
    d = {}
    words = text.split()
    for w in words:
        if w in d:
            d[w] = d[w] + 1
        else:
            d[w] = 1
    return d
def phrase_collector(text, plen):
    d = {}
    words = text.split()
    words = map(lambda s: s.lower(), words)
    # + 1 so that the final phrase in the text is also counted
    for windex in range(0, len(words) - plen + 1):
        # Dictionary keys must be immutable: convert the
        # (mutable) list to an immutable tuple.
        phrase = tuple(words[windex:windex+plen])
        if phrase in d:
            d[phrase] = d[phrase] + 1
        else:
            d[phrase] = 1
    return d
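In Python 3 (assumed for this sketch), map returns an iterator, so len(words) would fail on its result; a list comprehension avoids that. The range bound is chosen so that the last full window of plen words is included. The sample text is made up for illustration.

```python
def phrase_collector(text, plen):
    """Count every run of plen consecutive (lower-cased) words in text."""
    words = [w.lower() for w in text.split()]
    d = {}
    for windex in range(len(words) - plen + 1):
        # A list slice is mutable and cannot be a key; a tuple can.
        phrase = tuple(words[windex:windex + plen])
        d[phrase] = d.get(phrase, 0) + 1
    return d

sample = "To be or not to be"
phrases = phrase_collector(sample, 2)
```

If the text has fewer than plen words, the range is empty and the result is an empty dictionary.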
15. def common_phrases(d1, d2):
    keys = d1.keys()
    common = {}
    for k in keys:
        if k in d2:
            common[k] = (d1[k], d2[k])
    return common
myhomepage = urllib.urlopen('http://www.cs.virginia.edu/evans/index.html').read()
declaration = urllib.urlopen('http://www.cs.virginia.edu/cs1120/readings/declaration.html').read()
>>> ptj = phrase_collector(declaration, 3)
>>> ptj
{('samuel', 'adams,', 'john'): 1, ('to', 'pass', 'others'): 1, ('absolute', 'despotism,', 'it'): 1, ('a', 'firm', 'reliance'): 1, ('with', 'his', 'measures.'): 1, ('are', 'his.', '<p>'): 1, ('the', 'ruler', 'of'): 1, …
>>> pde = phrase_collector(myhomepage, 3)
>>> common_phrases(ptj, pde)
{('from', 'the', '<a'): (1, 1)}
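The same intersection can be written as a dictionary comprehension. The sketch below uses Python 3 syntax (an assumption), and the two small phrase tables are invented for illustration rather than taken from the pages fetched above.

```python
def common_phrases(d1, d2):
    """Map each phrase found in both tables to its pair of counts."""
    return {k: (d1[k], d2[k]) for k in d1 if k in d2}

# Hypothetical phrase tables standing in for two analyzed texts.
known = {("to", "be"): 2, ("be", "or"): 1, ("or", "not"): 1}
unattributed = {("to", "be"): 1, ("or", "not"): 3, ("not", "at"): 1}

shared = common_phrases(known, unattributed)
match_count = len(shared)
```

The number of shared phrases, match_count here, is the kind of figure the newspaper quote reports (up to 20 for different authors, 200 for Edward III).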
18. Computing in World War II
Cryptanalysis (Lorenz: Colossus at Bletchley
Park, Enigma: Bombes at Bletchley, NCR in US)
Ballistics Tables, calculations for Hydrogen
bomb (ENIAC at U. Pennsylvania)
Batch processing: submit a program and its
data, wait your turn, get a result
Building a flight simulator required a different type of computing:
interactive computing
19. Pre-History:
MIT’s Project Whirlwind (1947-1960s)
Jay Forrester
20. Whirlwind Innovations
Magnetic Core Memory (first version used vacuum tubes)
IBM 704 (used by John McCarthy to create LISP) commercialized this
22. Short or Endless Golden Age of
Nuclear Weapons?
[Chart: yield in kilotons (vertical axis, 0 to 60000) by year (horizontal axis, 1940 to 2020). Labeled points: Hiroshima (12kt), Nagasaki (20kt); First H-Bomb (10Mt); Tsar Bomba (50 Mt, largest ever = 10x all of WWII); B83 (1.2Mt), largest in currently active arsenal.]
27. Objects in Sketchpad
In the process of making the Sketchpad system operate, a few very general
functions were developed which make no reference at all to the specific types
of entities on which they operate. These general functions give the Sketchpad
system the ability to operate on a wide range of problems. The motivation for
making the functions as general as possible came from the desire to get as much
result as possible from the programming effort involved. For example, the general
function for expanding instances makes it possible for Sketchpad to handle any
fixed geometry subpicture. The rewards that come from implementing general
functions are so great that the author has become reluctant to write any
programs for specific jobs. Each of the general functions implemented in the
Sketchpad system abstracts, in some sense, some common property of pictures
independent of the specific subject matter of the pictures themselves.
Ivan Sutherland,
Sketchpad: a Man-Machine Graphical Communication System, 1963
28. Simula
Considered the first
“object-oriented”
programming language
Language designed for
simulation by Kristen
Nygaard and Ole-Johan
Dahl (Norway, 1962)
Had special syntax for
defining classes that
package state and
procedures together
29. Counter in Simula
class counter;
    integer count;
    begin
        procedure reset(); count := 0; end;
        procedure next();
            count := count + 1; end;
        integer procedure current();
            current := count; end;
    end
Does this have everything we need for
“object-oriented programming”?
30. Object-Oriented Programming
Object-Oriented Programming is a state of mind
where you program by thinking about objects
It is difficult to reach that state of mind if your
language doesn’t have mechanisms for
packaging state and procedures (Python has
class, Scheme has lambda expressions)
Other things can help: dynamic
dispatch, inheritance, automatic memory
management, mixins, good donuts, etc.
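The two packaging mechanisms named above can be sketched in Python 3 (an assumption). The class mirrors the Simula counter with its reset, next, and current procedures. The closure make_counter is a hypothetical analogue of the Scheme lambda approach, keeping its state in an enclosing scope instead of in an object.

```python
class Counter:
    """State (count) and procedures packaged together with class."""

    def __init__(self):
        self.count = 0

    def reset(self):
        self.count = 0

    def next(self):
        self.count = self.count + 1

    def current(self):
        return self.count


def make_counter():
    """Same idea with a closure: count lives in the enclosing scope."""
    count = 0

    def next_value():
        nonlocal count  # rebind the enclosing variable rather than a local one
        count = count + 1
        return count

    return next_value


c = Counter()
c.next()
c.next()
tick = make_counter()
tick()
```

Each Counter object and each procedure returned by make_counter carries its own independent count.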
31. Charge
Monday: continue OOP history
PS6 Due Monday
Next week:
PS7 Due next Monday: only one week!
building a (mini) Scheme interpreter
(in Python and Java)
Reminder: Peter has office hours now!
(going over to Rice)