A Taste of Python

Presented by Jordan Baker
    October 23, 2009
    DevDays Toronto
About Me

• Open Source Developer
• Founder of Open Source Web Application
  and CMS service provider: Scryent -
  www.scryent.com
• Founder of Toronto Plone Users Group -
  www.torontoplone.ca
Agenda

• About Python
• Show me your CODE
• A Spell Checker in 21 lines of code
• Why Python ROCKS
• Resources for further exploration
About Python

Photo: http://www.flickr.com/photos/schoffer/196079076/
About Python


• Gotta love a language named after Monty
  Python’s Flying Circus
• Used in more places than you might know
Significant Whitespace
C-like

if(x == 2) {
    do_something();
}
do_something_else();

Python

if x == 2:
    do_something()
do_something_else()
Significant Whitespace

• Less code clutter
• Eliminates many common syntax errors
• Enforces a consistent code layout
• Use an indentation-aware editor or IDE
• Get over it!
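A small sketch of why the whitespace matters, written for Python 3 (the function names are made up for illustration). Moving one line out by a single indentation level changes which block it belongs to, and a missing indent is rejected before the code ever runs.

```python
def count_evens(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += 1  # inside the if: counts only even numbers
    return total


def count_all(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            pass
        total += 1      # one level out: counts every number
    return total


# The body of the if is not indented, so Python refuses to compile it.
bad_source = "if x == 2:\ndo_something()\n"
try:
    compile(bad_source, "<example>", "exec")
    rejected = False
except IndentationError:
    rejected = True

print(count_evens([1, 2, 3, 4]), count_all([1, 2, 3, 4]), rejected)
```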
Python is Interactive

Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
FIZZ BUZZ
1
2
FIZZ
4
BUZZ
...
14
FIZZ BUZZ
FIZZ BUZZ
def fizzbuzz(n):
    # Start at 1: 0 is divisible by both 3 and 5 and would print "Fizz Buzz" first.
    for i in range(1, n + 1):
        if not i % 3:
            print "Fizz",
        if not i % 5:
            print "Buzz",
        if i % 3 and i % 5:
            print i,
        print

fizzbuzz(50)
FIZZ BUZZ (OO)
class FizzBuzzWriter(object):
    def __init__(self, limit):
        self.limit = limit

    def run(self):
        for n in range(1, self.limit + 1):
            self.write_number(n)

    def write_number(self, n):
        if not n % 3:
            print "Fizz",
        if not n % 5:
            print "Buzz",
        if n % 3 and n % 5:
            print n,
        print

fizzbuzz = FizzBuzzWriter(50)
fizzbuzz.run()
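The slides use Python 2, where print is a statement. On Python 3 the same logic might look like the sketch below. The name fizzbuzz_lines is chosen here for illustration. Returning the lines instead of printing them is a design choice that makes the function easy to test.

```python
def fizzbuzz_lines(limit):
    """Return the FizzBuzz lines for 1..limit as a list of strings."""
    lines = []
    for n in range(1, limit + 1):
        parts = []
        if n % 3 == 0:
            parts.append("Fizz")
        if n % 5 == 0:
            parts.append("Buzz")
        # An empty parts list means n is divisible by neither 3 nor 5.
        lines.append(" ".join(parts) or str(n))
    return lines


for line in fizzbuzz_lines(15):
    print(line)
```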
A Spell Checker in 21 Lines of Code
• Written by Peter Norvig
• Duplicated in many languages
• Simple Spellchecking algorithm based on
  probability
• http://norvig.com/spell-correct.html
The Approach
•   Census by frequency

•   Morph the word (werd)

    •   Insertions: waerd, wberd, werzd

    •   Deletions: wrd, wed, wer

    •   Transpositions: ewrd, wred, wedr

    •   Replacements: aerd, ward, wbrd, word, wzrd,
        werz

•   Find the one with the highest frequency: were
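The steps above can be sketched with a toy census (Python 3). The counts below are invented for illustration and do not come from any real corpus. Only a handful of the morphs of "werd" are listed.

```python
# Hypothetical word counts standing in for a census of a real corpus.
census = {"were": 4289, "word": 298, "ward": 61, "wed": 5}

# A few morphs of "werd": one insertion, two deletions, one transposition
# and four replacements.
morphs = ["waerd", "wrd", "wed", "wred", "ward", "word", "were", "werz"]

# Keep only the morphs that are real words, then pick the most frequent one.
known_morphs = [m for m in morphs if m in census]
best = max(known_morphs, key=census.get)

print(known_morphs)
print(best)
```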
Norvig Spellchecker
import re, collections

def words(text):
    return re.findall('[a-z]+', text.lower())

def train(words):
    model = collections.defaultdict(int)
    for w in words:
        model[w] += 1
    return model

NWORDS = train(words(file('big.txt').read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
    replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts    = [a + c + b     for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words):
    return set(w for w in words if w in NWORDS)

def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)
Regular Expressions

def words(text):
    return re.findall('[a-z]+', text.lower())

>>> words("The cat in the hat!")
['the', 'cat', 'in', 'the', 'hat']
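One consequence of the pattern '[a-z]+' is that anything outside a to z acts as a separator, so apostrophes split words and digits disappear. A quick check (Python 3 syntax, with a sample sentence made up here):

```python
import re


def words(text):
    return re.findall('[a-z]+', text.lower())


# "Don't" and "it's" are each split at the apostrophe; "42" is dropped.
print(words("Don't panic: it's 42!"))
```

For a frequency-based spell checker this is usually acceptable, but it does mean fragments such as "t" and "s" end up in the model.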
Dictionaries
>>> d = {'cat':1}
>>> d
{'cat': 1}
>>> d['cat']
1

>>> d['cat'] += 1
>>> d
{'cat': 2}

>>> d['dog'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'dog' 
defaultdict
# Has a factory for missing keys
>>> d = collections.defaultdict(int)
>>> d['dog'] += 1
>>> d
defaultdict(<type 'int'>, {'dog': 1})

>>> int
<type 'int'>
>>> int()
0

def train(words):
    model = collections.defaultdict(int)
    for w in words:
        model[w] += 1
    return model

>>> train(words("The cat in the hat!"))
defaultdict(<type 'int'>, {'cat': 1, 'the': 2, 'hat': 1, 'in': 1})
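A possible alternative, assuming Python 2.7 or newer (the slides run 2.6.1, which predates it): collections.Counter does the same counting in one call. This is a sketch and not part of the original 21 lines.

```python
import collections
import re


def words(text):
    return re.findall('[a-z]+', text.lower())


def train(words):
    # Counter tallies every item of the iterable in a single call.
    return collections.Counter(words)


model = train(words("The cat in the hat!"))
print(model["the"])
print(model.most_common(1))
```

Counter also provides most_common(), which is handy for inspecting the most frequent words in a corpus.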
Reading the File
     >>> text = file('big.txt').read()
     >>> NWORDS = train(words(text))
     >>> NWORDS
     defaultdict(<type 'int'>, {'nunnery': 3, 'presnya': 1, 'woods': 22, 'clotted': 1, 'spiders': 1,
     'hanging': 42, 'disobeying': 2, 'scold': 3, 'originality': 6,
     'grenadiers': 8, 'pigment': 16, 'appropriation': 6, 'strictest': 1,
     'bringing': 48, 'revelers': 1, 'wooded': 8, 'wooden': 37,
     'wednesday': 13, 'shows': 50, 'immunities': 3, 'guardsmen': 4,
     'sooty': 1, 'inevitably': 32, 'clavicular': 9, 'sustaining': 5,
     'consenting': 1, 'scraped': 21, 'errors': 16, 'semicircular': 1,
     'cooking': 6, 'spiroch': 25, 'designing': 1, 'pawed': 1,
     'succumb': 12, 'shocks': 1, 'crouch': 2, 'chins': 1, 'awistocwacy': 1,
     'sunbeams': 1, 'perforations': 6, 'china': 43, 'affiliated': 4,
     'chunk': 22, 'natured': 34, 'uplifting': 1, 'slaveholders': 2,
     'climbed': 13, 'controversy': 33, 'natures': 2, 'climber': 1,
     'lency': 2, 'joyousness': 1, 'reproaching': 3, 'insecurity': 1,
     'abbreviations': 1, 'definiteness': 1, 'music': 56, 'therefore': 186,
     'expeditionary': 3, 'primeval': 1, 'unpack': 1, 'circumstances': 107,
     ... (about 6500 more lines) ...

     >>> NWORDS['the']
     80030
     >>> NWORDS['unusual']
     32
     >>> NWORDS['cephalopod']
     0
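One detail about that last lookup is worth knowing: indexing a defaultdict with a missing key stores the default value, so the lookup itself adds 'cephalopod' to the model with a count of 0. That matters because known() tests membership with "in". A minimal sketch with a toy model (Python 3 syntax, with names chosen for illustration):

```python
import collections

counts = collections.defaultdict(int)
counts["the"] += 1

before = "cephalopod" in counts   # not in the model yet
value = counts["cephalopod"]      # returns 0 and inserts the key
after = "cephalopod" in counts    # now reported as a member

# get() reads without inserting, so it is safe for lookups.
safe = counts.get("octopus", 0)

print(before, value, after, safe, "octopus" in counts)
```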
Training the Probability Model
import re, collections

def words(text):
    return re.findall('[a-z]+', text.lower())

def train(words):
    model = collections.defaultdict(int)
    for w in words:
        model[w] += 1
    return model

NWORDS = train(words(file('big.txt').read()))
List Comprehensions
# These two are equivalent:

result = []
for v in iter:
    if cond:
        result.append(expr)


[ expr for v in iter if cond ]


# You can nest loops also:

result = []
for v1 in iter1:
    for v2 in iter2:
        if cond:
            result.append(expr)


[ expr for v1 in iter1 for v2 in iter2 if cond ]
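A concrete instance of both patterns, with sample data made up for illustration (Python 3 syntax):

```python
menu = ["spam", "eggs", "ham"]

# Loop form: collect the upper-cased items longer than three letters.
shouted_loop = []
for item in menu:
    if len(item) > 3:
        shouted_loop.append(item.upper())

# Comprehension form of the same thing.
shouted = [item.upper() for item in menu if len(item) > 3]

# Nested loops: try every prefix with every suffix, keep the real menu items.
pairs = [a + b for a in ["sp", "h"] for b in ["am", "em"] if a + b in menu]

print(shouted_loop, shouted, pairs)
```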


 
String Slicing
>>> word = "spam"
>>> word[:1]
's'
>>> word[1:]
'pam'

>>> (word[:1], word[1:])
('s', 'pam')

>>> range(len(word) + 1)
[0, 1, 2, 3, 4]

>>> [(word[:i], word[i:]) for i in range(len(word) + 1)]
[('', 'spam'), ('s', 'pam'), ('sp', 'am'), ('spa', 'm'),
('spam', '')]
Deletions
>>> word = "spam"
>>> s = [(word[:i], word[i:]) for i in range(len(word) + 1)]

>>> deletes = [a + b[1:] for a, b in s if b]

>>> deletes
['pam', 'sam', 'spm', 'spa']

>>> a, b = ('s', 'pam')
>>> a
's'
>>> b
'pam'

>>> bool('pam')
True
>>> bool('')
False
Transpositions

For example: teh => the

>>> transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]

>>> transposes
['psam', 'sapm', 'spma']
Replacements

>>> alphabet = "abcdefghijklmnopqrstuvwxyz"

>>> replaces = [a + c + b[1:]  for a, b in s for c in alphabet if b]
>>> replaces
['apam', 'bpam', ..., 'zpam', 'saam', ..., 'szam', ..., 'spaz']
Insertions

>>> alphabet = "abcdefghijklmnopqrstuvwxyz"

>>> inserts = [a + c + b  for a, b in s for c in alphabet]
>>> inserts
['aspam', ..., 'zspam', 'sapam', ..., 'szpam', 'spaam', ..., 'spamz']
Find all Edits
alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b  for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

>>> edits1("spam")
set(['sptm', 'skam', 'spzam', 'vspam', 'spamj', 'zpam', 'sbam',
'spham', 'snam', 'sjpam', 'spma', 'swam', 'spaem', 'tspam', 'spmm',
'slpam', 'upam', 'spaim', 'sppm', 'spnam', 'spem', 'sparm', 'spamr',
'lspam', 'sdpam', 'spams', 'spaml', 'spamm', 'spamn', 'spum',
'spamh', 'spami', 'spatm', 'spamk', 'spamd', ..., 'spcam', 'spamy'])
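For a word of length n, the four lists hold n deletions, n - 1 transpositions, 26n replacements and 26(n + 1) insertions, which is 54n + 25 strings before set() removes duplicates. The sketch below checks that arithmetic for "spam" (Python 3 syntax; edit_counts is a helper name introduced here, not part of the original program).

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'


def edit_counts(word):
    """Count the strings produced by each kind of single edit."""
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    everything = deletes + transposes + replaces + inserts
    return {
        'deletes': len(deletes),
        'transposes': len(transposes),
        'replaces': len(replaces),
        'inserts': len(inserts),
        'total': len(everything),
        'unique': len(set(everything)),
    }


counts = edit_counts("spam")
print(counts)
```

The unique count is smaller than the total because, for example, replacing a letter with itself yields the original word once per position.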
Known Words
def known(words):
    """ Return the known words from `words`. """
    return set(w for w in words if w in NWORDS)
Correct
def known(words):
    """ Return the known words from `words`. """
    return set(w for w in words if w in NWORDS)

def correct(word):
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=NWORDS.get)

>>> bool(set([]))
False

>>> correct("computr")
'computer'

>>> correct("computor")
'computer'

>>> correct("computerr")
'computer'
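correct() leans on two idioms: "or" returns its first truthy operand (an empty set is falsy), and max() accepts a key function. A sketch with a hypothetical two-word model (Python 3 syntax):

```python
# Hypothetical counts, for illustration only.
counts = {"computer": 20, "compute": 7}

# The empty set is falsy, so "or" moves on to the next operand.
candidates = set() or {"compute", "computer"} or ["fallback"]

# key=counts.get ranks each candidate by its count.
best = max(candidates, key=counts.get)

# When nothing is known, the chain falls through to the original word.
fallback = set() or set() or ["xyzzy"]
unchanged = max(fallback, key=counts.get)

print(best, unchanged)
```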
Edit Distance 2
def known_edits2(word):
    return set(
        e2
            for e1 in edits1(word)
                for e2 in edits1(e1)
                    if e2 in NWORDS
        )

def correct(word):
    candidates = (known([word]) or known(edits1(word)) or
                  known_edits2(word) or [word])
    return max(candidates, key=NWORDS.get)

>>> correct("conpuler")
'computer'
>>> correct("cmpuler")
'computer'
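Both misspellings above are two single edits away from "computer", which is why the one-edit search misses them. The sketch below checks this without any dictionary (Python 3 syntax; edits2 is an unfiltered helper introduced here for illustration).

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'


def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def edits2(word):
    # Every string reachable with two single edits, with no dictionary filter.
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1))


one_edit = "computer" in edits1("conpuler")
two_edits = "computer" in edits2("conpuler")
also_two = "computer" in edits2("cmpuler")
print(one_edit, two_edits, also_two)
```

The unfiltered set is large, which is why known_edits2 keeps only the strings found in NWORDS.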
Comparing Python & Java Versions

• http://raelcunha.com/spell-correct.php
• 35 lines of Java
import java.io.*;
import java.util.*;
import java.util.regex.*;


class Spelling {

    private final HashMap<String, Integer> nWords = new HashMap<String, Integer>();

    public Spelling(String file) throws IOException {
        BufferedReader in = new BufferedReader(new FileReader(file));
        Pattern p = Pattern.compile("\\w+");
        for(String temp = ""; temp != null; temp = in.readLine()){
            Matcher m = p.matcher(temp.toLowerCase());
            while(m.find()) nWords.put((temp = m.group()), nWords.containsKey(temp) ? nWords.get(temp) + 1 : 1);
        }
        in.close();
    }

    private final ArrayList<String> edits(String word) {
        ArrayList<String> result = new ArrayList<String>();
        for(int i=0; i < word.length(); ++i) result.add(word.substring(0, i) + word.substring(i+1));
        for(int i=0; i < word.length()-1; ++i) result.add(word.substring(0, i) + word.substring(i+1, i+2) + word.substring(i, i+1) + word.substring(i+2));
        for(int i=0; i < word.length(); ++i) for(char c='a'; c <= 'z'; ++c) result.add(word.substring(0, i) + String.valueOf(c) + word.substring(i+1));
        for(int i=0; i <= word.length(); ++i) for(char c='a'; c <= 'z'; ++c) result.add(word.substring(0, i) + String.valueOf(c) + word.substring(i));
        return result;
    }

    public final String correct(String word) {
        if(nWords.containsKey(word)) return word;
        ArrayList<String> list = edits(word);
        HashMap<Integer, String> candidates = new HashMap<Integer, String>();
        for(String s : list) if(nWords.containsKey(s)) candidates.put(nWords.get(s),s);
        if(candidates.size() > 0) return candidates.get(Collections.max(candidates.keySet()));
        for(String s : list) for(String w : edits(s)) if(nWords.containsKey(w)) candidates.put(nWords.get(w),w);
        return candidates.size() > 0 ? candidates.get(Collections.max(candidates.keySet())) : word;
    }

    public static void main(String args[]) throws IOException {
        if(args.length > 0) System.out.println((new Spelling("big.txt")).correct(args[0]));
    }

}
import re, collections

def words(text):
    return re.findall('[a-z]+', text.lower())

def train(words):
    model = collections.defaultdict(int)
    for w in words:
        model[w] += 1
    return model

NWORDS = train(words(file('big.txt').read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
    replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts    = [a + c + b     for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words):
    return set(w for w in words if w in NWORDS)

def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)
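The listing above is Python 2: file() no longer exists in Python 3. A possible Python 3 port is sketched below. It trains on a short in-memory string so it runs on its own. That corpus is an assumption made for this sketch and stands in for big.txt.

```python
import collections
import re


def words(text):
    return re.findall('[a-z]+', text.lower())


def train(text):
    return collections.Counter(words(text))


# Stand-in corpus (an assumption for this sketch). With the real data:
#     with open('big.txt') as f:
#         NWORDS = train(f.read())
NWORDS = train("the computer computes; a computer is a machine, the word is out")

alphabet = 'abcdefghijklmnopqrstuvwxyz'


def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)


def known(candidates):
    return set(w for w in candidates if w in NWORDS)


def correct(word):
    candidates = (known([word]) or known(edits1(word)) or
                  known_edits2(word) or [word])
    return max(candidates, key=NWORDS.get)


print(correct("computr"))
print(correct("conpuler"))
```

Using a with block closes the file promptly, and Counter avoids the side effect of a defaultdict gaining keys on lookup.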
IDE for Python

• IDEs for Python include:
 • PyDev for Eclipse
 • WingIDE
 • IDLE for Windows/Linux/Mac
 • and more
Why Python ROCKS
• Elegant and readable language - “Executable
  Pseudocode”
• Standard Libraries - “Batteries Included”
• Very high-level data types
• Dynamically Typed
• It’s FUN!
An Open Source Community

• Projects: Plone, Zope, Grok, BFG, Django,
  SciPy & NumPy, Google App Engine,
  PyGame
• PyCon
Resources
• PyGTA
• Toronto Plone Users
• Toronto Django Users
• Stack Overflow
• Dive Into Python
• Python Tutorial
Thanks

• I’d love to hear your questions or
  comments on this presentation. Reach me
  at:
  • jbb@scryent.com
  • http://twitter.com/hexsprite

The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

A Taste of Python - Devdays Toronto 2009

  • 1. a taste of Python • Presented by Jordan Baker, October 23, 2009, DevDays Toronto
  • 2. About Me • Open Source Developer • Founder of Open Source Web Application and CMS service provider: Scryent - www.scryent.com • Founder of Toronto Plone Users Group - www.torontoplone.ca
  • 3. Agenda • About Python • Show me your CODE • A Spell Checker in 21 lines of code • Why Python ROCKS • Resources for further exploration
  • 5. About Python • Gotta love a language named after Monty Python’s Flying Circus • Used in more places than you might know
  • 6. Significant Whitespace
      C-like
          if(x == 2) {
              do_something();
          }
          do_something_else();
      Python
          if x == 2:
              do_something()
          do_something_else()
  • 7. Significant Whitespace • less code clutter • eliminates many common syntax errors • proper code layout • use an indentation aware editor or IDE • Get over it!
  • 8. Python is Interactive
      Python 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)
      [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>>
  • 10. FIZZ BUZZ
      def fizzbuzz(n):
          # Start at 1: range(n + 1) would begin at 0 and print "Fizz Buzz" first
          for i in range(1, n + 1):
              if not i % 3:
                  print "Fizz",
              if not i % 5:
                  print "Buzz",
              if i % 3 and i % 5:
                  print i,
              print

      fizzbuzz(50)
  • 12. FIZZ BUZZ (OO)
      class FizzBuzzWriter(object):
          def __init__(self, limit):
              self.limit = limit

          def run(self):
              for n in range(1, self.limit + 1):
                  self.write_number(n)

          def write_number(self, n):
              if not n % 3:
                  print "Fizz",
              if not n % 5:
                  print "Buzz",
              if n % 3 and n % 5:
                  print n,
              print

      fizzbuzz = FizzBuzzWriter(50)
      fizzbuzz.run()
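The FizzBuzz slides use Python 2 print statements, which no longer parse in Python 3. The following is a minimal Python 3 sketch of the same logic, not the presenter's code; the helper names `fizzbuzz_line` and `fizzbuzz` returning a list are assumptions made here so the result can be inspected instead of printed.

```python
def fizzbuzz_line(n):
    """Return 'Fizz', 'Buzz', 'Fizz Buzz', or the number as text."""
    parts = []
    if n % 3 == 0:
        parts.append("Fizz")
    if n % 5 == 0:
        parts.append("Buzz")
    # Fall back to the number itself when neither rule matched.
    return " ".join(parts) if parts else str(n)


def fizzbuzz(limit):
    """Return the lines for 1..limit as a list of strings."""
    return [fizzbuzz_line(n) for n in range(1, limit + 1)]


if __name__ == "__main__":
    print("\n".join(fizzbuzz(15)))
```

Returning strings rather than printing keeps the rule in one small function that is easy to check at the interactive prompt.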
  • 13. A Spell Checker in 21 Lines of Code • Written by Peter Norvig • Duplicated in many languages • Simple Spellchecking algorithm based on probability • http://norvig.com/spell-correct.html
  • 14. The Approach • Census by frequency • Morph the word (werd) • Insertions: waerd, wberd, werzd • Deletions: wrd, wed, wer • Transpositions: ewrd, wred, wedr • Replacements: aerd, ward, wbrd, word, wzrd, werz • Find the one with the highest frequency: were
  • 15. Norvig Spellchecker
      import re, collections

      def words(text):
          return re.findall('[a-z]+', text.lower())

      def train(words):
          model = collections.defaultdict(int)
          for w in words:
              model[w] += 1
          return model

      NWORDS = train(words(file('big.txt').read()))
      alphabet = 'abcdefghijklmnopqrstuvwxyz'

      def edits1(word):
          s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
          deletes    = [a + b[1:] for a, b in s if b]
          transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
          replaces   = [a + c + b[1:] for a, b in s for c in alphabet if b]
          inserts    = [a + c + b     for a, b in s for c in alphabet]
          return set(deletes + transposes + replaces + inserts)

      def known_edits2(word):
          return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

      def known(words):
          return set(w for w in words if w in NWORDS)

      def correct(word):
          candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
          return max(candidates, key=NWORDS.get)
  • 16. Regular Expressions
      def words(text):
          return re.findall('[a-z]+', text.lower())

      >>> words("The cat in the hat!")
      ['the', 'cat', 'in', 'the', 'hat']
  • 17. Dictionaries
      >>> d = {'cat':1}
      >>> d
      {'cat': 1}
      >>> d['cat']
      1
      >>> d['cat'] += 1
      >>> d
      {'cat': 2}
      >>> d['dog'] += 1
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      KeyError: 'dog'
  • 18. defaultdict
      # Has a factory for missing keys
      >>> d = collections.defaultdict(int)
      >>> d['dog'] += 1
      >>> d
      defaultdict(<type 'int'>, {'dog': 1})
      >>> int
      <type 'int'>
      >>> int()
      0

      def train(words):
          model = collections.defaultdict(int)
          for w in words:
              model[w] += 1
          return model

      >>> train(words("The cat in the hat!"))
      defaultdict(<type 'int'>, {'cat': 1, 'the': 2, 'hat': 1, 'in': 1})
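As a side note to the defaultdict slide: later Python versions also ship `collections.Counter`, which does the same tallying in one call. The sketch below is written for Python 3 and is not part of the talk; it assumes the `words` and `train` definitions from the slides (with the parameter renamed to `word_list` to avoid shadowing) and compares the two approaches on the slide's sample sentence.

```python
import re
from collections import Counter, defaultdict


def words(text):
    return re.findall('[a-z]+', text.lower())


def train(word_list):
    # Same counting loop as the slide: missing keys start at int() == 0.
    model = defaultdict(int)
    for w in word_list:
        model[w] += 1
    return model


sample = words("The cat in the hat!")
by_defaultdict = train(sample)
by_counter = Counter(sample)

# Both hold the same counts for the sample sentence.
print(dict(by_defaultdict) == dict(by_counter))

# Counter reports 0 for an unseen word without storing it ...
print(by_counter["dog"], "dog" in by_counter)
# ... while indexing a defaultdict stores the missing key as a side effect.
print(by_defaultdict["dog"], "dog" in by_defaultdict)
```

The last two lines show a difference worth knowing: looking up an unseen word with `[]` on a defaultdict adds it with a count of 0, whereas `in` and `.get` (which the spell checker uses) leave the model untouched.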
  • 19. Reading the File
      >>> text = file('big.txt').read()
      >>> NWORDS = train(words(text))
      >>> NWORDS
      defaultdict(<type 'int'>, {'nunnery': 3, 'presnya': 1, 'woods': 22, 'clotted': 1, 'spiders': 1,
      'hanging': 42, 'disobeying': 2, 'scold': 3, 'originality': 6,
      'grenadiers': 8, 'pigment': 16, 'appropriation': 6, 'strictest': 1,
      'bringing': 48, 'revelers': 1, 'wooded': 8, 'wooden': 37,
      'wednesday': 13, 'shows': 50, 'immunities': 3, 'guardsmen': 4,
      'sooty': 1, 'inevitably': 32, 'clavicular': 9, 'sustaining': 5,
      'consenting': 1, 'scraped': 21, 'errors': 16, 'semicircular': 1,
      'cooking': 6, 'spiroch': 25, 'designing': 1, 'pawed': 1,
      'succumb': 12, 'shocks': 1, 'crouch': 2, 'chins': 1, 'awistocwacy': 1,
      'sunbeams': 1, 'perforations': 6, 'china': 43, 'affiliated': 4,
      'chunk': 22, 'natured': 34, 'uplifting': 1, 'slaveholders': 2,
      'climbed': 13, 'controversy': 33, 'natures': 2, 'climber': 1,
      'lency': 2, 'joyousness': 1, 'reproaching': 3, 'insecurity': 1,
      'abbreviations': 1, 'definiteness': 1, 'music': 56, 'therefore': 186,
      'expeditionary': 3, 'primeval': 1, 'unpack': 1, 'circumstances': 107,
      ... (about 6500 more lines) ...
      >>> NWORDS['the']
      80030
      >>> NWORDS['unusual']
      32
      >>> NWORDS['cephalopod']
      0
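The `file()` built-in used on this slide exists only in Python 2. A rough Python 3 equivalent is sketched below; the `load_model` helper, the UTF-8 encoding, and the temporary stand-in corpus are all assumptions for illustration, so that the example runs without `big.txt`.

```python
import collections
import os
import re
import tempfile


def words(text):
    return re.findall('[a-z]+', text.lower())


def train(word_list):
    model = collections.defaultdict(int)
    for w in word_list:
        model[w] += 1
    return model


def load_model(path):
    """Read a text corpus from `path` and return its word-frequency model."""
    # open() replaces Python 2's file(); the with-block closes the file.
    # The encoding is an assumption; adjust it to match the real corpus.
    with open(path, encoding="utf-8") as corpus:
        return train(words(corpus.read()))


# Stand-in corpus (hypothetical) written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("The cat in the hat. The end.")
    corpus_path = tmp.name

NWORDS = load_model(corpus_path)
os.remove(corpus_path)

print(NWORDS["the"])
```

With the real corpus, the call would simply be `load_model('big.txt')`.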
  • 20. Training the Probability Model
      import re, collections

      def words(text):
          return re.findall('[a-z]+', text.lower())

      def train(words):
          model = collections.defaultdict(int)
          for w in words:
              model[w] += 1
          return model

      NWORDS = train(words(file('big.txt').read()))
  • 21. List Comprehensions
      # These two are equivalent:
      result = []
      for v in iter:
          if cond:
              result.append(expr)

      [ expr for v in iter if cond ]

      # You can nest loops also:
      result = []
      for v1 in iter1:
          for v2 in iter2:
              if cond:
                  result.append(expr)

      [ expr for v1 in iter1 for v2 in iter2 if cond ]
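The list comprehension slide uses placeholders (`expr`, `iter`, `cond`). A concrete instance may help; the values below are chosen purely for illustration and are not from the talk.

```python
# Loop form: squares of the even numbers below 10.
result = []
for v in range(10):
    if v % 2 == 0:
        result.append(v * v)

# Comprehension form of the same loop.
squares = [v * v for v in range(10) if v % 2 == 0]

# Nested loops: the leftmost `for` is the outer loop.
pairs = [a + b for a in "ab" for b in "12"]

print(result == squares)
print(squares)
print(pairs)
```

The nested form is the shape used by `replaces` and `inserts` in the spell checker, where the outer loop walks the split points and the inner loop walks the alphabet.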
  • 22. String Slicing
      >>> word = "spam"
      >>> word[:1]
      's'
      >>> word[1:]
      'pam'
      >>> (word[:1], word[1:])
      ('s', 'pam')
      >>> range(len(word) + 1)
      [0, 1, 2, 3, 4]
      >>> [(word[:i], word[i:]) for i in range(len(word) + 1)]
      [('', 'spam'), ('s', 'pam'), ('sp', 'am'), ('spa', 'm'), ('spam', '')]
  • 23. Deletions
      >>> word = "spam"
      >>> s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
      >>> deletes = [a + b[1:] for a, b in s if b]
      >>> deletes
      ['pam', 'sam', 'spm', 'spa']
      >>> a, b = ('s', 'pam')
      >>> a
      's'
      >>> b
      'pam'
      >>> bool('pam')
      True
      >>> bool('')
      False
  • 24. Transpositions
      For example: teh => the
      >>> transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
      >>> transposes
      ['psam', 'sapm', 'spma']
  • 25. Replacements
      >>> alphabet = "abcdefghijklmnopqrstuvwxyz"
      >>> replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
      >>> replaces
      ['apam', 'bpam', ..., 'zpam', 'saam', ..., 'szam', ..., 'spaz']
  • 26. Insertion
      >>> alphabet = "abcdefghijklmnopqrstuvwxyz"
      >>> inserts = [a + c + b for a, b in s for c in alphabet]
      >>> inserts
      ['aspam', ..., 'zspam', 'sapam', ..., 'szpam', 'spaam', ..., 'spamz']
  • 27. Find all Edits
      alphabet = 'abcdefghijklmnopqrstuvwxyz'

      def edits1(word):
          s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
          deletes = [a + b[1:] for a, b in s if b]
          transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
          replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
          inserts = [a + c + b for a, b in s for c in alphabet]
          return set(deletes + transposes + replaces + inserts)

      >>> edits1("spam")
      set(['sptm', 'skam', 'spzam', 'vspam', 'spamj', 'zpam', 'sbam', 'spham',
      'snam', 'sjpam', 'spma', 'swam', 'spaem', 'tspam', 'spmm', 'slpam',
      'upam', 'spaim', 'sppm', 'spnam', 'spem', 'sparm', 'spamr', 'lspam',
      'sdpam', 'spams', 'spaml', 'spamm', 'spamn', 'spum', 'spamh', 'spami',
      'spatm', 'spamk', 'spamd', ..., 'spcam', 'spamy'])
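To get a feel for how many candidates `edits1` generates, the four lists can be counted before they are merged into a set. For a word of length n there are n deletions, n - 1 transpositions, 26n replacements, and 26(n + 1) insertions, or 54n + 25 in total. The sketch below checks this for "spam"; `edit_lists` is a hypothetical helper that reuses the slide's comprehensions and returns the lists separately.

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'


def edit_lists(word):
    """Return the four candidate lists that edits1 merges into one set."""
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return deletes, transposes, replaces, inserts


deletes, transposes, replaces, inserts = edit_lists("spam")
n = len("spam")
counts = (len(deletes), len(transposes), len(replaces), len(inserts))

print(counts)                      # (4, 3, 104, 130)
print(sum(counts) == 54 * n + 25)  # True: 241 candidates before de-duplication

# The set is smaller because some edits coincide, e.g. replacing a letter
# with itself yields the original word once per position.
unique = set(deletes + transposes + replaces + inserts)
print(len(unique) < sum(counts))   # True
```

This is why `known_edits2` filters against `NWORDS` while generating: applying `edits1` twice produces tens of thousands of strings, nearly all of them non-words.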
  • 28. Known Words
      def known(words):
          """ Return the known words from `words`. """
          return set(w for w in words if w in NWORDS)
  • 29. Correct
      def known(words):
          """ Return the known words from `words`. """
          return set(w for w in words if w in NWORDS)

      def correct(word):
          candidates = known([word]) or known(edits1(word)) or [word]
          return max(candidates, key=NWORDS.get)

      >>> bool(set([]))
      False
      >>> correct("computr")
      'computer'
      >>> correct("computor")
      'computer'
      >>> correct("computerr")
      'computer'
  • 30. Edit Distance 2
      def known_edits2(word):
          return set(
              e2
                  for e1 in edits1(word)
                      for e2 in edits1(e1)
                          if e2 in NWORDS
              )

      def correct(word):
          # Parentheses let the expression continue onto the next line
          candidates = (known([word]) or known(edits1(word)) or
                        known_edits2(word) or [word])
          return max(candidates, key=NWORDS.get)

      >>> correct("conpuler")
      'computer'
      >>> correct("cmpuler")
      'computer'
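Putting the pieces together, here is a self-contained Python 3 sketch of the spell checker from the slides. The one-sentence `CORPUS` is a stand-in invented for this example (the talk trains on `big.txt`), so the corrections only cover the handful of words it contains; the function bodies follow the slides.

```python
import collections
import re

# Hypothetical miniature corpus; the talk uses big.txt instead.
CORPUS = ("the computer was new but the other computer was slow "
          "and computers were rare")

alphabet = 'abcdefghijklmnopqrstuvwxyz'


def words(text):
    return re.findall('[a-z]+', text.lower())


def train(word_list):
    model = collections.defaultdict(int)
    for w in word_list:
        model[w] += 1
    return model


NWORDS = train(words(CORPUS))


def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)


def known(word_list):
    return set(w for w in word_list if w in NWORDS)


def correct(word):
    # `or` returns the first non-empty set, so closer edits win over
    # more distant ones; the word itself is the last resort.
    candidates = (known([word]) or known(edits1(word)) or
                  known_edits2(word) or [word])
    # Among equally distant candidates, pick the most frequent word.
    return max(candidates, key=NWORDS.get)


print(correct("computr"))   # one insertion away from "computer"
print(correct("computes"))  # "computer" (count 2) beats "computers" (count 1)
print(correct("conpuler"))  # two replacements away from "computer"
print(correct("teh"))       # one transposition away from "the"
```

The "computes" case shows the frequency model at work: both "computer" and "computers" are one edit away, and the more common word in the corpus is chosen.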
  • 32. Comparing Python & Java Versions • http://raelcunha.com/spell-correct.php • 35 lines of Java
  • 33.
      import java.io.*;
      import java.util.*;
      import java.util.regex.*;

      class Spelling {

          private final HashMap<String, Integer> nWords = new HashMap<String, Integer>();

          public Spelling(String file) throws IOException {
              BufferedReader in = new BufferedReader(new FileReader(file));
              Pattern p = Pattern.compile("\\w+");
              for(String temp = ""; temp != null; temp = in.readLine()){
                  Matcher m = p.matcher(temp.toLowerCase());
                  while(m.find()) nWords.put((temp = m.group()), nWords.containsKey(temp) ? nWords.get(temp) + 1 : 1);
              }
              in.close();
          }

          private final ArrayList<String> edits(String word) {
              ArrayList<String> result = new ArrayList<String>();
              for(int i=0; i < word.length(); ++i) result.add(word.substring(0, i) + word.substring(i+1));
              for(int i=0; i < word.length()-1; ++i) result.add(word.substring(0, i) + word.substring(i+1, i+2) + word.substring(i, i+1) + word.substring(i+2));
              for(int i=0; i < word.length(); ++i) for(char c='a'; c <= 'z'; ++c) result.add(word.substring(0, i) + String.valueOf(c) + word.substring(i+1));
              for(int i=0; i <= word.length(); ++i) for(char c='a'; c <= 'z'; ++c) result.add(word.substring(0, i) + String.valueOf(c) + word.substring(i));
              return result;
          }

          public final String correct(String word) {
              if(nWords.containsKey(word)) return word;
              ArrayList<String> list = edits(word);
              HashMap<Integer, String> candidates = new HashMap<Integer, String>();
              for(String s : list) if(nWords.containsKey(s)) candidates.put(nWords.get(s),s);
              if(candidates.size() > 0) return candidates.get(Collections.max(candidates.keySet()));
              for(String s : list) for(String w : edits(s)) if(nWords.containsKey(w)) candidates.put(nWords.get(w),w);
              return candidates.size() > 0 ? candidates.get(Collections.max(candidates.keySet())) : word;
          }

          public static void main(String args[]) throws IOException {
              if(args.length > 0) System.out.println((new Spelling("big.txt")).correct(args[0]));
          }

      }
  • 35. IDE for Python • IDEs for Python include: • PyDev for Eclipse • WingIDE • IDLE for Windows/Linux/Mac • and more
  • 36. Why Python ROCKS • Elegant and readable language - “Executable Pseudocode” • Standard Libraries - “Batteries Included” • Very high-level datatypes • Dynamically typed • It’s FUN!
  • 37. An Open Source Community • Projects: Plone, Zope, Grok, BFG, Django, SciPy & NumPy, Google App Engine, PyGame • PyCon
  • 38. Resources • PyGTA • Toronto Plone Users • Toronto Django Users • Stack Overflow • Dive into Python • Python Tutorial
  • 39. Thanks • I’d love to hear your questions or comments on this presentation. Reach me at: • jbb@scryent.com • http://twitter.com/hexsprite