Processing data with Python
using standard library modules you
(probably) never knew about

PyCon AU 2012 Tutorial
Graeme Cross
This is an introduction to…
•   Python's data containers
•   Choosing the best container for the job
•   Iterators and comprehensions
•   Useful modules for working with data
I am assuming…
• You know how to program (at least a bit!)
• Some familiarity with Python basics
• Python 2.7 (but not Python 3)
  – Note: the examples work with Python 2.7
  – Some won't work with Python 2.6
Along the way, we will…
•   Check out the AFL ladder
•   Count yearly winners of the Sheffield Shield
•   Parlez-vous Français?
•   Find the suburb that uses the most water
•   Use set theory on the X-Men
•   Analyse Olympic medal results
The examples and these notes
• We will use a number of demo scripts
  – This is an interactive tutorial
  – If you have a laptop, join in!
  – We will use ipython for the demos
• Code is on USB sticks being passed around
• Code & these notes are also at:
  http://www.curiousvenn.com/
ipython
• A very flexible Python shell
• Works on all platforms
• Lots of really nice features:
  – Command line editing with history
  – Coloured syntax
  – Edit and run within the shell
  – Web notebooks, Visual Studio, Qt, …
  – %lsmagic
• http://ipython.org/
  http://ipython.org/presentation.html
Interlude 0
Simple Python data types
• Numbers
  answer = 42
  pi = 3.141592654

• Boolean
  exit_flag = True
  winning = False

• Strings
  my_name = "Malcolm Reynolds"
Python data containers
• Can hold more than one object
• Have a consistent interface between
  different types of containers
• Fundamental to working in Python
Python sequences
•   A container that is ordered by position
•   Store and access objects by position
•   Python has 7 fundamental sequence types
•   Examples include:
    – string
    – list
    – tuple
Fundamental sequence operations
Name             Purpose
x in s           True if the object x is in the sequence s
x not in s       True if the object x is NOT in the sequence s
s + t            Concatenate two sequences
s * n            Concatenate sequence s n times
s[i]             Return the object in sequence s at position i
s[i:j]           Return a slice of objects in the sequence s from position i
                 up to (but not including) position j
s[i:j:k]         Return the same slice, taking every k-th object (k is an
                 optional step value)
len(s)           Return the number of objects in the sequence s

s.index(x)       Return the index of the first occurrence of x in the
                 sequence s
s.count(x)       Return the number of occurrences of x in the sequence s

min(s), max(s)   Return smallest or largest element in sequence s
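
The operations in the table can be tried on any sequence. The following is a minimal sketch using two made-up lists; the variable names are illustrative and do not come from the tutorial's demo scripts. It should run unchanged on Python 2.7 and Python 3.

```python
# Illustrative data only; strings and tuples support the same operations.
s = [3, 1, 4, 1, 5]
t = [9, 2, 6]

has_four = 4 in s            # membership test
no_seven = 7 not in s        # negated membership test
joined = s + t               # concatenation builds a new sequence
repeated = t * 2             # repetition
first = s[0]                 # positions start at 0
middle = s[1:4]              # positions 1, 2 and 3 (4 is excluded)
every_other = s[0:5:2]       # same idea with a step of 2
length = len(s)
where_one = s.index(1)       # position of the first occurrence
ones = s.count(1)            # number of occurrences
smallest, largest = min(s), max(s)
```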
The string type
• A string is a sequence of characters
• Strings are delimited by quotes (' or ")

  my_name = "Buffy"
  her_name = 'Dawn'


• Strings are immutable
  – You can't assign to a position
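
A short sketch of what immutability means in practice: assigning to a position raises TypeError, so the usual approach is to build a new string. The replacement name is invented for illustration.

```python
my_name = "Buffy"

try:
    my_name[0] = "P"          # strings do not support item assignment
    assigned = True
except TypeError:
    assigned = False

# Build a new string instead; the original is left untouched.
new_name = "P" + my_name[1:]
```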
string positions
• The sequence starts at position 0
• Use location[n] to access the character at
  position n from the start, first = 0
• Use location[-n] to access the character
  that is at position n from the end, last = -1
string position example
• A string with 26 characters
• Positions index from 0 to 25

  location = "Arkham Asylum, Gotham City"
  location[0] → 'A'
  location[15] → 'G'
  location[25] or location[-1] → 'y'
  location[-4] → 'C'
string position slices
• You can return a substring by slicing with
  two positions
• For example, return the string from position
  15 up to (but not including) position 21

  location = "Arkham Asylum, Gotham City"
  location[15] → 'G'
  location[20] → 'm'
  location[15:21] → 'Gotham'
Useful string functions & methods
Name              Purpose
len(s)            Calculate the length of the string s
+                 Add two strings together
*                 Repeat a string
s.find(x)         Find the first position of x in the string s
s.count(x)        Count the number of times x is in the string s
s.upper()         Return a new string that is all uppercase
s.lower()         Return a new string that is all lowercase
s.replace(x, y)   Return a new string that has replaced the substring
                  x with the new substring y
s.strip()         Return a new string with whitespace stripped from
                  the ends
s.format()        Format a string's contents
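
The demo script string1.py is not reproduced in these notes, so the following is a stand-in sketch of the functions and methods in the table, reusing the location string from the earlier slides. The variable names and the replacement text are assumptions made for illustration.

```python
raw = "  Arkham Asylum, Gotham City  "

location = raw.strip()                        # whitespace removed from both ends
length = len(location)
gotham_at = location.find("Gotham")           # position of the first match
missing = location.find("Metropolis")         # find() returns -1 when absent
m_count = location.count("m")
shouted = location.upper()
whispered = location.lower()
moved = location.replace("Gotham", "Central") # returns a new string
banner = "=" * 5 + " " + location             # repetition and concatenation
summary = "{0} has {1} characters".format(location, length)
```

Every method returns a new string; raw and location are never modified.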
Some string demos
Example: string1.py
Iterating through a string
• You can visit each character in a string
• This process is called iteration
• The 'for' keyword is used to step or iterate
  through a sequence
• Example: string2.py
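
A minimal sketch of string iteration (a stand-in, since string2.py is not included here). The second loop uses enumerate(), an inbuilt function covered later in the tutorial, to get the position alongside each character.

```python
location = "Gotham"

# Visit each character in turn.
letters = []
for character in location:
    letters.append(character)

# enumerate() supplies the position as well as the character.
positions = []
for index, character in enumerate(location):
    positions.append((index, character))
```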
From here…
• We didn't mention:
  – Unicode
  – The "is" methods, such as islower()
  – Splitting and joining strings – that's coming!
  – Different ways of formatting strings
  – Strings from files, internet data, XML, JSON,…
• The string module
• The re module
  – Regular expression pattern matching
Interlude 1
list
• A simple sequence of objects
• Just like a string…
  – All the sequence operations you have already learnt work
• Except…
  – It is mutable (you can change the contents)
  – Not just for characters
• Objects separated by commas
• Wrapped inside square brackets
Useful list functions & methods
 Name             Purpose
 len(x)           Calculate the length of the list x
 x.append(y)      Add the object y to the end of the list x
 x.extend(y)      Extend the list x with the contents of the list y
 x.insert(n, y)   Insert object y into the list x before position n
 x.pop()          Remove and return the last object in the list
 x.pop(n)         Remove and return the object at position n
 x.count(y)       Count the number of times object y is in the list
 x.sort()         Sorts the list x in-place
 sorted(x)        Returns sorted version of x (does not change x)
 x.reverse()      Reverses the list x in-place
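
The scripts list1.py to list4.py are not reproduced here; the following stand-in sketch walks through the methods in the table using an invented list of names.

```python
crew = ["Mal", "Zoe", "Wash"]

crew.append("Kaylee")              # add one object to the end
crew.extend(["Jayne", "Inara"])    # add every object from another list
crew.insert(1, "River")            # insert before position 1
last = crew.pop()                  # no argument: removes the last object
first = crew.pop(0)                # pop(n): removes the object at position n
zoe_count = crew.count("Zoe")

alphabetical = sorted(crew)        # new sorted list; crew is unchanged
before_sort = list(crew)           # keep a copy to compare against
crew.sort()                        # sorts crew itself and returns None
crew.reverse()                     # reverses crew itself
```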
Some list demos
Examples: list1.py → list4.py
list iteration
• Same concept as string iteration
• Example: list5.py
list of lists
• A list can hold any other object
• This includes other lists
• Example: lol.py
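
As a sketch of the idea (lol.py itself is not shown in these notes), a small grid can be stored as a list of rows. The grid values are made up.

```python
# A hypothetical 3x3 grid stored as a list of rows.
grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

centre = grid[1][1]          # first index picks the row, second the column
first_row = grid[0]

row_totals = []
for row in grid:             # each item of the outer list is itself a list
    row_totals.append(sum(row))
```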
list comprehensions
• A compact way to create a list
• Build the list by applying an expression to
  each element in a sequence
• Can contain logic and function calls
• Uses square brackets (list)
• Syntax:

[output_func(var) for var in input_list if predicate]
list comprehension advantages
• Focus is on the logic, not loop mechanics
• Less code, more readable code
• Can nest comprehensions, keeping logic in a
  single place
• Easier to map algorithm to working code
• Widely used in Python (and many other
  languages, especially functional languages)
list comprehension examples
• Some simple examples:

 single_digits = [n for n in range(10)]
 squares = [x*x for x in range(10)]
 even_sq = [i*i for i in squares if i%2 == 0]

• Example: comp1.py
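
To make the comparison with loop mechanics concrete, here is a sketch that builds the same list twice, once with an explicit loop and once with a comprehension, followed by a nested comprehension. The names are illustrative and not taken from comp1.py.

```python
# Loop version: the append mechanics are spelled out.
even_squares_loop = []
for n in range(10):
    if n % 2 == 0:
        even_squares_loop.append(n * n)

# Comprehension version: expression, then source, then predicate.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# A nested comprehension that flattens a list of lists.
rows = [[1, 2], [3, 4], [5, 6]]
flat = [value for row in rows for value in row]
```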
list traps
• Copying → actually copying references
• Passing lists in to functions/methods
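
Both traps come from the same cause: names and arguments refer to the list object rather than holding a copy of it. A minimal sketch, with an invented helper function add_marker():

```python
import copy

original = [1, 2, [3, 4]]

alias = original                 # a second name for the same list
shallow = original[:]            # new outer list; the inner list is shared
deep = copy.deepcopy(original)   # fully independent copy

original.append(5)               # visible through alias only
original[2].append(99)           # visible through alias and shallow


def add_marker(items):
    """Hypothetical helper: it changes the caller's list in place."""
    items.append("marker")


shared = []
add_marker(shared)               # shared is modified by the call
```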
Interlude 2
dictionary
• A container that maps a key to a value
• The key: any object that is hashable
    – It's the hash that is the “real” key
• The value: any object
    – Numbers, strings, lists, other dictionaries, ...
•   Perfect for look-up tables
•   It is not a sequence!
•   It is not ordered
•   Fundamental Python concept and type
Working with dictionaries
• Getting values and testing for keys:
  – my_dict[key]
  – my_dict.get(key)
  – my_dict.get(key, default)
  – my_dict.has_key(key) (Python 2 only; prefer "key in my_dict")
  – key in my_dict
  – my_dict.keys()
  – my_dict.values()
  – my_dict.items()
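
A short sketch of these lookups on a made-up dictionary. The results of keys(), values() and items() are sorted or summed here because a Python 2.7 dictionary has no guaranteed order.

```python
stock = {"apples": 3, "pears": 7}      # invented data

apples = stock["apples"]               # raises KeyError if the key is missing
plums = stock.get("plums")             # None when the key is missing
plums_or_zero = stock.get("plums", 0)  # supplied default instead of None
has_pears = "pears" in stock           # works on Python 2 and Python 3
# stock.has_key("pears") gives the same answer on Python 2 only.

names = sorted(stock.keys())
total = sum(stock.values())
pairs = sorted(stock.items())
```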
Iterating over dictionaries
• Can iterate via keys, values or pairs of (key,
  value)
• Simple example: dict1.py
• Detailed example: dict2.py
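
The three styles of iteration can be sketched as follows. The suburb names and water figures are invented for illustration and are not the data used in dict1.py or dict2.py.

```python
usage = {"Suburb A": 120, "Suburb B": 95, "Suburb C": 140}  # made-up figures

# Iterating over a dictionary directly yields its keys.
suburbs = []
for suburb in usage:
    suburbs.append(suburb)

# Iterating over the values only.
total = 0
for litres in usage.values():
    total += litres

# Iterating over (key, value) pairs to find the largest value.
thirstiest, most = None, -1
for suburb, litres in usage.items():
    if litres > most:
        thirstiest, most = suburb, litres
```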
From here with dictionaries
• Lots that we didn't mention!
• Python docs have a good overview of:
  – Methods
  – Views
• “Learning Python” and “Hello Python”
  – Detailed coverage of dictionaries
Interlude 3
tuple
• Immutable, ordered sequence
• Tuples are basically immutable lists
• Delimited by round brackets and separated
  by commas
  years = (1940, 1945, 1967, 1968, 1970)


• Remember to use a comma with a tuple
  containing a single object
  numbers = (1.23,)
tuple packing
• Tuples can be packed and unpacked
  to/from other objects:
  x, y, z = (640, 480, 1.2)
  my_point = x, y, z
tuple immutability
• The tuple itself is immutable
• But the objects inside it may be mutable!
• Example: tuple1.py
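
A minimal sketch of this distinction (a stand-in for tuple1.py, which is not shown here): the tuple refuses assignment, but a list stored inside it can still change.

```python
point = (640, 480, ["red", "green"])

try:
    point[0] = 800            # the tuple itself cannot be changed
    changed = True
except TypeError:
    changed = False

point[2].append("blue")       # the list inside the tuple is still mutable
```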
tuples: why bother?
• A limited, read-only kind of list
• Fewer methods than lists
• But:
  – can pass data around with (more) confidence
  – can use (hashable) tuples as dictionary keys
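
As an illustration of the last point, a sketch that uses (row, column) tuples as dictionary keys. The board contents are invented. A tuple that contains a list is not hashable, which is why the slide says "(hashable) tuples".

```python
board = {(0, 0): "rook", (0, 1): "knight"}   # (row, column) -> piece

piece = board[(0, 1)]

try:
    board[(0, [2])] = "bishop"   # a tuple holding a list cannot be hashed
    accepted = True
except TypeError:
    accepted = False
```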
collections.namedtuple
• Remembering the order of data in a tuple
  can be tricky
• collections.namedtuple names each field
  in a tuple
• More readable than a normal tuple
• Less setup than a class for just data
• Similar to a C struct
• Example: namedtuple1.py
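
A minimal sketch of namedtuple (namedtuple1.py is not reproduced here). The Medals type, its field names and the row of data are all hypothetical.

```python
from collections import namedtuple

# Hypothetical record type for one row of a medal table.
Medals = namedtuple("Medals", ["country", "gold", "silver", "bronze"])

row = Medals("Examplia", 3, 2, 1)              # made-up data

total = row.gold + row.silver + row.bronze     # access fields by name
by_position = row[1]                           # positions still work
country, gold, silver, bronze = row            # and so does unpacking
updated = row._replace(gold=4)                 # returns a new tuple
```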
set
• An unordered, mutable collection
• No duplicate entries
• Perfect for logic and set queries
  – eg. union and intersection tests
• Is not a sequence
  – No indexing, slices, etc
• Can convert to/from sequences
• Example: set1.py
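
A short sketch of set queries in the spirit of the X-Men example (set1.py is not shown here). The team groupings below are invented for illustration.

```python
# Hypothetical team rosters.
blue_team = {"Cyclops", "Storm", "Wolverine"}
gold_team = {"Wolverine", "Rogue", "Storm"}

everyone = blue_team | gold_team       # union
on_both = blue_team & gold_team        # intersection
blue_only = blue_team - gold_team      # difference

# Converting a list to a set drops duplicates; sorted() returns a list again.
roll_call = ["Storm", "Rogue", "Storm", "Cyclops", "Rogue"]
unique_names = sorted(set(roll_call))
```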
frozenset
• Same as set except it is immutable
• So can be used as a key for dictionaries
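
A sketch of a frozenset used as a dictionary key, where the order of the members should not matter. The pairing and the score are made up.

```python
# Key each score by an unordered pair of names.
pair_scores = {frozenset(["Storm", "Rogue"]): 7}

# The same members in a different order give the same key.
score = pair_scores[frozenset(["Rogue", "Storm"])]

# A frozenset has no methods that modify it.
can_add = hasattr(frozenset(), "add")
```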
Some less used collections
• array         Fast, single type lists
  – bytearray Create an array of bytes
  – If you are using array, check out numpy
  – http://numpy.scipy.org/


• buffer        Working with raw bytes
• Queue         Sharing data between threads
• struct        Convert between Python & C
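
A brief sketch of two of these modules, array and struct. The values and the format string are chosen for illustration only.

```python
from array import array
import struct

# array: every item must match the typecode ("i" is a signed int).
samples = array("i", [1, 2, 3])
samples.append(4)

try:
    samples.append(1.5)            # a float is rejected by an "i" array
    accepted = True
except TypeError:
    accepted = False

# struct: pack two unsigned 16-bit values, little-endian, into raw bytes.
packed = struct.pack("<HH", 640, 480)
width, height = struct.unpack("<HH", packed)
```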
Interlude 4
Useful inbuilt functions
• Let's play with these sequence functions:
  – enumerate()
  – min() and max()
  – range() and xrange()
  – reversed()
  – sorted()
  – sum()
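
A quick sketch of these functions on a made-up list of scores. xrange() is left out because it exists only in Python 2; the rest should behave the same on Python 2.7 and Python 3.

```python
scores = [72, 95, 58, 81]                 # invented data

numbered = list(enumerate(scores))        # (position, value) pairs
lowest, highest = min(scores), max(scores)
first_three = list(range(3))
backwards = list(reversed(scores))        # reversed() returns an iterator
ordered = sorted(scores)                  # new list; scores is unchanged
total = sum(scores)
```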
Functional functions
• A number of very useful functions to
  process sequences:
  – all()
  – any()
  – filter()
  – map()
  – reduce()
  – zip()
• Example: func1.py
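
A stand-in sketch for func1.py, which is not reproduced here. reduce() is imported from functools so that the same code runs on Python 2.7 (where it is also a built-in) and Python 3; filter(), map() and zip() are wrapped in list() because they return iterators on Python 3.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

all_positive = all(n > 0 for n in numbers)      # True if every test passes
any_large = any(n > 4 for n in numbers)         # True if at least one passes
evens = list(filter(lambda n: n % 2 == 0, numbers))
doubled = list(map(lambda n: n * 2, numbers))
product = reduce(lambda a, b: a * b, numbers)   # ((((1*2)*3)*4)*5)
paired = list(zip(numbers, "abcde"))            # pairs items by position
```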
The collections module
• Counter
  – More elegant than using a dictionary and
    manually counting
  – Example: counter1.py
• namedtuple
  – See the previous example
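
To show why Counter is the more elegant option, this sketch counts the same made-up list of winners twice: first with a plain dictionary, then with Counter. The data is invented and is not the Sheffield Shield data used in counter1.py.

```python
from collections import Counter

winners = ["NSW", "VIC", "NSW", "QLD", "NSW", "VIC"]   # made-up sequence

# Manual counting with a dictionary.
manual = {}
for state in winners:
    manual[state] = manual.get(state, 0) + 1

# The same count with Counter.
tally = Counter(winners)
top = tally.most_common(1)     # list of (key, count), highest count first
absent = tally["WA"]           # missing keys count as 0, no KeyError
```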
The itertools module
• Very useful standard library module
• Ideal for fast, memory efficient iteration
• http://www.doughellmann.com/PyMOTW/itertools/
• Highlights:
  – izip() → memory efficient version of zip()
  – imap() & starmap() → memory efficient and
    flexible versions of map()
  – count() → iterator for counting
  – dropwhile() & takewhile() → processing lists
    until condition is reached
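
A sketch of some of these highlights on invented data. izip() and imap() are omitted because they exist only in Python 2 (on Python 3 the built-in zip() and map() are already lazy); the functions used here are available on both.

```python
from itertools import count, dropwhile, starmap, takewhile

readings = [1, 3, 5, 8, 2, 9]                   # made-up data

# takewhile() stops at the first item that fails the test.
leading_small = list(takewhile(lambda n: n < 6, readings))

# dropwhile() skips items until the test first fails, then yields the rest.
rest = list(dropwhile(lambda n: n < 6, readings))

# starmap() unpacks each tuple into the function's arguments.
powers = list(starmap(pow, [(2, 3), (3, 2), (10, 2)]))

# count() is an endless counter, so take only what is needed.
counter = count(10)
first_ids = [next(counter) for _ in range(3)]
```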
The pprint module
•   Pretty-print complex data structures
•   Flexible and customisable
•   Uses __repr__()
•   pprint()
•   pformat()
•   Example: pprint1.py
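
A minimal sketch of pprint() and pformat() on a nested structure (pprint1.py is not shown in these notes). The ladder data is invented, and width=40 is an arbitrary choice that forces the output onto several lines.

```python
from pprint import pformat, pprint

# Made-up nested data.
ladder = {
    "round": 12,
    "teams": [
        {"name": "Team A", "wins": 10, "losses": 2},
        {"name": "Team B", "wins": 7, "losses": 5},
    ],
}

pprint(ladder, width=40)             # prints the formatted structure
text = pformat(ladder, width=40)     # returns the same text as a string
line_count = len(text.splitlines())
```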
Interlude 5
Containers redux
•   list: for simple position-based storage
•   tuple: read-only lists
•   dict: when you need access by key
•   Counter: for counting keys
•   namedtuple: tuple < namedtuple < class
•   array: lists with speed
•   set: for sets! :)
For more information...
• The Python tutorial
  – http://python.org/
• Python Module of the Week
  – http://www.doughellmann.com/PyMOTW/
Some good books…
• “Learning Python”, Mark Lutz
• “Hello Python”, Anthony Briggs

Processing data with Python, using standard library modules you (probably) never knew about

  • 1. Processing data with Python using standard library modules you (probably) never knew about PyCon AU 2012 Tutorial Graeme Cross
  • 2. This is an introduction to… • Python's data containers • Choosing the best container for the job • Iterators and comprehensions • Useful modules for working with data
  • 3. I am assuming… • You know how to program (at least a bit!) • Some familiarity with Python basics • Python 2.7 (but not Python 3) – Note: the examples work with Python 2.7 – Some won't work with Python 2.6
  • 4. Along the way, we will… • Check out the AFL ladder • Count yearly winners of the Sheffield Shield • Parlez-vous Français? • Find the suburb that uses the most water • Use set theory on the X-Men • Analyse Olympic medal results
  • 5. The examples and these notes • We will use a number of demo scripts – This is an interactive tutorial – If you have a laptop, join in! – We will use ipython for the demos • Code is on USB sticks being passed around • Code & these notes are also at: http://www.curiousvenn.com/
  • 6. ipython • A very flexible Python shell • Works on all platforms • Lots of really nice features: – Command line editing with history – Coloured syntax – Edit and run within the shell – Web notebooks, Visual Studio, Qt, … – %lsmagic • http://ipython.org/ http://ipython.org/presentation.html
  • 8. Simple Python data types • Numbers answer = 42 pi = 3.141592654 • Boolean exit_flag = True winning = False • Strings my_name = "Malcolm Reynolds"
  • 9. Python data containers • Can hold more than one object • Have a consistent interface between different types of containers • Fundamental to working in Python
  • 10. Python sequences • A container that is ordered by position • Store and access objects by position • Python has 7 fundamental sequence types • Examples include: – string – list – tuple
  • 11. Fundamental sequence opreations Name Purpose x in s True if the object x is in the sequence s x not in s True if the object x is NOT in the sequence s s + t Concatenate two sequences s * n Concatentate sequence s n times s[i] Return the object in sequence s at position i s[i:j] Return a slice of objects in the sequence s from position i s[i:j:k] to j Can have an optional step value, k len(s) Return the number of objects in the sequence s s.index(x) Return the index of the first occurrence of x in the sequence s s.count(x) Return the number of occurrences of x in the sequence s min(s), max(s) Return smallest or largest element in sequence s
  • 12. The string type • A string is a sequence of characters • Strings are delimited by quotes (' or ") my_name = "Buffy" her_name = 'Dawn' • Strings are immutable – You can't assign to a position
  • 13. string positions • The sequence starts at position 0 • Use location[n] to access the character at position n from the start, first = 0 • Use location[-n] to access the character that is at position n from the end, last = -1
  • 14. string position example • A string with 26 characters • Positions index from 0 to 25 location = "Arkham Asylum, Gotham City" location[0] → 'A' location[15] → 'G' location[25] or location[-1] → 'y' location[-4] → 'C'
  • 15. string position slices • You can return a substring by slicing with two positions • For example, return the string from position 15 up to (but not including) position 21 location = "Arkham Asylum, Gotham City" location[15] → 'G' location[20] → 'm' location[15:21] → 'Gotham'
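A few more slices on the same string, as a minimal sketch. The variable names other than location are assumptions added for illustration. The last part shows what "immutable" means in practice: assigning to a position fails.

```python
location = "Arkham Asylum, Gotham City"

city = location[15:21]       # positions 15 to 20
asylum = location[:6]        # omitted start means "from position 0"
last_word = location[-4:]    # omitted end means "to the end"
backwards = location[::-1]   # a step of -1 walks the string in reverse

# Strings are immutable, so assigning to a position raises TypeError.
try:
    location[0] = "a"
    assignment_failed = False
except TypeError:
    assignment_failed = True
```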
  • 16. Useful string functions & methods • len(s): Calculate the length of the string s • +: Add two strings together • *: Repeat a string • s.find(x): Find the first position of x in the string s • s.count(x): Count the number of times x is in the string s • s.upper() or s.lower(): Return a new string that is all uppercase or all lowercase • s.replace(x, y): Return a new string that has replaced the substring x with the new substring y • s.strip(): Return a new string with whitespace stripped from the ends • s.format(): Format a string's contents
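The string methods above, applied to one sample string. This is a sketch with made-up sample text; note that every method returns a new string and leaves the original untouched.

```python
raw = "  Hello, Sunnydale  "      # illustrative sample text

trimmed = raw.strip()             # whitespace removed from both ends
position = trimmed.find("Sunny")  # index of the first match
not_found = trimmed.find("xyz")   # find() returns -1 when there is no match
l_count = trimmed.count("l")
shout = trimmed.upper()
swapped = trimmed.replace("Hello", "Goodbye")
report = "{0} has {1} characters".format(trimmed, len(trimmed))
```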
  • 18. Iterating through a string • You can visit each character in a string • This process is called iteration • The 'for' keyword is used to step or iterate through a sequence • Example: string2.py
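A minimal iteration sketch follows. It is not the contents of string2.py, which is not reproduced in these notes; the names are assumptions.

```python
name = "Buffy"

# The for loop hands us one character at a time, in order.
seen = []
for ch in name:
    seen.append(ch.upper())
```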
  • 19. From here… • We didn't mention: – Unicode – The "is" methods, such as islower() – Splitting and joining strings – that's coming! – Different ways of formatting strings – Strings from files, internet data, XML, JSON,… • The string module • The re module – Regular expression pattern matching
  • 21. list • A simple sequence of objects • Just like a string… – All the methods you have already learnt work • Except… – It is mutable (you can change the contents) – Not just for characters • Objects separated by commas • Wrapped inside square brackets
  • 22. Useful list functions & methods • len(x): Calculate the length of the list x • x.append(y): Add the object y to the end of the list x • x.extend(y): Extend the list x with the contents of the list y • x.insert(n, y): Insert object y into the list x before position n • x.pop(): Remove and return the last object in the list • x.pop(n): Remove and return the object at position n • x.count(y): Count the number of times object y is in the list • x.sort(): Sorts the list x in-place • sorted(x): Returns a sorted version of x (does not change x) • x.reverse(): Reverses the list x in-place
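A short walk through the list methods, as a sketch with illustrative names (this is not one of the list1.py to list4.py demos). Comments show the list after each step.

```python
crew = ["Mal", "Zoe"]

crew.append("Wash")                # ["Mal", "Zoe", "Wash"]
crew.extend(["Kaylee", "Jayne"])   # adds each item of the other list
crew.insert(1, "Inara")            # goes in before position 1

last = crew.pop()                  # no argument: removes the LAST item
first = crew.pop(0)                # removes the item at position 0

ordered = sorted(crew)             # new list; crew itself is unchanged
unchanged = list(crew)             # remember the current order

crew.reverse()                     # in-place, returns None
reversed_crew = list(crew)
crew.sort()                        # in-place, returns None
```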
  • 23. Some list demos Examples: list1.py → list4.py
  • 24. list iteration • Same concept as string iteration • Example: list5.py
  • 25. list of lists • A list can hold any other object • This includes other lists • Example: lol.py
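A sketch of a list of lists, using a small made-up grid rather than the data in lol.py. Two positions are needed to reach an inner item: the first picks the inner list, the second picks the item inside it.

```python
grid = [[1, 2, 3],
        [4, 5, 6]]     # illustrative data

second_row = grid[1]
value = grid[1][2]     # row 1, then position 2 within that row

# Nested loops visit every item in every inner list.
flat = []
for row in grid:
    for item in row:
        flat.append(item)
```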
  • 26. list comprehensions • A compact way to create a list • Build the list by applying an expression to each element in a sequence • Can contain logic and function calls • Uses square brackets (list) • Syntax: [output_func(var) for var in input_list if predicate]
  • 27. list comprehension advantages • Focus is on the logic, not loop mechanics • Less code, more readable code • Can nest comprehensions, keeping logic in a single place • Easier to map algorithm to working code • Widely used in Python (and many other languages, especially functional languages)
  • 28. list comprehension examples • Some simple examples: single_digits = [n for n in range(10)] squares = [x*x for x in range(10)] even_sq = [i*i for i in squares if i%2 == 0] • Example: comp1.py
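To make the "expression, loop, predicate" parts easier to see, here is a sketch that puts a comprehension next to the equivalent loop. The word list is an illustrative assumption and this is not the content of comp1.py.

```python
words = ["tuple", "list", "set", "dictionary"]   # illustrative data

# Expression applied to every element.
lengths = [len(w) for w in words]

# Expression plus predicate: only words of four letters or fewer.
short_upper = [w.upper() for w in words if len(w) <= 4]

# The same result written as an explicit loop.
short_upper_loop = []
for w in words:
    if len(w) <= 4:
        short_upper_loop.append(w.upper())

# Two for clauses behave like nested loops (leftmost is the outer loop).
pairs = [(x, y) for x in range(2) for y in "ab"]
```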
  • 29. list traps • Copying → actually copying references • Passing lists in to functions/methods
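Both traps come from the same fact: assignment and argument passing share a reference to the list, they do not copy it. A minimal sketch, with hypothetical names (add_item is an illustrative helper, not from the tutorial code):

```python
import copy

original = [1, 2, 3]
alias = original          # a second name for the SAME list
snapshot = original[:]    # a shallow copy: a new list with the same items

alias.append(4)           # also visible through 'original'

def add_item(items, item):
    """Hypothetical helper: mutates the caller's list in place."""
    items.append(item)

add_item(original, 5)     # the caller's list changes too

# A shallow copy still shares any inner lists.
nested = [[0], [1]]
shallow = nested[:]
deep = copy.deepcopy(nested)   # copies the inner lists as well
shallow[0].append(99)          # shows up in 'nested', not in 'deep'
```

If a function should not change its argument, copying on the way in (for example with items[:]) is one simple defence.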
  • 31. dictionary • A container that maps a key to a value • The key: any object that is hashable – It's the hash that is the “real” key • The value: any object – Numbers, strings, lists, other dictionaries, ... • Perfect for look-up tables • It is not a sequence! • It is not ordered • Fundamental Python concept and type
  • 32. Working with dictionaries • Getting values and testing for keys: – my_dict[key] – my_dict.get(key) – my_dict.get(key, default) – my_dict.has_key(key) – key in my_dict – my_dict.keys() – my_dict.values() – my_dict.items()
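A sketch of the look-up styles listed above, using made-up data. has_key() is left out of the code because it only exists in Python 2; the in operator does the same job and works in both Python 2 and 3.

```python
stock = {"apples": 3, "pears": 0}    # illustrative data

apples = stock["apples"]             # raises KeyError if the key is missing
missing = stock.get("kiwis")         # returns None instead of raising
fallback = stock.get("kiwis", 0)     # returns the supplied default
has_pears = "pears" in stock         # key test

keys = sorted(stock.keys())          # sorted, because dict order is not guaranteed here
values = sorted(stock.values())

try:
    stock["kiwis"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
```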
  • 33. Iterating over dictionaries • Can iterate via keys, values or pairs of (key, value) • Simple example: dict1.py • Detailed example: dict2.py
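The three styles of dictionary iteration, as a sketch with made-up data (not the contents of dict1.py or dict2.py). Keys are sorted here so the output order is predictable.

```python
stock = {"apples": 3, "pears": 0}    # illustrative data

# By key: iterating over a dictionary yields its keys.
names = []
for key in sorted(stock):
    names.append(key)

# By value.
total = 0
for value in stock.values():
    total += value

# By (key, value) pair, unpacked in the for statement.
lines = []
for key, value in sorted(stock.items()):
    lines.append("{0}: {1}".format(key, value))
```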
  • 34. From here with dictionaries • Lots that we didn't mention! • Python docs have a good overview of: – Methods – Views • “Learning Python” and “Hello Python” – Detailed coverage of dictionaries
  • 36. tuple • Immutable, ordered sequence • Tuples are basically immutable lists • Delimited by round brackets and separated by commas years = (1940, 1945, 1967, 1968, 1970) • Remember to use a comma with a tuple containing a single object numbers = (1.23,)
  • 37. tuple packing • Tuples can be packed and unpacked to/from other objects: x, y, z = (640, 480, 1.2) my_point = x, y, z
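Packing and unpacking in action, as a small sketch. min_max is a hypothetical helper added for illustration; it shows the common pattern of returning several values as one tuple.

```python
# Unpacking: one name per item in the tuple.
x, y, z = (640, 480, 1.2)

# Packing: the commas build the tuple, brackets are optional.
my_point = x, y, z

# Swapping two names uses packing and unpacking in one statement.
a, b = 1, 2
a, b = b, a

def min_max(values):
    """Hypothetical helper: returns two results packed in a tuple."""
    return min(values), max(values)

low, high = min_max([3, 9, 7])
```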
  • 38. tuple immutability • The tuple may be immutable • But objects inside may not! • Example: tuple1.py
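A minimal sketch of this point (not the contents of tuple1.py): the tuple itself cannot be changed, but a list stored inside it can.

```python
record = (1, [2, 3])      # illustrative: a tuple holding a mutable list

# Replacing an item of the tuple is not allowed.
try:
    record[0] = 9
    replace_failed = False
except TypeError:
    replace_failed = True

# Changing the list that the tuple refers to is allowed.
record[1].append(4)
```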
  • 39. tuples: why bother? • A limited, read-only kind of list • Fewer methods than lists • But: – can pass data around with (more) confidence – can use (hashable) tuples as dictionary keys
  • 40. collections.namedtuple • Remembering the order of data in a tuple can be tricky • collections.namedtuple names each field in a tuple • More readable than a normal tuple • Less setup than a class for just data • Similar to a C struct • Example: namedtuple1.py
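A sketch of collections.namedtuple with a hypothetical Point type (namedtuple1.py may use different data). Fields can be read by name or by position, and the result is still an immutable tuple.

```python
from collections import namedtuple

# 'Point' and its field names are illustrative assumptions.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=640, y=480)
by_name = p.x                 # readable access by field name
by_position = p[1]            # still works like a normal tuple
px, py = p                    # and unpacks like one

wider = p._replace(x=800)     # returns a NEW Point; p is unchanged
as_dict = dict(p._asdict())   # field names mapped to values
```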
  • 41. set • An unordered, mutable collection • No duplicate entries • Perfect for logic and set queries – e.g. union and intersection tests • Is not a sequence – No indexing, slices, etc • Can convert to/from sequences • Example: set1.py
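A sketch of the set queries mentioned above, using small number sets instead of the data in set1.py. The set([...]) spelling is used because it works on every Python 2 and 3 version.

```python
evens = set([0, 2, 4, 6, 8])     # illustrative data
primes = set([2, 3, 5, 7])

either = evens | primes          # union
both = evens & primes            # intersection
only_even = evens - primes       # difference
is_member = 4 in evens           # fast membership test

# Converting a list to a set and back removes duplicates.
unique = sorted(set([3, 1, 3, 2, 1]))
```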
  • 42. frozenset • Same as set except it is immutable • So can be used as a key for dictionaries
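A sketch of why immutability matters here: a frozenset is hashable and so can be a dictionary key, while a plain set cannot. The data is illustrative.

```python
pair = frozenset(["a", "b"])
scores = {pair: 1}                        # frozenset used as a key

# Order does not matter when the key is rebuilt.
found = scores[frozenset(["b", "a"])]

# A mutable set is not hashable, so it is rejected as a key.
try:
    bad = {set(["a", "b"]): 1}
    set_key_failed = False
except TypeError:
    set_key_failed = True

# frozenset has no methods that change it.
can_add = hasattr(pair, "add")
```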
  • 43. Some less used collections • array: Fast, single type lists – bytearray: Create an array of bytes – If you are using array, check out numpy – http://numpy.scipy.org/ • buffer: Working with raw bytes • Queue: Sharing data between threads • struct: Convert between Python & C
  • 45. Useful inbuilt functions • Let's play with these sequence functions: – enumerate() – min() and max() – range() and xrange() – reversed() – sorted() – sum()
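A quick sketch of these functions on a small made-up list. xrange() is left out of the code because it only exists in Python 2; the list() calls are there so the results can be inspected on Python 3, where several of these return iterators.

```python
scores = [7, 3, 9]                      # illustrative data

numbered = list(enumerate(scores))      # (position, item) pairs
backwards = list(reversed(scores))      # iterator over items in reverse
descending = sorted(scores, reverse=True)
total = sum(scores)
lowest, highest = min(scores), max(scores)
positions = list(range(len(scores)))
```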
  • 46. Functional functions • A number of very useful functions to process sequences: – all() – any() – filter() – map() – reduce() – zip() • Example: func1.py
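A sketch of each function on made-up data (this is not func1.py). reduce() is imported from functools, which works on Python 2.6+ and is required on Python 3, where it is no longer a builtin.

```python
from functools import reduce

nums = [1, 2, 3, 4]                       # illustrative data

all_positive = all(n > 0 for n in nums)   # True only if every test passes
any_big = any(n > 3 for n in nums)        # True if at least one test passes
evens = list(filter(lambda n: n % 2 == 0, nums))
doubled = list(map(lambda n: n * 2, nums))
total = reduce(lambda a, b: a + b, nums)  # folds the list into one value
pairs = list(zip(nums, "abcd"))           # items paired up by position
```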
  • 47. The collections module • Counter – More elegant than using a dictionary and manually counting – Example: counter1.py • namedtuple – See the previous example
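To show what "more elegant" means, this sketch counts the same made-up items by hand with a dictionary and then with Counter (available from Python 2.7). It is not the contents of counter1.py.

```python
from collections import Counter

items = ["a", "b", "a", "c", "a", "b"]    # illustrative data

# Manual counting with a plain dictionary.
manual = {}
for item in items:
    manual[item] = manual.get(item, 0) + 1

# The same tally in one step.
tally = Counter(items)
top_two = tally.most_common(2)            # highest counts first
never_seen = tally["z"]                   # missing keys count as 0
```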
  • 48. The itertools module • Very useful standard library module • Ideal for fast, memory efficient iteration • http://www.doughellmann.com/PyMOTW/itertools/ • Highlights: – izip() → memory efficient version of zip() – imap() & starmap() → memory efficient and flexible versions of map() – count() → iterator for counting – dropwhile() & takewhile() → processing lists until condition is reached
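A sketch of some of the highlights, using made-up numbers. izip() and imap() are left out of the code because they exist only in Python 2; on Python 3 the builtin zip() and map() are already lazy. The functions shown here are available on both.

```python
from itertools import count, dropwhile, starmap, takewhile

readings = [1, 3, 5, 2, 7]                # illustrative data

# Take items until the condition first fails, then stop.
leading_small = list(takewhile(lambda n: n < 5, readings))

# Skip items until the condition first fails, then keep everything.
from_first_big = list(dropwhile(lambda n: n < 5, readings))

# starmap unpacks each tuple into the function's arguments.
powers = list(starmap(pow, [(2, 3), (3, 2)]))

# count() is an endless counter; take values with next().
ticket = count(10)
first_ticket = next(ticket)
second_ticket = next(ticket)
```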
  • 49. The pprint module • Pretty-print complex data structures • Flexible and customisable • Uses __repr__() • pprint() • pformat() • Example: pprint1.py
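A sketch of pformat() on a small made-up structure (not the data in pprint1.py). pformat() returns the formatted text as a string, while pprint() prints the same text directly.

```python
from pprint import pformat

# Illustrative nested data.
data = {"name": "list",
        "methods": ["append", "extend", "insert", "pop"]}

# A narrow width forces the structure onto several lines.
text = pformat(data, width=30)
line_count = len(text.splitlines())
```

Because the output is built from each object's repr, the text for simple data like this is also valid Python.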
  • 51. Containers redux • list: for simple position-based storage • tuple: read-only lists • dict: when you need access by key • Counter: for counting keys • namedtuple: tuple < namedtuple < class • array: lists with speed • set: for sets! :)
  • 52. For more information... • The Python tutorial – http://python.org/ • Python Module of the Week – http://www.doughellmann.com/PyMOTW/
  • 53. Some good books… • “Learning Python”, Mark Lutz • “Hello Python”, Anthony Briggs