Using functional concepts in Python. An introduction to functional programming that explores concepts such as map, filter and reduce in detail, and shows how functional programming can help create massively parallel software systems.
2. Agenda
★ Basics of Functional Programming
★ Functional Programming and Python
★ Everyday Functional Programming
★ Scaling with Functional Programming
★ Performance Implications
★ Data Explorations
3. House rules
★ Show of hands please
★ Ask anytime - Raise your hands
★ Be hands on
★ Free to use online resources
★ One step at a time
★ No-one left out policy
14. Myths of Functional Programming
★ It requires immutability/pureness
★ It requires an advanced type system
★ It is significantly less efficient
★ It makes you learn advanced math
★ You must give up all your imperative programming notions
★ Object orientation and functional paradigms are incompatible
★ Functional programs are easier to debug
★ Dynamic types are better than Static types
15. Functional methodology in Python
★ itertools and functools
★ Decorators and Generators
★ What Python CAN do
○ lazy eval, lambda, map, reduce, filter
★ What Python CANNOT do (pragmatically)
○ e.g. tail recursion, pure immutability, pure functions
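A minimal sketch of the tools listed above working together, assuming Python 3. The names squares and double are made up for illustration: a generator supplies lazy evaluation, itertools slices an infinite stream, and a functools decorator adds caching.

```python
from functools import lru_cache
from itertools import count, islice


def squares():
    """Generator: yields 0, 1, 4, 9, ... lazily, one value per request."""
    for n in count():              # count() is an infinite iterator: 0, 1, 2, ...
        yield n * n


@lru_cache(maxsize=None)           # functools decorator: caches results per argument
def double(x):
    return 2 * x


# islice pulls only the first five values; the generator never runs past them.
first_five = list(islice(squares(), 5))
doubled = list(map(double, first_five))

print(first_five)   # [0, 1, 4, 9, 16]
print(doubled)      # [0, 2, 8, 18, 32]
```

Nothing is computed until list() asks for values, which is why an infinite generator is safe here.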
22. Map - Explained
from urllib.request import urlopen  # Python 3 location of urlopen

urls = ['http://www.google.com',
        'http://www.wikipedia.com',
        'http://www.apple.com',
        'http://www.python.org']

# Open every URL, collecting the responses in result
result = []
for item in urls:
    result.append(urlopen(item))

def fib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    ...

# Apply fib to every integer, collecting the values in result
integers = [1, 2, 3, 4, 5]
result = []
for item in integers:
    result.append(fib(item))

?
23. Map - Explained
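# Both loops share one shape, which map captures
# (a simplified model of the built-in; it shadows map here for illustration)
def map(function, sequence):
    result = []
    for item in sequence:
        result.append(function(item))
    return result

html_texts = map(urlopen, urls)
fib_integers = map(fib, integers)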
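For comparison, a short sketch of the built-in map in Python 3, where it returns a lazy iterator rather than a list. The body of fib is elided on the slide, so the return statement below is an assumption about what it does.

```python
def fib(n):
    """Return the n-th Fibonacci number (assumed completion of the slide's fib)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


integers = [1, 2, 3, 4, 5]

lazy_result = map(fib, integers)          # built-in map: nothing is computed yet
fib_integers = list(lazy_result)          # list() forces the evaluation
same_thing = [fib(n) for n in integers]   # equivalent list comprehension

print(fib_integers)   # [1, 1, 2, 3, 5]
```

The list comprehension is often the more idiomatic spelling in Python; map reads well when the function already has a name.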
24. Lambda
count_lambda = lambda w: len(w)
print(list(map(count_lambda, 'It is raining cats and dogs'.split())))

# conditions in lambda
lambda x: True if x % 2 == 0 else False
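A predicate lambda like the one above is what filter expects. A small sketch, assuming Python 3 and reusing the sentence from the slide; the names is_even and long_words are illustrative.

```python
words = 'It is raining cats and dogs'.split()

is_even = lambda x: x % 2 == 0            # the comparison already yields a bool

lengths = list(map(len, words))
even_lengths = list(filter(is_even, lengths))
long_words = list(filter(lambda w: len(w) > 3, words))

print(lengths)        # [2, 2, 7, 4, 3, 4]
print(even_lengths)   # [2, 2, 4, 4]
print(long_words)     # ['raining', 'cats', 'dogs']
```

filter keeps the items for which the function returns a true value, so the explicit True/False branches are not needed.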
26. Reduce - Explained
# Sum of a list of numbers
def add(x, y):
    return x + y

def sum(data):
    result = 0
    for x in data:
        result = add(result, x)
    return result

sum([5, 2, 3])

# Smallest in a list
def lesser(x, y):
    if x < y:
        return x
    else:
        return y

def min(data):
    result = 999999999999
    for x in data:
        result = lesser(result, x)
    return result

min([5, 2, 3])

?
27. Reduce - Explained
from functools import reduce  # reduce is not a built-in in Python 3

data = [5, 2, 3]

# Sum
result = sum(data)
result = reduce(add, data, 0)

# Min
result = min(data)
result = reduce(lesser, data, 999999999999)
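A hedged variation on the same idea, assuming Python 3: the operator module supplies ready-made two-argument functions, and the initial value can be omitted when the input is known to be non-empty, which avoids the large sentinel number.

```python
from functools import reduce
import operator

data = [5, 2, 3]

total = reduce(operator.add, data, 0)                      # ((0 + 5) + 2) + 3
product = reduce(operator.mul, data, 1)                    # ((1 * 5) * 2) * 3
smallest = reduce(lambda x, y: x if x < y else y, data)    # no initial value needed

# With an initial value, an empty input simply returns that value.
empty_total = reduce(operator.add, [], 0)

# Without one, reduce has nothing to return and raises TypeError.
try:
    reduce(operator.add, [])
    raised = False
except TypeError:
    raised = True

print(total, product, smallest, empty_total, raised)   # 10 30 2 0 True
```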
29. iter function
numbers = [1, 2, 3]
it = iter(numbers)

# using while and StopIteration Exception
try:
    while True:
        print(next(it))
except StopIteration:
    print("Complete")

# as iterator in for loop
it = iter(numbers)
for value in it:
    print(value)
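Generators follow the same iterator protocol. A minimal sketch, assuming Python 3; countdown is a made-up example function.

```python
def countdown(start):
    """Generator function: each next() resumes the body until the next yield."""
    while start > 0:
        yield start
        start -= 1


it = countdown(3)
first = next(it)         # runs the body up to the first yield
rest = list(it)          # consumes whatever is left
done = next(it, None)    # a default value replaces the StopIteration exception

print(first, rest, done)   # 3 [2, 1] None
```

Once exhausted, an iterator stays exhausted; call countdown(3) again to start over.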
40. The Bad Parts
★ Memory Inefficiencies
★ Purity
★ No Tail Recursion
★ Innately imperative (Guido)
★ Class based type system
★ Only imperative Error Handling (Exception)
★ Function Overloading
★ Mutable variables
Python vs Functional
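The "No Tail Recursion" point can be seen in a short sketch, assuming CPython 3 with its recursion limit in effect. recursive_sum is a hypothetical example: it is tail-recursive in form, yet every call still adds a stack frame.

```python
import sys
from functools import reduce


def recursive_sum(numbers, index=0, total=0):
    """Tail-recursive in form, but CPython does not eliminate the tail call."""
    if index == len(numbers):
        return total
    return recursive_sum(numbers, index + 1, total + numbers[index])


# One element per allowed stack frame, plus a few more to cross the limit.
big = list(range(sys.getrecursionlimit() + 100))

try:
    recursive_sum(big)
    overflowed = False
except RecursionError:
    overflowed = True

# The same fold written with reduce runs in constant stack depth.
iterative_total = reduce(lambda acc, x: acc + x, big, 0)

print(overflowed)        # True
print(iterative_total)
```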
41. Thinking about Scalability with Functions
★ map-reduce-filter - recipe for distributed computing
★ shared state - to be or not to be
★ immutable 'variables'
★ independent functions
★ Execution Pipelines - chained map-reduce
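A sketch of such a pipeline, assuming Python 3. The chunks list is a hypothetical stand-in for input split across workers, and count_words and merge are illustrative names. Each map call depends only on its argument, and the reduce step builds new values instead of mutating shared state.

```python
from collections import Counter
from functools import reduce

# Hypothetical input, already split into independent chunks.
chunks = ["it is raining", "cats and dogs", "it is pouring"]


def count_words(chunk):
    """Map step: depends only on its argument, so it could run on any worker."""
    return Counter(chunk.split())


def merge(left, right):
    """Reduce step: Counter addition returns a new Counter, mutating neither input."""
    return left + right


partials = map(count_words, chunks)                  # map
totals = reduce(merge, partials, Counter())          # reduce
repeated = dict(filter(lambda kv: kv[1] > 1,         # filter
                       totals.items()))

print(repeated)   # {'it': 2, 'is': 2}
```

Because merge is associative, the partial counts could be combined in any grouping, which is what lets the work be distributed.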
42. Performance vs Scalability
★ Functional Programs vs Object Oriented Programs
★ CPU intensive processes vs I/O intensive processes
★ The curse of the GIL - workarounds
○ multiprocessing
★ Benchmarking
○ %timeit
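A hedged sketch of the I/O-bound case, assuming Python 3. fake_download is a made-up stand-in that sleeps instead of touching the network, and the timeit module is the standard-library counterpart of IPython's %timeit. Threads help here because waiting on I/O releases the GIL.

```python
import time
import timeit
from concurrent.futures import ThreadPoolExecutor

urls = ['http://www.google.com', 'http://www.wikipedia.com',
        'http://www.apple.com', 'http://www.python.org']


def fake_download(url):
    """Hypothetical stand-in for an I/O-bound call; sleeping releases the GIL."""
    time.sleep(0.05)
    return len(url)


def serial():
    return list(map(fake_download, urls))


def threaded():
    # Executor.map has the same shape as the built-in map, but runs concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fake_download, urls))


serial_seconds = timeit.timeit(serial, number=1)
threaded_seconds = timeit.timeit(threaded, number=1)

print(f"serial:   {serial_seconds:.2f}s")
print(f"threaded: {threaded_seconds:.2f}s")
```

For CPU-bound functions threads do not get around the GIL; multiprocessing.Pool offers a map method with the same call shape that runs in separate processes.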
45. why!
★ no side effects - no state, no deadlocks, no semaphores
★ automatic parallelization - unlimited scalability
★ composability - break problems down into smaller functions
★ testing - independent functions; well-defined arguments and return values
★ partial evaluation - pass around half-baked functions instead of objects
★ elegant code - forces you to write logically correct programs
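Partial evaluation and composability can be sketched with functools, assuming Python 3. power, compose and the other names are hypothetical helpers for illustration; compose is not part of the standard library.

```python
from functools import partial, reduce


def power(base, exponent):
    return base ** exponent


# Partial evaluation: fix one argument now, supply the rest later.
square = partial(power, exponent=2)
cube = partial(power, exponent=3)


def compose(*functions):
    """Hypothetical helper: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


increment = lambda x: x + 1
square_then_increment = compose(increment, square)

print(square(4))                   # 16
print(cube(2))                     # 8
print(square_then_increment(3))    # 10
```

Each piece is a small function with well-defined inputs and outputs, so each can be tested on its own.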
46. Hands On: Let’s do some Data Wrangling
Blockbuster Database (http://www.crowdflower.com/data-for-everyone)
➢ Which genre has most movies?
➢ Which movie studio grosses the most?
➢ Sort movies by highest gross and highest rating
demo code here
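One possible shape for answering these questions, assuming Python 3. The rows and field names below are made-up placeholders, not values or columns from the Blockbuster Database; real code would load the dataset first.

```python
from collections import Counter
from functools import reduce

# Placeholder rows with assumed field names; NOT real data from the dataset.
movies = [
    {"title": "Movie A", "genre": "Action", "studio": "Studio X", "gross": 300.0, "rating": 7.5},
    {"title": "Movie B", "genre": "Drama",  "studio": "Studio Y", "gross": 120.0, "rating": 8.2},
    {"title": "Movie C", "genre": "Action", "studio": "Studio Y", "gross": 250.0, "rating": 6.9},
    {"title": "Movie D", "genre": "Comedy", "studio": "Studio X", "gross": 90.0,  "rating": 7.5},
]

# Which genre has most movies? map extracts the genre, Counter tallies it.
top_genre = Counter(map(lambda m: m["genre"], movies)).most_common(1)[0][0]


def add_gross(totals, movie):
    """Return a new totals dict with this movie's gross added (no mutation)."""
    studio = movie["studio"]
    return {**totals, studio: totals.get(studio, 0.0) + movie["gross"]}


# Which studio grosses the most? reduce folds the rows into per-studio totals.
gross_by_studio = reduce(add_gross, movies, {})
top_studio = max(gross_by_studio, key=gross_by_studio.get)

# Sort by gross, then rating, highest first.
ranked = sorted(movies, key=lambda m: (m["gross"], m["rating"]), reverse=True)
ranked_titles = [m["title"] for m in ranked]

print(top_genre, top_studio, ranked_titles)
```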