Real World Haskell:
     Lecture 7

   Bryan O’Sullivan


     2009-12-09
Getting things done




   It’s great to dwell so much on purity, but we’d like to maybe use
   Haskell for practical programming some time.
   This leaves us concerned with talking to the outside world.
Word count


   import System.Environment (getArgs)
   import Control.Monad (forM_)

   countWords path = do
     content <- readFile path
     let numWords = length (words content)
     putStrLn (show numWords ++ " " ++ path)

   main = do
     args <- getArgs
     mapM_ countWords args
New notation!


   There was a lot to digest there. Let’s run through it all, from top
   to bottom.


   import System.Environment (getArgs)

   “Import only the thing named getArgs from
   System.Environment.”
   Without an explicit (comma separated) list of names to import,
   everything that a module exports is imported into this one.
The do block




   Notice that this function’s body starts with the keyword do:

   countWords path = do
     ...

   That keyword introduces a series of actions. Each action is
   somewhat similar to a statement in C or Python.
Executing an action and using its result



   The first line of our function’s body:

   countWords path = do
     content <- readFile path

   This performs the action “readFile path”, and binds the result
   to the name “content”.
   The special notation “<-” makes it clear that we are executing an
   action, i.e. not applying a pure function.
Applying a pure function



   We can use the let keyword inside a do block to name the result
   of a pure expression. Unlike an ordinary let, the code that follows
   does not need to start with an in keyword.

      let numWords = length (words content)
      putStrLn (show numWords ++ " " ++ path)

   With both let and <-, the result is immutable as usual, and stays
   in scope until the end of the do block.
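
   To make the split between let and <- concrete, here is a minimal
   sketch that pulls the pure part of the word count out into its own
   function. The names wordCount and report are made up for this
   illustration; they are not part of the lecture's program.

```haskell
-- Pure: no I/O involved, so its result can be named with let.
wordCount :: String -> Int
wordCount = length . words

-- An action: reading the file needs <-, counting the words needs let.
-- (report is a hypothetical helper that returns the line instead of printing it.)
report :: FilePath -> IO String
report path = do
  content <- readFile path        -- execute an action, keep its result
  let n = wordCount content       -- name a pure value; no "in" required
  return (show n ++ " " ++ path)
```

   In ghci, wordCount can be applied directly to a string, while report
   has to be run as an action before its result can be seen.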
Executing an action




   This line executes an action, and ignores its return value:

      putStrLn (show numWords ++ " " ++ path)
Compare and contrast

   Wonder how different imperative programming in Haskell is from
   other languages?


   def count_words(path):
       content = open(path).read()
       num_words = len(content.split())
       print(repr(num_words) + " " + path)


   countWords path = do
       content <- readFile path
       let numWords = length (words content)
       putStrLn (show numWords ++ " " ++ path)
A few handy rules




   When you want to introduce a new name inside a do block:
       Use name <- action to perform an action and keep its result.
       Use let name = expression to evaluate a pure expression, and
       omit the in.
More adventures with ghci



   If we load our source file into ghci, we get an interesting type
   signature:

   *Main> :type countWords
   countWords :: FilePath -> IO ()


   See the result type of IO ()? That means “this is an action that
   performs I/O, and which returns nothing useful when it’s done.”
Main



  In Haskell, the entry point to an executable is named main. You
  are shocked by this, I am sure.

  main = do
    args <- getArgs
    mapM_ countWords args

  Instead of main being passed its command line arguments as in C,
  it uses the getArgs action to retrieve them.
What’s this mapM business?


  The map function can only apply pure functions, so it has an
  equivalent named mapM that maps an impure action over a list of
  arguments and returns the list of results.
  The mapM function has a cousin, mapM_, that throws away the
  result of each action it performs.
  In other words, this is one way to perform a loop over a list in
  Haskell.
  “mapM_ countWords args” means “apply countWords to every
  element of args in turn, and throw away each result.”
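
  The difference between the two cousins is easiest to see with an
  action that returns something. The sketch below uses a made-up
  action, announce, which is not part of the word count program.

```haskell
-- A hypothetical action: print a string, then return its length.
announce :: String -> IO Int
announce s = do
  putStrLn s
  return (length s)

demo :: IO [Int]
demo = do
  mapM_ announce ["a", "bc"]       -- runs both actions, discards the lengths
  mapM announce ["def", "ghij"]    -- runs both actions, collects the lengths
```

  Running demo prints all four strings and returns [3, 4], the lengths
  collected by mapM. The lengths computed under mapM_ are thrown away.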
Compare and contrast II, electric boogaloo


   These don’t look as similar as their predecessors:

   import sys

   def main():
       for name in sys.argv[1:]:
           count_words(name)


   main = do
       args <- getArgs
       mapM_ countWords args

   I wonder if we could change that.
Idiomatic word count in Python




   If we were writing “real” Python code, it would look more like this:

   import sys

   def main():
       for path in sys.argv[1:]:
           c = open(path).read()
           print(len(c.split()), path)
Meet forM



  In the Control.Monad module, there are two functions named
  forM and forM_. They are nothing more than mapM and mapM_
  with their arguments flipped.

  In other words, these are identical:

  mapM_ countWords args
  forM_ args countWords

  That seems a bit gratuitous. Why should we care?
Function application as an operator


   In our last lecture, we were introduced to function composition:
   f . g = \x -> f (g x)

   We can also write a function to apply a function:
   f $ x = f x

   This operator has a very low precedence, so we can use it to get
   rid of parentheses. Sometimes this makes code easier to read:
   putStrLn (show numWords ++ " " ++ path)
   putStrLn $ show numWords ++ " " ++ path
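
   Because $ binds more loosely than any other operator and associates
   to the right, it can also be chained. A small sketch with pure
   functions only (the names nested and dollars are invented for the
   example):

```haskell
-- Conventional nesting with parentheses.
nested :: Int
nested = length (words ("a b" ++ " c"))

-- The same expression: everything to the right of each $ is
-- evaluated first, so no parentheses are needed.
dollars :: Int
dollars = length $ words $ "a b" ++ " c"
```

   Both names evaluate to 3. Dropping the parentheses without adding $
   would be a type error, since length would be applied to the
   function words rather than to a list.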
Idiomatic word counting in Haskell


   See what’s different about this word counting?

   main = do
     args <- getArgs
     forM_ args $ \arg -> do
       content <- readFile arg
       let len = length (words content)
       putStrLn (show len ++ " " ++ arg)

   Doesn’t that use of forM_ look remarkably like a for loop in some
   other language? That’s because it is one.
The reason for the $




   Notice that the body of the forM_ loop is an anonymous function
   of one argument.
   We put the $ in there so that we wouldn’t have to either wrap the
   entire function body in parentheses, or split it out and give it a
   name.
The good




  Here’s our original code, using the $ operator:

     forM_ args $ \arg -> do
       content <- readFile arg
       let len = length (words content)
       putStrLn (show len ++ " " ++ arg)
The bad




  If we omit the $, we could use parentheses:

     forM_ args (\arg -> do
       content <- readFile arg
       let len = length (words content)
       putStrLn (show len ++ " " ++ arg))
And the ugly


   Or we could give our loop body a name:

      let body arg = do
            content <- readFile arg
            let len = length (words content)
            putStrLn (show len ++ " " ++ arg)
      forM_ args body

   Giving such a trivial single-use function a name seems gratuitous.
   Nevertheless, it should be clear that all three pieces of code are
   identical in their operation.
Trying it out

   Let’s assume we’ve saved our source file as WC.hs, and give it a try:

   $ ghc --make WC
   [1 of 1] Compiling Main ( WC.hs, WC.o )
   Linking WC ...

   $ du -h ascii.txt
   58M ascii.txt

   $ time ./WC ascii.txt
   9873630 ascii.txt

   real 0m8.043s
Comparison shopping



   How does the performance of our WC program compare with the
   system’s built-in wc command?

   $ export LANG=C
   $ time wc -w ascii.txt
   9873630 ascii.txt

   real 0m0.447s

   Ouch! The C version is almost 18 times faster.
A second try



   Does it help if we recompile with optimisation?

   $ ghc -fforce-recomp -O --make WC
   $ time ./WC ascii.txt
   9873630 ascii.txt

   real 0m7.696s

   So that made our code 5% faster. Ugh.
What’s going on here?



   Remember that in Haskell, a String is a list of characters. And a
   list is represented as a linked list.
   This means that every character gets its own list element, and list
   elements are not allocated contiguously. When each element is a
   large data structure, list overhead is negligible, but when each
   element is a single character, it’s a total killer.
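
   As a reminder of what “a string is a list” means in practice, here
   is a minimal sketch (the names asLiteral, asCells and cellCount are
   invented for the example):

```haskell
-- String is a type synonym for [Char], so a string literal is
-- shorthand for a chain of cons cells, one per character.
asLiteral :: String
asLiteral = "wc"

asCells :: [Char]
asCells = 'w' : 'c' : []

-- Ordinary list functions work on strings without any conversion.
cellCount :: Int
cellCount = length asLiteral
```

   The two values are equal, and cellCount is 2: one list element per
   character, which is exactly the overhead described above.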
   So what’s to be done?
   Enter the bytestring.
The original code




   main = do
     args <- getArgs
     forM_ args $ \arg -> do
       content <- readFile arg
       let len = length (words content)
       putStrLn (show len ++ " " ++ arg)
The bytestring code

   A bytestring is a contiguously-allocated array of bytes. Because
   there’s no pointer-chasing overhead, this should be faster.

   import System.Environment (getArgs)
   import Control.Monad (forM_)
   import qualified Data.ByteString.Char8 as B

   main = do
     args <- getArgs
     forM_ args $ \arg -> do
       content <- B.readFile arg
       let len = length (B.words content)
       putStrLn (show len ++ " " ++ arg)

   Notice the import qualified: this allows us to write B instead of
   Data.ByteString.Char8 wherever we want to use a name imported
   from that module.
So is it faster?

   How does this code perform?

   $ time ./WC ascii.txt
   9873630 ascii.txt

   real 0m8.043s

   $ time ./WC-BS ascii.txt
   9873630 ascii.txt

   real 0m1.434s


   Not bad! We’re 6x faster than the String code, and now just 3x
   slower than the C code.
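
   The ratios quoted so far come straight from the “real” times in the
   transcripts. A quick sketch of the arithmetic, with a made-up helper
   named speedup:

```haskell
-- How many times faster the second timing is than the first.
speedup :: Double -> Double -> Double
speedup slow fast = slow / fast

stringVsC, stringVsByteString, byteStringVsC :: Double
stringVsC          = speedup 8.043 0.447   -- String code vs. wc: just under 18
stringVsByteString = speedup 8.043 1.434   -- String vs. bytestring: about 5.6
byteStringVsC      = speedup 1.434 0.447   -- bytestring vs. wc: about 3.2
```

   So “6x” and “3x” are rounded figures, from roughly 5.6 and 3.2.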
Seriously? Bytes for text?




   There is, of course, a snag to using bytestrings: they’re strings of
   bytes, not characters.
   This is the 21st century, and everyone should be using Unicode
   now, right?
   Our answer to this problem in Haskell is to use a package named
   text, which provides the Data.Text module.
Unicode-aware word count


   import System.Environment (getArgs)
   import Control.Monad (forM_)
   import qualified Data.Text as T
   import Data.Text.Encoding (decodeUtf8)
   import qualified Data.ByteString.Char8 as B

   main = do
     args <- getArgs
     forM_ args $ \arg -> do
       bytes <- B.readFile arg
       let content = decodeUtf8 bytes
           len = length (T.words content)
       putStrLn (show len ++ " " ++ arg)
What happens here?




  Notice that we still use bytestrings to read the initial data in.
  Now, however, we use decodeUtf8 to turn the raw bytes from
  UTF-8 into the Unicode representation that Data.Text uses
  internally.
  We then use Data.Text’s words function to split the big string into
  a list of words.
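
  Why the decoding step matters can be seen by comparing byte and
  character counts for a little Greek text. This is a sketch that
  assumes the text and bytestring packages are available; the names
  greek, greekBytes, byteCount and charCount are invented for the
  example.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8, encodeUtf8)

-- Three Greek letters (alpha, beta, gamma), written as numeric escapes.
greek :: T.Text
greek = T.pack "\945\946\947"

-- The same letters as UTF-8 bytes, the form they would take in a file.
greekBytes :: B.ByteString
greekBytes = encodeUtf8 greek

byteCount, charCount :: Int
byteCount = B.length greekBytes               -- counts bytes
charCount = T.length (decodeUtf8 greekBytes)  -- counts characters
```

  Here byteCount is 6 and charCount is 3, because each of these
  letters takes two bytes in UTF-8. Note that decodeUtf8 raises an
  exception if its input is not valid UTF-8.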
Comparing Unicode performance
   For comparison, let’s first try a Unicode-aware word count in C, on
   a file containing 112.6 million characters of UTF-8-encoded Greek:

   $ du -h greek.txt
   196M greek.txt

   $ export LANG=en_US.UTF-8
   $ time wc -w greek.txt
   16917959 greek.txt

   real 0m8.306s

   $ time ./WC-T greek.txt
   16917959 greek.txt

   real 0m7.350s
What did we just see?




   Wow! Our tiny Haskell program is actually 13% faster than the
   system’s wc command!
   This suggests that if we choose the right representation, we can
   write real-world code that is both brief and highly efficient.
   This ought to be immensely cheering.

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 

Real World Haskell: Lecture 7

  • 1. Real World Haskell: Lecture 7 Bryan O’Sullivan 2009-12-09
  • 2. Getting things done It’s great to dwell so much on purity, but we’d like to maybe use Haskell for practical programming some time. This leaves us concerned with talking to the outside world.
  • 3. Word count
      import System.Environment (getArgs)
      import Control.Monad (forM_)

      countWords path = do
        content <- readFile path
        let numWords = length (words content)
        putStrLn (show numWords ++ " " ++ path)

      main = do
        args <- getArgs
        mapM_ countWords args
  • 4. New notation! There was a lot to digest there. Let’s run through it all, from top to bottom.
      import System.Environment (getArgs)
    “Import only the thing named getArgs from System.Environment.” Without an explicit (comma separated) list of names to import, everything that a module exports is imported into this one.
  • 5. The do block Notice that this function’s body starts with the keyword do:
      countWords path = do
        ...
    That keyword introduces a series of actions. Each action is somewhat similar to a statement in C or Python.
  • 6. Executing an action and using its result The first line of our function’s body:
      countWords path = do
        content <- readFile path
    This performs the action “readFile path”, and assigns the result to the name “content”. The special notation “<-” makes it clear that we are executing an action, i.e. not applying a pure function.
  • 7. Applying a pure function We can use the let keyword inside a do block, and it applies a pure function, but the code that follows does not need to start with an in keyword.
      let numWords = length (words content)
      putStrLn (show numWords ++ " " ++ path)
    With both let and <-, the result is immutable as usual, and stays in scope until the end of the do block.
  • 8. Executing an action This line executes an action, and ignores its return value:
      putStrLn (show numWords ++ " " ++ path)
  • 9. Compare and contrast Wonder how different imperative programming in Haskell is from other languages?
      def count_words(path):
          content = open(path).read()
          num_words = len(content.split())
          print repr(num_words) + " " + path

      countWords path = do
        content <- readFile path
        let numWords = length (words content)
        putStrLn (show numWords ++ " " ++ path)
  • 10. A few handy rules When you want to introduce a new name inside a do block:
      Use name <- action to perform an action and keep its result.
      Use let name = expression to evaluate a pure expression, and omit the in.
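The two rules can be seen side by side in a minimal sketch. The names shout and describe and the file rules-demo.txt are made up for illustration and do not appear in the lecture.

```haskell
import Data.Char (toUpper)

-- Hypothetical pure helper: no I/O involved, so callers bind it with let.
shout :: String -> String
shout = map toUpper

-- Hypothetical action that uses both rules from the slide.
describe :: FilePath -> IO String
describe path = do
  content <- readFile path                     -- rule 1: perform an action, keep its result
  let firstLine = takeWhile (/= '\n') content  -- rule 2: pure expression, no "in"
  return (shout firstLine)

main :: IO ()
main = do
  -- Assumption: we may create a scratch file in the current directory.
  writeFile "rules-demo.txt" "hello world\nsecond line\n"
  loud <- describe "rules-demo.txt"
  putStrLn loud
```

Writing let content = readFile path instead would bind the action itself rather than the file's contents, which is why the two forms are not interchangeable.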
  • 11. More adventures with ghci If we load our source file into ghci, we get an interesting type signature:
      *Main> :type countWords
      countWords :: FilePath -> IO ()
    See the result type of IO ()? That means “this is an action that performs I/O, and which returns nothing useful when it’s done.”
  • 12. Main In Haskell, the entry point to an executable is named main. You are shocked by this, I am sure.
      main = do
        args <- getArgs
        mapM_ countWords args
    Instead of main being passed its command line arguments as in C, it uses the getArgs action to retrieve them.
  • 13. What’s this mapM business? The map function can only call pure functions, so it has an equivalent named mapM that maps an impure action over a list of arguments and returns the list of results. The mapM function has a cousin, mapM_, that throws away the result of each action it performs. In other words, this is one way to perform a loop over a list in Haskell. “mapM_ countWords args” means “apply countWords to every element of args in turn, and throw away each result.”
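As a small sketch of the difference between the two, the hypothetical action report below (not part of the lecture) prints its argument and returns its square, so we can see which results are kept.

```haskell
-- Hypothetical action: announce a number, then return its square.
report :: Int -> IO Int
report n = do
  putStrLn ("visiting " ++ show n)
  return (n * n)

main :: IO ()
main = do
  squares <- mapM report [1, 2, 3]  -- runs each action and collects the results
  print squares
  mapM_ report [4, 5]               -- runs each action and discards the results
```

Both calls perform the actions in list order; only the first gives us a list back.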
  • 14. Compare and contrast II, electric boogaloo These don’t look as similar as their predecessors:
      def main():
          for name in sys.argv[1:]:
              count_words(name)

      main = do
        args <- getArgs
        mapM_ countWords args
    I wonder if we could change that.
  • 15. Idiomatic word count in Python If we were writing “real” Python code, it would look more like this:
      def main():
          for path in sys.argv[1:]:
              c = open(path).read()
              print len(c.split()), path
  • 16. Meet forM In the Control.Monad module, there are two functions named forM and forM_. They are nothing more than mapM and mapM_ with their arguments flipped. In other words, these are identical:
      mapM_ countWords args
      forM_ args countWords
    That seems a bit gratuitous. Why should we care?
  • 17. Function application as an operator In our last lecture, we were introduced to function composition:
      f . g = \x -> f (g x)
    We can also write a function to apply a function:
      f $ x = f x
    This operator has a very low precedence, so we can use it to get rid of parentheses. Sometimes this makes code easier to read:
      putStrLn (show numWords ++ " " ++ path)
      putStrLn $ show numWords ++ " " ++ path
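To see both operators at work, here is a minimal sketch. The helpers wordCount and report, and the sample strings, are illustrative and not taken from the lecture.

```haskell
-- Hypothetical helper built with composition;
-- equivalent to \s -> length (words s).
wordCount :: String -> Int
wordCount = length . words

-- Hypothetical helper that formats one line of output.
report :: FilePath -> String -> String
report path content = show (wordCount content) ++ " " ++ path

main :: IO ()
main = do
  -- The same call, first with parentheses and then with ($).
  putStrLn (report "demo.txt" "to be or not to be")
  putStrLn $ report "demo.txt" "to be or not to be"
```

Because ($) binds more loosely than ordinary function application and (++), everything to its right is evaluated first and then passed to putStrLn, exactly as the parentheses would do.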
  • 18. Idiomatic word counting in Haskell See what’s different about this word counting?
      main = do
        args <- getArgs
        forM_ args $ \arg -> do
          content <- readFile arg
          let len = length (words content)
          putStrLn (show len ++ " " ++ arg)
    Doesn’t that use of forM_ look remarkably like a for loop in some other language? That’s because it is one.
  • 19. The reason for the $ Notice that the body of the forM_ loop is an anonymous function of one argument. We put the $ in there so that we wouldn’t have to either wrap the entire function body in parentheses, or split it out and give it a name.
  • 20. The good Here’s our original code, using the $ operator:
      forM_ args $ \arg -> do
        content <- readFile arg
        let len = length (words content)
        putStrLn (show len ++ " " ++ arg)
  • 21. The bad If we omit the $, we could use parentheses:
      forM_ args (\arg -> do
        content <- readFile arg
        let len = length (words content)
        putStrLn (show len ++ " " ++ arg))
  • 22. And the ugly Or we could give our loop body a name:
      let body arg = do
            content <- readFile arg
            let len = length (words content)
            putStrLn (show len ++ " " ++ arg)
      forM_ args body
    Giving such a trivial single-use function a name seems gratuitous. Nevertheless, it should be clear that all three pieces of code are identical in their operation.
  • 23. Trying it out Let’s assume we’ve saved our source file as WC.hs, and give it a try:
      $ ghc --make WC
      [1 of 1] Compiling Main ( WC.hs, WC.o )
      Linking WC ...
      $ du -h ascii.txt
      58M ascii.txt
      $ time ./WC ascii.txt
      9873630 ascii.txt
      real 0m8.043s
  • 24. Comparison shopping How does the performance of our WC program compare with the system’s built-in wc command?
      $ export LANG=C
      $ time wc -w ascii.txt
      9873630 ascii.txt
      real 0m0.447s
    Ouch! The C version is almost 18 times faster.
  • 25. A second try Does it help if we recompile with optimisation?
      $ ghc -fforce-recomp -O --make WC
      $ time ./WC ascii.txt
      9873630 ascii.txt
      real 0m7.696s
    So that made our code less than 5% faster (7.696s versus 8.043s). Ugh.
  • 26. What’s going on here? Remember that in Haskell, a string is a list of characters. And a list is represented as a linked list. This means that every character gets its own list element, and list elements are not allocated contiguously. When each element is itself a large data structure, list overhead is negligible, but for characters, it’s a total killer. So what’s to be done? Enter the bytestring.
  • 27. The original code
      main = do
        args <- getArgs
        forM_ args $ \arg -> do
          content <- readFile arg
          let len = length (words content)
          putStrLn (show len ++ " " ++ arg)
  • 28. The bytestring code A bytestring is a contiguously-allocated array of bytes. Because there’s no pointer-chasing overhead, this should be faster.
      import qualified Data.ByteString.Char8 as B

      main = do
        args <- getArgs
        forM_ args $ \arg -> do
          content <- B.readFile arg
          let len = length (B.words content)
          putStrLn (show len ++ " " ++ arg)
    Notice the import qualified: this allows us to write B instead of Data.ByteString.Char8 wherever we want to use a name imported from that module.
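As a quick sanity check that the two representations agree on what a word is, the sketch below counts the words of the same sample text both ways. The function names countString and countBytes and the sample text are illustrative only.

```haskell
import qualified Data.ByteString.Char8 as B

-- Word count over a String, i.e. a linked list of characters.
countString :: String -> Int
countString = length . words

-- Word count over a ByteString, i.e. a contiguous array of bytes.
countBytes :: B.ByteString -> Int
countBytes = length . B.words

main :: IO ()
main = do
  let sample = "the quick  brown fox\njumps over\n"
  print (countString sample)
  print (countBytes (B.pack sample))
```

For ASCII input like this, both functions should report the same count; only the cost of getting there differs.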
  • 29. So is it faster? How does this code perform?
      $ time ./WC ascii.txt
      9873630 ascii.txt
      real 0m8.043s
      $ time ./WC-BS ascii.txt
      9873630 ascii.txt
      real 0m1.434s
    Not bad! We’re about 5.6x faster than the String code, and now only about 3x slower than the C code.
  • 30. Seriously? Bytes for text? There is, of course, a snag to using bytestrings: they’re strings of bytes, not characters. This is the 21st century, and everyone should be using Unicode now, right? Our answer to this problem in Haskell is to use the text package, which provides the Data.Text module.
  • 31. Unicode-aware word count
      import qualified Data.Text as T
      import Data.Text.Encoding (decodeUtf8)
      import qualified Data.ByteString.Char8 as B

      main = do
        args <- getArgs
        forM_ args $ \arg -> do
          bytes <- B.readFile arg
          let content = decodeUtf8 bytes
              len = length (T.words content)
          putStrLn (show len ++ " " ++ arg)
  • 32. What happens here? Notice that we still use bytestrings to read the initial data in. Now, however, we use decodeUtf8 to turn the raw bytes from UTF-8 into the Unicode representation that Data.Text uses internally. We then use Data.Text’s words function to split the big string into a list of words.
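The gap between bytes and characters is easy to observe directly. This sketch assumes the text and bytestring packages are installed; the helper wordCountUtf8 and the sample Greek words are made up for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import Data.Text.Encoding (decodeUtf8, encodeUtf8)

-- Hypothetical helper: decode UTF-8 bytes, then count words with Data.Text.
wordCountUtf8 :: BS.ByteString -> Int
wordCountUtf8 = length . T.words . decodeUtf8

main :: IO ()
main = do
  -- Sample text chosen for illustration: three unaccented Greek words.
  let sample = T.pack "αλφα βητα γαμμα"
      bytes  = encodeUtf8 sample
  print (BS.length bytes)                -- length in bytes
  print (T.length (decodeUtf8 bytes))    -- length in characters
  print (wordCountUtf8 bytes)            -- number of words
```

Each Greek letter here occupies two bytes in UTF-8, so the byte length is larger than the character length. Note that decodeUtf8 throws an exception on input that is not valid UTF-8, so real code reading arbitrary files may want to handle that case.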
  • 33. Comparing Unicode performance For comparison, let’s first try a Unicode-aware word count in C, on a file containing 112.6 million characters of UTF-8-encoded Greek:
      $ du -h greek.txt
      196M greek.txt
      $ export LANG=en_US.UTF-8
      $ time wc -w greek.txt
      16917959 greek.txt
      real 0m8.306s
      $ time ./WC-T greek.txt
      16917959 greek.txt
      real 0m7.350s
  • 34. What did we just see? Wow! Our tiny Haskell program is actually 13% faster than the system’s wc command! This suggests that if we choose the right representation, we can write real-world code that is both brief and highly efficient. This ought to be immensely cheering.