Real World Haskell

  Bryan O’Sullivan
 bos@serpentine.com


    2008-09-27
Welcome!




  A few things to expect about this tutorial:
      The pace will be rapid
      Stop me and ask questions—early and often
      I assume no prior Haskell exposure
A little bit about Haskell



   Haskell is a multi-paradigm language.
   It chooses some unusual, but principled, defaults:
       Pure functions
       Non-strict evaluation
       Immutable data
       Static, strong typing
   Why default to these behaviours?
       We want our code to be safe, modular, and tractable.
Pure functions



   Definition
   The result of a pure function depends only on its visible inputs:
       Given identical inputs, it always computes the same result.
       It has no other observable effects.
   What are some consequences of this?
       Modularity leads to simplified reasoning about behaviour.
       Straightforward testing: no need for elaborate frameworks.
Immutable data



   Definition
   Data is immutable (or purely functional) if it is never modified
   after construction.
   To “modify” a value, we create a new value.
   Both new and old versions can coexist afterwards, so we get
   persistent, versioned data for free.
       Modification is often easier than with mutable data.
       In multithreaded code, we do away with much elaborate
       locking.
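
   A minimal sketch of "new and old versions coexist", using a plain
   list of pairs. The names Inventory, setStock, v1, and v2 are
   hypothetical and do not come from the tutorial.

```haskell
-- Hypothetical example: an association list used as a tiny inventory.
type Inventory = [(String, Int)]

-- "Modify" an entry by building a new list; the argument is left untouched.
setStock :: String -> Int -> Inventory -> Inventory
setStock item n inv = (item, n) : filter ((/= item) . fst) inv

v1 :: Inventory
v1 = [("apples", 3), ("pears", 5)]

-- v2 is a new version; v1 is still available, unchanged.
v2 :: Inventory
v2 = setStock "apples" 10 v1
```

   After this, lookup "apples" v1 still finds the old count, while
   lookup "apples" v2 finds the new one.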
Static, strong typing




   Definition
   A program is statically typed if we know the type of every
   expression before the program is run.

   Definition
   Code is strongly typed if the absence of certain classes of error can
   be proven statically.
Safety, modularity, and tractability


   Safety:
        As few nasty surprises at runtime as possible.
        Static typing and eased testing give us confidence.
   Modularity:
        We can build big pieces of code from smaller components.
        No need to focus on the details of the smaller parts.
   Tractability:
        All of this fits in our brain comfortably...
        ...leaving plenty of room for the application we care about.
GHC, the Glorious Glasgow Haskell Compiler




   Have you got GHC yet?
       Download installer for Windows, OS X, or Linux here:
       http://www.haskell.org/ghc/download_ghc_683.html
What’s special about GHC?




      Mature, portable, optimising compiler
      Great tools:
          interactive shell and debugger
          time and space profilers
          code coverage analyser
      BSD-licensed, hence suitable for OSS and commercial use
Counting lines


   The classic Unix wc command counts the lines in some files:

   $ time wc -l *.fasta
      9975 1000-Rn_EST.fasta
     14032 chr18.fasta
     14005 chr19.fasta
     13980 chr20.fasta
     42017 chr_all.fasta
     94009 total

   real 0m0.017s
Breaking the problem down




   Subproblems to consider:
       Get our command line arguments
       Read a file
       Split it into lines
       Count the lines
   Let’s work through these in reverse order.
Type signatures



   Definition
   A type signature describes the type of a Haskell expression:

   e :: Double

       We read :: as “left has the type right”.
       So “e has the type Double”.
   Here’s the accompanying definition:
   e = 2.7182818
Type signatures are optional




   In Haskell, most type signatures are optional.
       The compiler can automatically infer types based on our
       usage.
   Why write type signatures at all, then?
       Mostly as useful documentation to ourselves.
GHC’s interactive interpreter


   GHC includes an interactive expression evaluator, ghci.
   Run it from a terminal window or command prompt:

   $ ghci
   GHCi, version 6.8.3: http://www.haskell.org/ghc/
   :? for help
   Loading package base ... linking ... done.
   Prelude>

   The Prelude> text is ghci’s prompt.
   Type :? at the prompt to get (terse) help.
Basic interaction


   Let’s enter some expressions:

   Prelude> 2 + 2
   4
   Prelude> True && False
   False

   We can find out about types:

   Prelude> :type True
   True :: Bool
Writing a list

   Here’s an empty list:

   Prelude> []
   []

   What do we need to create a longer list?
       A value
       An existing list
       Some glue—the : operator

   Prelude> 1:[]
   [1]
   Prelude> 1:2:[]
   [1,2]
Syntactic sugar for lists




   What’s the difference between these?
        1:2:[]
        [1,2]
   Nothing—the latter is purely a notational convenience.
Characters and strings


   One character:

   Prelude> :type 'a'
   'a' :: Char

   A string is a list of characters:

   Prelude> 'a' : 'b' : []
   "ab"

   Notation:
        Single quotes for one Char
        Double quotes for a string (written [Char])
Function application


   We apply a function to its arguments by juxtaposition:

   Prelude> length [2,4,6]
   3
   Prelude> take 2 [3,6,9,12]
   [3,6]

   Why refer to this as application, instead of the more familiar
   calling?
       Haskell is a non-strict language
       The result may not be computed immediately
Lists are inductive




   Haskell lists are defined inductively.
   A list can be one of two things:
        An empty list
        A value in front of an existing list
   We call our friends [] and : value constructors:
        They construct values that have the type “list of something.”
Counting lines




   Haskell programmers love abstraction.
       We won’t worry about counting lines.
       Instead, we’ll count the elements in any kind of list.
The type signature of a function



   How do we describe a function that computes the length of a list?
   len :: [a] -> Integer

       The -> notation denotes a function.
       The function accepts an [a], and returns an Integer.
   What’s an [a]?
       A list, whose elements must all be of some type a.
Counting by induction: the base case




   An empty list has the length zero.
   len [] = 0
   This is our first example of pattern matching.
       Our function accepts one argument.
       If the argument is an empty list, we return zero.
   We call this the base case.
Counting by induction: the inductive case



   Let’s see if a list value was created using the : constructor.
   len (x:xs) = 1 + len xs
   If the pattern match succeeds:
        The name x is bound to the head of the list.
        The name xs is bound to the tail of the list.
        The body of the definition is used as the result.
The complete function




   Save this in a file named Length.hs:
   len :: [a] -> Integer
   len []     = 0
   len (x:xs) = 1 + len xs
Load the file into ghci
   In the same directory, run ghci:

   Prelude> :load Length
   [1 of 1] Compiling Main                  ( Length.hs, interprete
   Ok, modules loaded: Main.
   *Main>

   The ghci prompt changes when we load files.
   Let’s try out our function:

   *Main> len []
   0
   *Main> len (1:[])
   1
   *Main> len [4,5,6]
   3
Generating a list from a list



   How might we double every other element of a list?
   double (a:b:cs) = a : b * 2 : double cs
   double cs       = cs
   Save this in a file named Double.hs.
   Load the file into ghci.
   Try the following expressions:
        [1..10]
       double [1..10]
Your turn: axpy




      The classic Linpack function axpy computes a × x_i + y_i for a
      scalar a and each index i of two vectors x and y.
      Define it over two lists of numbers in Haskell.
      How do we handle lists of different lengths?
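
   One possible answer, as a sketch rather than the tutorial's own
   solution. It uses only pattern matching as introduced so far, plus a
   Num a constraint (an assumption here; typeclasses come up later). It
   handles lists of different lengths by stopping at the shorter one.

```haskell
-- axpy over two lists: a * x + y for each pair of elements.
axpy :: Num a => a -> [a] -> [a] -> [a]
axpy a (x:xs) (y:ys) = a * x + y : axpy a xs ys
-- Either list has run out: stop, dropping any leftover elements.
axpy _ _      _      = []
```

   Truncating is a design choice; other reasonable answers report an
   error or pad the shorter list.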
Splitting text on line boundaries


   Haskell provides a large library of built-in functions, the Prelude.
   Here’s the Prelude’s function for splitting text by lines:
   lines :: String -> [String]
   The type String is a synonym for [Char].
   A ghci experiment:

   *Main> lines "foo\nbar\n"
   ["foo","bar"]
   *Main> len (lines "foo\nbar\n")
   2
Reading a file


   To read a file, we use the Prelude’s readFile function:

   *Main> :type readFile
   readFile :: FilePath -> IO String

   What’s this signature mean?
       The FilePath type is just a synonym for String.
       The type IO String means here be dragons!
       A signature that ends in IO something can have externally
       visible side effects.
       Here, the side effect is “read the contents of a file”.
Side effects



   That innocuous IO in the type is a big deal.
       We can tell by its type signature whether a value might have
       externally visible effects.
       If a type does not include IO, it cannot:
            Read files
            Make network connections
            Launch torpedoes
   The ideal is for most code to not have an IO type.
Counting lines in a file


   If we invoke code that has side effects, our code must by
   implication have side effects too.
   countLines :: FilePath -> IO Integer
   countLines path = do
       contents <- readFile path
       return (len (lines contents))
   We had to add IO to our type here because we use readFile,
   which has side effects.
        Add this code to Length.hs.
A few explanations




      The <- notation means “perform the action on the right,
      and assign the result to the name on the left.”
      name <- action

      The return function takes a pure value, and (here) adds IO to
      its type.
Command line arguments


  We use getArgs to obtain command line arguments.
   import System.Environment (getArgs)
   main = do
     args <- getArgs
     putStrLn ("hello, args are " ++ show args)
  What’s new here?
       The import directive imports the name getArgs from the
       System.Environment module.
       The ++ operator concatenates two lists.
Pattern matching in an expression



   We use case to pattern match inside an expression.
   -- Does list contain two or more elements?
   atLeastTwo myList =
       case myList of
          (a:b:cs) -> True
          _        -> False
   The expression between case and of is matched in turn against
   each pattern, until one matches.
Irrefutable and wild card patterns



       A pattern usually matches against a value’s constructors.
       In other words, it inspects the structure of the value.
       A simple pattern, e.g. a plain name like a, contains no
       constructors.
       It thus matches any value.

   Definition
   A pattern that always matches any value is called irrefutable.
   The special wild card pattern _ is irrefutable, but does not bind a
   value to a name.
Tuples



         A tuple is a fixed-size collection of values.
         Items in a tuple can have different types.
         Example: (True,"foo")
         This has the type (Bool,String)
   Contrast tuples with lists, to see why we’d want both:
         A list is a variable-sized collection of values.
         Each value in a list must have the same type.
         Example: [True, False]
The zip function




   What does the zip function do? Adventures in function discovery,
   courtesy of ghci:
       Start by inspecting its type, using :type.
       Try it with one set of inputs.
       Then try with another.
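
   For comparison after experimenting, here is a sketch of how zip
   could be written by hand. The name zip' and the sample value pairs
   are hypothetical; the Prelude's own definition may differ in detail.

```haskell
-- A hand-written equivalent of zip: pair up elements position by position.
zip' :: [a] -> [b] -> [(a, b)]
zip' (x:xs) (y:ys) = (x, y) : zip' xs ys
-- Either list is exhausted: stop.
zip' _      _      = []

-- The Prelude's zip stops at the end of the shorter list.
pairs :: [(Int, Char)]
pairs = zip [1, 2, 3] "ab"
```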
Making our program runnable


   Add the following code to Length.hs:
   main = do
     -- Exercise: get the command line arguments

     lengths <- mapM countLines args
     mapM_ printLength (zip args lengths)
     case args of
         (_:_:_) -> printLength ("total", sum lengths)
         _       -> return ()
   Don’t forget to add an import directive at the beginning!
The mapM function




     This function applies an action to a list of arguments in turn,
     and returns the list of results.
     The mapM_ function is similar, but returns the value (), aka
     unit (“nothing”).
     The mapM_ function is useful for the effects it causes, e.g.
     printing every element of a list.
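
   A small sketch of the difference. The action measure and the two
   wrappers are hypothetical names made up for illustration.

```haskell
-- An action with an effect (printing) and a result (the length).
measure :: String -> IO Int
measure s = do
  putStrLn s
  return (length s)

-- mapM runs the action on each element and collects the results.
keepResults :: IO [Int]
keepResults = mapM measure ["foo", "quux"]

-- mapM_ runs the same actions but throws the results away.
dropResults :: IO ()
dropResults = mapM_ measure ["foo", "quux"]
```

   Both versions print the two strings; only keepResults hands back the
   list of lengths.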
Write your own printLength function




   Hint: we’ve seen a similar example already, with our getArgs
   example.
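
   One possible shape for the answer, as a sketch only. The helper
   formatLength is a hypothetical name, and this version does not try
   to align the columns.

```haskell
-- Build the output line for one (file name, line count) pair.
formatLength :: (FilePath, Integer) -> String
formatLength (path, n) = path ++ " " ++ show n

-- Print that line; the tuple pattern matches what zip produces.
printLength :: (FilePath, Integer) -> IO ()
printLength pair = putStrLn (formatLength pair)
```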
Compiling your program



   It’s easy to compile a program with GHC:

   $ ghc --make Length

   What does the compiler do?
       Looks for a source file named Length.hs.
       Compiles it to native code.
       Generates an executable named Length.
Running our program

   Here’s an example from my laptop:

   $ time ./Length *.fasta
   1000-Rn_EST.fasta 9975
   chr18.fasta       14032
   chr19.fasta       14005
   chr20.fasta       13980
   chr_all.fasta     42017
   total             94009

   real 0m1.533s

   Oh, no! Look at that performance!
       90 times slower than wc
Faster file processing




       Lists are wonderful to work with
       But they exact a huge performance toll
   The current best-of-breed alternative for file data:
       ByteString
What is a ByteString?




   They come in two flavours:
       Strict: a single packed array of bytes
       Lazy: a list of 64KB strict chunks
   Each flavour provides a list-like API.
Retooling our word count program



   All we do is add an import and change one function:
   import qualified Data.ByteString.Lazy.Char8 as B

   countLines path = do
       contents <- B.readFile path
       return (length (B.lines contents))
   The “B.” prefixes make us pick up the readFile and lines
   functions from the bytestring package.
What happens to performance?




       Haskell lists: 1.533 seconds
       Lazy ByteString: 0.022 seconds
       wc command: 0.015 seconds
   Given the tiny data set size, C and Haskell are in a dead heat.
When to use ByteStrings?



       Any time you deal with binary data
       For text, only if you’re sure it’s 8-bit clean
   For i18n needs, fast packed Unicode is under development.
   Great open source libraries that use ByteStrings:
       binary—parsing/generation of binary data
       zlib and bzlib—support for popular
       compression/decompression formats
       attoparsec—parse text-based files and network protocols
Part 2
A little bit about JSON


   A popular interchange format for structured data: simpler than
   XML, and widely supported.
   Basic types:
       Number
       String
       Boolean
       Null
   Derived types:
       Object: unordered name/value map
       Array: ordered collection of values
JSON at work: Twitter’s search API



   From http://search.twitter.com/search.json?q=haskell:

   {"text": "Why Haskell? Easiest way to be productive",
    "to_user_id": null,
    "from_user": "galoisinc",
    "id": 936114469,
    "from_user_id": 1633746,
    "iso_language_code": "en",
    "created_at":"Fri, 26 Sep 2008 19:15:35 +0000"}
JSON in Haskell




   data JSValue
       = JSNull
       | JSBool     !Bool
       | JSRational !Rational
       | JSString   JSString
       | JSArray    [JSValue]
       | JSObject   (JSObject JSValue)
What is a JSString?


   We hide the underlying use of a String:
   newtype JSString = JSONString { fromJSString :: S

   toJSString :: String -> JSString
   toJSString = JSONString
   We do the same with JSON objects:
   newtype JSObject a = JSONObject { fromJSObject :: [(

   toJSObject :: [(String, a)] -> JSObject a
   toJSObject = JSONObject
JSON conversion



  In Haskell, we capture type-dependent patterns using typeclasses:
       The class of types whose values can be converted to and from
       JSON

   data Result a = Ok a | Error String

   class JSON a where
       readJSON :: JSValue -> Result a
       showJSON :: a -> JSValue
Why JSString, JSObject, and JSArray?



   Haskell typeclasses give us an open world:
       We can declare a type to be an instance of a class at any time
       In fact, we cannot declare the number of instances to be fixed
   If we left the String type “naked”, what could happen?
       Someone might declare Char to be an instance of JSON
       What if someone declared a JSON a => JSON [a] instance?
   This is the overlapping instances problem.
Relaxing the overlapping instances restriction




   By default, GHC is conservative:
        It rejects overlapping instances outright
   We can get it to loosen up a bit via a pragma:
   {-# LANGUAGE OverlappingInstances #-}
   If it finds one most specific instance, it will use it, otherwise bail as
   before.
Bool as JSON


  Here’s a simple way to declare the Bool type as an instance of the
  JSON class:
   instance JSON Bool where
       showJSON = JSBool

       readJSON (JSBool b) = Ok b
       readJSON _          = Error "Bool parse failed"
  This has a design problem:
      We’ve plumbed our Result type straight in
      If we want to change its implementation, it will be painful
Hiding the plumbing



   A simple (but good enough!) approach to abstraction:
   success :: a -> Result a
   success k = Ok k

   failure :: String -> Result a
   failure errMsg = Error errMsg
   Functions like these are sometimes called “smart constructors”.
Does this affect our code much?



   We simply replace the explicit constructors with the functions we
   just defined:
   instance JSON Bool where
       showJSON = JSBool

       readJSON (JSBool b) = success b
       readJSON _          = failure "Bool parse failed"
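
   To try the pieces together in ghci, here is a self-contained sketch.
   JSValue is cut down to two constructors (an assumption, to keep it
   short), the deriving clauses are added only so results can be
   compared and shown, and roundTrip is a hypothetical name.

```haskell
-- Simplified stand-in for the tutorial's JSValue.
data JSValue = JSNull | JSBool !Bool
  deriving (Eq, Show)

data Result a = Ok a | Error String
  deriving (Eq, Show)

-- The functions that hide the Result constructors.
success :: a -> Result a
success k = Ok k

failure :: String -> Result a
failure errMsg = Error errMsg

class JSON a where
  readJSON :: JSValue -> Result a
  showJSON :: a -> JSValue

instance JSON Bool where
  showJSON = JSBool
  readJSON (JSBool b) = success b
  readJSON _          = failure "Bool parse failed"

-- Convert a Bool to JSON and back again.
roundTrip :: Result Bool
roundTrip = readJSON (showJSON True)
```

   Reading a non-boolean value, e.g. readJSON JSNull :: Result Bool,
   takes the failure branch instead.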
JSON input and output




   We can now convert between normal Haskell values and our JSON
   representation. But...
       ...we still need to be able to transmit this stuff over the wire.
   Which is more fun to mull over? Parsing!
A functional view of parsing




   Here’s a super-simple perspective:
       Take a piece of data (usually a sequence)
       Try to apply an interpretation to it
   How might we represent this?
A basic type signature for parsing

   Take two type variables, i.e. placeholders for types that we’ll
   substitute later:
        s—the state (data) we want to parse
       a—the type of its interpretation
   We get this generic type signature:
   s -> a
   Let’s make the task more concrete:
       Parse a String as an Int

   String -> Int
   What’s missing?
Parsing as state transformation



   After we’ve parsed one Int, we might have more data in our
   String that we want to parse.
   How to represent this? Return the transformed state and the result
   in a tuple.
   s -> (a, s)
   We accept an input state of type s, and return a transformed
   state, also of type s.
Parsing is composable



   Let’s give integer parsing a name:
   parseDigit :: String -> (Int, String)
   How might we want to parse two digits?
   parseTwoDigits :: String -> ((Int, Int), String)
   parseTwoDigits s =
       let (i, t) = parseDigit s
           (j, u) = parseDigit t
       in ((i, j), u)
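
   The tutorial gives only the type of parseDigit. To run
   parseTwoDigits, one hypothetical definition is sketched below. It
   uses a guard and Data.Char, neither of which has been introduced,
   and it simply calls error when there is no digit to parse.

```haskell
import Data.Char (digitToInt, isDigit)

-- Hypothetical definition: consume one leading digit, return the rest.
parseDigit :: String -> (Int, String)
parseDigit (c:cs) | isDigit c = (digitToInt c, cs)
parseDigit _ = error "parseDigit: no digit"

-- Thread the leftover input from the first parse into the second.
parseTwoDigits :: String -> ((Int, Int), String)
parseTwoDigits s =
  let (i, t) = parseDigit s
      (j, u) = parseDigit t
  in ((i, j), u)
```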
Chaining parses more tidily

   It’s not good to represent the guts of our state explicitly using
   pairs:
       Tying ourselves to an implementation eliminates wiggle room.
   Here’s an alternative approach.
   newtype State s a = State {
         runState :: s -> (a, s)
       }

       A newtype declaration hides our implementation. It has no
       runtime cost.
       The runState function is a deconstructor: it exposes the
       underlying value.
Chaining parses



   Given a function that produces a result and a new state, we can
   “chain up” another function that accepts its result.
   chainStates :: State s a -> (a -> State s b) -> State s b
   chainStates m k = State chainFunc
      where chainFunc s =
                let (a, t) = runState m s
                in           runState (k a) t
   Notice that the result type is compatible with the input:
       We can chain uses of chainStates!
Injecting a pure value




   We’ll often want to leave the current state untouched, but inject a
   normal value that we can use when chaining.
   pureState :: a -> State s a
   pureState a = State $ \s -> (a, s)
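
   A sketch that puts State, chainStates, and pureState together. The
   parsers digit and twoDigits are hypothetical names, and digit calls
   error on input that does not start with a digit.

```haskell
import Data.Char (digitToInt, isDigit)

newtype State s a = State { runState :: s -> (a, s) }

-- Run m, feed its result to k, and run what k returns on the new state.
chainStates :: State s a -> (a -> State s b) -> State s b
chainStates m k = State chainFunc
  where chainFunc s =
          let (a, t) = runState m s
          in runState (k a) t

-- Leave the state alone and produce the given value.
pureState :: a -> State s a
pureState a = State $ \s -> (a, s)

-- Hypothetical parser: consume one digit from the front of the string.
digit :: State String Int
digit = State step
  where step (c:cs) | isDigit c = (digitToInt c, cs)
        step _ = error "digit: no digit"

-- Two digits in a row, with no explicit state passing.
twoDigits :: State String (Int, Int)
twoDigits =
  digit `chainStates` \i ->
  digit `chainStates` \j ->
  pureState (i, j)
```

   Applying runState twoDigits to a string gives the pair of digits and
   the unconsumed input.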
What about computations that might fail?




   Try these in ghci:

   Prelude> head [1,2,3]
   1
   Prelude> head []

   What gets printed in the second case?
One approach to potential failure



   The Prelude defines this handy standard type:
   data Maybe a = Just a
                | Nothing
   We can use it as follows:
   safeHead (x:_) = Just x
   safeHead []    = Nothing
   Save this in a source file, load it into ghci, and try it out.
Some familiar operations


   We can chain Maybe values:
   chainMaybes :: Maybe a -> (a -> Maybe b)
               -> Maybe b
   chainMaybes Nothing  k = Nothing
   chainMaybes (Just x) k = k x
   This gives us short circuiting if any computation in a chain fails:
       Maybe is the Ur-exception.
   We can also inject a pure value into a Maybe-typed computation:
   pureMaybe :: a -> Maybe a
   pureMaybe x = Just x
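
   A sketch of the short circuiting in action, reusing safeHead. The
   function firstOfFirst is a hypothetical name for illustration.

```haskell
safeHead :: [a] -> Maybe a
safeHead (x:_) = Just x
safeHead []    = Nothing

chainMaybes :: Maybe a -> (a -> Maybe b) -> Maybe b
chainMaybes Nothing  _ = Nothing
chainMaybes (Just x) k = k x

-- First element of the first inner list, if both exist.
-- If the outer safeHead fails, the inner one is never run.
firstOfFirst :: [[a]] -> Maybe a
firstOfFirst xss = safeHead xss `chainMaybes` safeHead
```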
What do these types have in common?


   Chaining:
   chainMaybes :: Maybe   a -> (a -> Maybe   b)
               -> Maybe   b
   chainStates :: State s a -> (a -> State s b)
               -> State s b
   Injection of a pure value:
   pureState :: a -> State s a
   pureMaybe :: a -> Maybe   a

       Abstract away the type constructors, and these have identical
       types!
Monads



  More type-related pattern capture, courtesy of typeclasses:
  class Monad m where
      -- chain
      (>>=) :: m a -> (a -> m b) -> m b

      -- inject a pure value
      return :: a -> m a
Instances



   When a type is an instance of a typeclass, it supplies particular
   implementations of the typeclass’s functions:
   instance Monad Maybe where
       (>>=) = chainMaybes
       return = pureMaybe

   instance Monad (State s) where
       (>>=) = chainStates
       return = pureState
Chaining with monads

   Using the methods of the Monad typeclass:
   parseThreeDigits =
     parseDigit >>= \a ->
     parseDigit >>= \b ->
     parseDigit >>= \c ->
     return (a, b, c)
   Syntactically sugared with do-notation:
   parseThreeDigits = do
      a <- parseDigit
      b <- parseDigit
      c <- parseDigit
      return (a, b, c)
   This now looks suspiciously like imperative code.
Haven’t we forgotten something?




   What happens if we want to parse a digit out of a string that
   doesn’t contain any?
       We’d like to “break the chain” if a parse fails.
       We have this nice Maybe type for representing failure.
   Alas, we can’t combine the Maybe monad with the State monad.
       Different monads do not combine.
But this is awful! Don’t we need lots of boilerplate?




   Are we condemned to a world of numerous slightly tweaked custom
   monads?
   We can adapt the behaviour of an underlying monad.
   newtype MaybeT m a = MaybeT {
         runMaybeT :: m (Maybe a)
       }
Can we inject a pure value?




   pureMaybeT :: (Monad m) => a -> MaybeT m a
   pureMaybeT a = MaybeT (return (Just a))
Can we write a chaining function?



   chainMaybeTs :: (Monad m) => MaybeT m a -> (a -> Ma
                -> MaybeT m b

   x `chainMaybeTs` f = MaybeT $ do
       unwrapped <- runMaybeT x
       case unwrapped of
         Nothing -> return Nothing
         Just y  -> runMaybeT (f y)
Making a Monad instance




  Given an underlying monad, we can stack a MaybeT on top of it
  and get a new monad.
   instance (Monad m) => Monad (MaybeT m) where
       (>>=) = chainMaybeTs
       return = pureMaybeT
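
   A sketch of the whole stack in one file, assuming a recent GHC. In
   current versions Functor and Applicative are superclasses of Monad,
   so two extra instances are needed that the 2008 code did not
   require. The names liftMaybe and stops are hypothetical.

```haskell
newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

pureMaybeT :: Monad m => a -> MaybeT m a
pureMaybeT a = MaybeT (return (Just a))

chainMaybeTs :: Monad m => MaybeT m a -> (a -> MaybeT m b) -> MaybeT m b
x `chainMaybeTs` f = MaybeT $ do
  unwrapped <- runMaybeT x
  case unwrapped of
    Nothing -> return Nothing
    Just y  -> runMaybeT (f y)

-- Required by recent GHC versions; both are built from the two functions above.
instance Monad m => Functor (MaybeT m) where
  fmap f x = x `chainMaybeTs` (pureMaybeT . f)

instance Monad m => Applicative (MaybeT m) where
  pure = pureMaybeT
  mf <*> mx = mf `chainMaybeTs` \f -> mx `chainMaybeTs` (pureMaybeT . f)

-- return defaults to pure, so only (>>=) is given here.
instance Monad m => Monad (MaybeT m) where
  (>>=) = chainMaybeTs

-- Hypothetical helper: wrap a plain Maybe in the transformer.
liftMaybe :: Monad m => Maybe a -> MaybeT m a
liftMaybe = MaybeT . return

-- MaybeT stacked on IO: the Nothing stops the chain before the addition.
stops :: MaybeT IO Int
stops = do
  a <- liftMaybe (Just 1)
  b <- liftMaybe Nothing
  return (a + b)
```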
A custom monad in 2 lines of code


   A parsing type that can short-circuit:
   {-# LANGUAGE GeneralizedNewtypeDeriving #-}

   newtype MyParser a = MyP (MaybeT (State String) a)
     deriving (Monad, MonadState String)
   We use a GHC extension to automatically generate instances of
   non-H98 typeclasses:
        Monad
        MonadState String
What is MonadState?


  The State monad is parameterised over its underlying state, as
  State s:
       It knows nothing about the state, and cannot manipulate it.
  Instead, it implements an interface that lets us query and modify
  the state ourselves:
   class (Monad m) => MonadState s m where
       -- query the current state
       get :: m s

       -- replace the state with a new one
       put :: s -> m ()
Parsing text


   In essence:
       Get the current state, modify it, put the new state back.
   What do we do on failure?
   string :: String -> MyParser ()
   string str = do
       s <- get
       let (hd, tl) = splitAt (length str) s
       if str == hd
           then put tl
           else fail $ "failed to match " ++ show str
Shipment of fail




   We’ve carefully hidden fail so far. Why?
       Many monads have a very bad definition: error.
   What’s the problem with error?
       It throws an exception that we can’t catch in pure code.
       It’s only safe to use in catastrophic cases.
Non-catastrophic failure



   A bread-and-butter activity in parsing is lookahead:
       Inspect the input stream and see what to do next
   JSON example:
       An object begins with “{”
       An array begins with “[”
   We look at the next input token to figure out what to do.
       If we fail to match “{”, it’s not an error.
       We just try “[” instead.
Giving ourselves alternatives




   We have two conflicting goals:
       We like to keep our implementation options open.
       Whether fail crashes depends on the underlying monad.
   We need a safer, abstract way to fail.
MonadPlus



  A typeclass with two methods:
  class Monad m => MonadPlus m where
      -- non-fatal failure
      mzero :: m a

      -- if the first action fails,
      -- perform the second instead
      mplus :: m a -> m a -> m a
  To upgrade our code, we replace our use of fail with mzero.
Writing a MonadPlus instance

   We can easily make any stack of MaybeT atop another monad a
   MonadPlus:
   instance Monad m => MonadPlus (MaybeT m) where
       mzero = MaybeT $ return Nothing

       a `mplus` b = MaybeT $ do
         result <- runMaybeT a
         case result of
             Just k  -> return (Just k)
             Nothing -> runMaybeT b
   We simply add MonadPlus to the list of typeclasses we ask GHC
   to automatically derive for us.
Using MonadPlus

  Given functions that know how to parse bits of JSON:
  parseObject :: MyParser [(String, JSValue)]
  parseArray :: MyParser [JSValue]
  We can turn them into a coherent whole:
  parseJSON :: MyParser JSValue
  parseJSON =
       (parseObject >>= \o -> return (JSObject o))
    `mplus`
       (parseArray >>= \a -> return (JSArray a))
    `mplus`
       ...
The problem of boilerplate




   Here’s a repeated pattern from our parser:
   foo >>= \x -> return (bar x)
   These brief uses of variables, >>=, and return are redundant and
   burdensome.
   In fact, this pattern of applying a pure function to a monadic result
   is ubiquitous.
Boilerplate removal via lifting


   We replace this boilerplate with liftM:
   liftM :: Monad m => (a -> b) -> m a -> m b
   We refer to this as lifting a pure function into the monad.
   parseJSON =
        (JSObject `liftM` parseObject)
     `mplus`
        (JSArray `liftM` parseArray)
   This style of programming looks less imperative, and more
   applicative.
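
   A small sketch of the same rewrite in the Maybe monad, where it is
   easy to try in ghci. The names verbose and lifted are hypothetical.

```haskell
import Control.Monad (liftM)

-- The explicit pattern: bind the result, apply a pure function, return.
verbose :: Maybe Int -> Maybe Int
verbose m = m >>= \x -> return (x + 1)

-- The same computation, with the pure function lifted into the monad.
lifted :: Maybe Int -> Maybe Int
lifted m = (+ 1) `liftM` m
```

   The two functions agree on every input, including Nothing.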
The Parsec library




   Our motivation so far:
       Show you that it’s really easy to build a monadic parsing
       library
   But we must concede:
       Maybe you simply want to parse stuff
   Instead of rolling your own, use Daan Leijen’s Parsec library.
What to expect from Parsec



   It has some great advantages:
        A complete, concise EDSL for building parsers
        Easy to learn
        Produces useful error messages
   But it’s not perfect:
        Strict, so cannot parse huge streams incrementally
        Based on String, hence slow
        Accepts, and chokes on, left-recursive grammars
Parsing a JSON string



   An example of Parsec’s concision:
   jsonString = between (char '\"') (char '\"')
                        (many jsonChar)
   Some parsing combinators explained:
       between matches its 1st argument, then its 3rd, then its 2nd
       many runs a parser until it fails
       It returns a list of parse results
Parsing a character within a string


   jsonChar = char '\\' >> (p_esc <|> p_uni)
          <|> satisfy (`notElem` "\"\\")
   Between quotes, jsonChar matches a string’s body:
       A backslash must be followed by an escape (“\n”) or Unicode
       (“\u2fbe”)
       Any other character except “\” or “"” is okay
   More combinator notes:
       The >> combinator is like >>=, but provides only
       sequencing, not binding
       The satisfy combinator uses a pure predicate.
Your turn!


   Write a parser for numbers. Here are some pieces you’ll need:
   import Numeric ( readFloat , readSigned )
   import Text.ParserCombinators.Parsec
   import Control.Monad (mzero)
   Other functions you’ll need:
        getInput
        setInput
   The type of your parser should look like this:
   parseNumber :: CharParser () Rational
Experimenting with your parser




   Simply load your code into ghci, and start playing:

   Prelude> :load MyParser
   *Main> parseTest parseNumber "3.14159"
My number parser




   parseNumber = do
     s <- getInput
     case readSigned readFloat s of
        [(n, s')] -> setInput s' >> return n
        _         -> mzero
    <?> "number"
Using JSON in Haskell




   A good JSON package is already available from Hackage:
       http://tinyurl.com/hs-json
       The module is named Text.JSON
       Doesn’t use overlapping instances
Part 3




   This was going to be a concurrent web application, but I ran out
   of time.
         It’s still going to be informative and fun!
Concurrent programming




  The dominant programming model:
      Shared-state threads
      Locks for synchronization
      Condition variables for notification
The prehistory of threads




   Invented independently at least 3 times, circa 1965:
       Dijkstra
       Berkeley Timesharing System
       PL/I’s CALL XXX (A, B) TASK;
   Alas, the model has barely changed in almost half a century.
What does threading involve?




   Threads are a simple extension to sequential programming.
   All that we lose are the following:
       Understandability,
       Predictability, and
       Correctness
Concurrent Haskell



       Introduced in 1996, inspired by Id.
       Provides a forkIO action to create threads.
   The MVar type is the communication primitive:
       Atomically modifiable single-slot container
       Provides get and put operations (takeMVar and putMVar)
       An empty MVar blocks on get
       A full MVar blocks on put
   We can use MVars to build locks, semaphores, etc.
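As a rough illustration of the points above, here is a sketch in which an MVar serves as both a shared counter and its own lock. The names `bump` and `countWith` are hypothetical; `takeMVar`, `putMVar`, and `forkIO` are the standard operations from `Control.Concurrent`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

-- Hypothetical demo: several threads bump one shared counter.
-- takeMVar empties the MVar, so any other taker blocks until
-- putMVar refills it.
bump :: MVar Int -> IO ()
bump counter = do
  n <- takeMVar counter
  putMVar counter $! n + 1

-- Fork the given number of workers, wait for all of them, read the total.
countWith :: Int -> Int -> IO Int
countWith workers bumps = do
  counter <- newMVar 0
  done    <- newEmptyMVar
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    replicateM_ bumps (bump counter)
    putMVar done ()                    -- blocks while done is still full
  replicateM_ workers (takeMVar done)  -- blocks while done is empty
  readMVar counter
```

This version is not exception-safe: if a thread died between the take and the put, the counter would stay empty forever. The library's `modifyMVar_` exists to cover that case.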
What’s wrong with MVars?



  MVars are no safer than the concurrency primitives of other
  languages.
      Deadlocks
      Data corruption
      Race conditions
  Higher order programming and phantom typing can help, but only
  a little.
The fundamental problem




   Given two correct concurrent program fragments:
       We cannot compose another correct concurrent fragment
       from them without great care.
Message passing is no panacea




   It brings its own difficulties:
        The programming model is demanding.
        Deadlock avoidance is hard.
        Debugging is really tough.
        Don’t forget coherence, scaling, atomicity, ...
Lock-free data structures




   A focus of much research in the 1990s.
       Modus operandi: find a new lock-free algorithm, earn a PhD.
       Tremendously difficult to get the code right.
       Neither a scalable nor a sustainable approach!
   This inspired research into hardware support, followed by:
       Software transactional memory
Software transactional memory




   The model is loosely similar to database programming:
       Start a transaction.
       Do lots of work.
       Either all changes succeed atomically...
       ...Or they all abort, again atomically.
   An aborted transaction is usually restarted.
The perils of STM



   STM code needs to be careful:
        Transactional code must not perform non-transactional
        actions.
        On abort-and-restart, there’s no way to roll back
        dropNukes()!
   In traditional languages, this is unenforceable.
        Programmers can innocently cause serious, hard-to-find bugs.
   Some hacks exist to help, e.g. tm_callable annotations.
STM in Haskell




   In Haskell, the type system solves this problem for us.
       Recall that I/O actions have IO in their type signatures.
       STM actions have STM in their type signatures, but not IO.
       The type system statically prevents STM code from
       performing non-transactional actions!
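A minimal sketch of what this looks like in practice. The `deposit` and `depositIO` names are invented for the example; the commented-out line marks where the type checker would step in.

```haskell
import Control.Concurrent.STM

-- Hypothetical example: the STM in the signature says this action may
-- only touch transactional state.
deposit :: TVar Int -> Int -> STM ()
deposit account amount = do
  balance <- readTVar account
  writeTVar account (balance + amount)
  -- putStrLn "deposited"   -- would not compile: IO () is not STM ()

-- Only at the boundary does the transaction become an I/O action.
depositIO :: TVar Int -> Int -> IO Int
depositIO account amount = do
  atomically (deposit account amount)
  readTVarIO account
```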
Firing up a transaction




   As usual, we can explore APIs in ghci.
   The atomically action launches a transaction:

   Prelude> :m +Control.Concurrent.STM

   Prelude Control.Concurrent.STM> :type atomically
   atomically :: STM a -> IO a
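To make the signature concrete, here is a small sketch of a transaction that touches two variables at once. `swapTVars` and `swapDemo` are hypothetical names; no other thread can observe the state halfway through the swap.

```haskell
import Control.Concurrent.STM

-- Hypothetical helper: exchange the contents of two variables.
swapTVars :: TVar a -> TVar a -> STM ()
swapTVars x y = do
  a <- readTVar x
  b <- readTVar y
  writeTVar x b
  writeTVar y a

swapDemo :: IO (Int, Int)
swapDemo = do
  x <- newTVarIO 1
  y <- newTVarIO 2
  atomically (swapTVars x y)   -- STM () becomes IO () here
  atomically ((,) <$> readTVar x <*> readTVar y)
```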
Let’s build a game—World of Haskellcraft



   Our players love to have possessions.
   data Item = Scroll | Wand | Banjo
               deriving (Eq, Ord, Show)

   -- inventory
   data Inv = Inv {
         invItems    :: [Item],
         invCapacity :: Int
       } deriving (Eq, Ord, Show)
Inventory manipulation


   Here’s how we set up mutable player inventory:
   import Control.Concurrent.STM

   type Inventory = TVar Inv

   newInventory :: Int -> IO Inventory
   newInventory cap =
       newTVarIO Inv { invItems = [],
                       invCapacity = cap }
   The use of curly braces is called record syntax.
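Record syntax gives three things, sketched below on the tutorial's own `Inv` type: construction with named fields, accessor functions, and update. The helper names `emptyInv`, `itemCount`, and `withItem` are made up for the example.

```haskell
data Item = Scroll | Wand | Banjo
            deriving (Eq, Ord, Show)

data Inv = Inv {
      invItems    :: [Item],
      invCapacity :: Int
    } deriving (Eq, Ord, Show)

-- Construction with named fields (their order does not matter).
emptyInv :: Int -> Inv
emptyInv cap = Inv { invCapacity = cap, invItems = [] }

-- Each field name is also an accessor function.
itemCount :: Inv -> Int
itemCount inv = length (invItems inv)

-- Record update: builds a new Inv that differs only in invItems.
-- The original value is left untouched.
withItem :: Item -> Inv -> Inv
withItem item inv = inv { invItems = item : invItems inv }
```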
Inventory manipulation


   Here’s how we can add an item to a player’s inventory:
   addItem :: Item -> Inventory -> STM ()

   addItem item inv = do
       i <- readTVar inv
       writeTVar inv i {
         invItems = item : invItems i
       }
   But wait a second:
       What about an inventory’s capacity?
       We don’t want our players to have infinitely deep pockets!
Checking capacity

   GHC defines a retry action that will abort and restart a
   transaction if it cannot succeed:
   -- needs: import Control.Monad (when)
   isFull :: Inv -> Bool
   isFull (Inv items cap) = length items == cap

   addItem item inv = do
       i <- readTVar inv
       when (isFull i)
           retry
       writeTVar inv i {
         invItems = item : invItems i
       }
Let’s try it out




   Save the code in a file, and fire up ghci:

   *Main> i <- newInventory 3
   *Main> atomically (addItem Wand i)
   *Main> atomically (readTVar i)
   Inv {invItems = [Wand], invCapacity = 3}

   What happens if you repeat the addItem a few more times?
How does retry work?


   In principle, all the runtime has to do is retry the transaction
   immediately, and spin tightly until it succeeds.
        This might be correct, but it’s wasteful.
   What happens instead?
        The RTS tracks each mutable variable touched during a
        transaction.
        On retry, it blocks the transaction until at least one of those
        variables is modified.
   We haven’t told GHC what variables to wait on: it does this
   automatically!
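The blocking behaviour can be seen in a small sketch. It uses a plain counter instead of the inventory type to stay short; `takeToken` and `demo` are hypothetical names. A waiting thread that calls `retry` is woken when another thread writes to the variable it read.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Concurrent.STM
import Control.Monad (when)

-- Take one token, retrying while none are available.
takeToken :: TVar Int -> STM ()
takeToken tv = do
  n <- readTVar tv
  when (n <= 0) retry
  writeTVar tv (n - 1)

demo :: IO Int
demo = do
  tokens <- newTVarIO 0
  done   <- newEmptyMVar
  _ <- forkIO $ do
    atomically (takeToken tokens)  -- blocks if tokens is still 0
    putMVar done ()
  atomically (writeTVar tokens 1)  -- touches the variable the waiter read
  takeMVar done                    -- returns once the waiter has committed
  readTVarIO tokens
```

Nothing in `takeToken` names the variable to wait on. The runtime works that out from the reads the transaction performed before calling `retry`.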
Your turn!




   Write a function that removes an item from a player’s inventory:
   removeItem :: Item -> Inventory -> STM ()
My item removal action




   removeItem item inv = do
       i <- readTVar inv
       case break (==item) (invItems i) of
         (_, [])    -> retry
         (h, (_:t)) -> writeTVar inv i {
                         invItems = h ++ t
                       }
Your turn again!




   Write an action that lets us give an item from one player to
   another:
   giveItem :: Item -> Inventory -> Inventory
            -> STM ()
My solution




   giveItem item a b = do
       removeItem item a
       addItem item b
What about that blocking?




   If we’re writing a game, we don’t want to block forever if a player’s
   inventory is full or empty.
       We’d like to say “you can’t do that right now”.
One approach to immediate failure


   Let’s call this the C programmer’s approach:
   addItem1 :: Item -> TVar Inv -> STM Bool
   addItem1 item inv = do
       i <- readTVar inv
       if isFull i
           then return False
           else do
               writeTVar inv i {
                 invItems = item : invItems i
               }
               return True
What is the cost of this approach?




   If we have to check our results everywhere:
       The need for checking will spread
       Sadness will ensue
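To see how the checking spreads, consider what giving an item might look like in this style. This is only a sketch: `removeItem1` and `giveItem1` are hypothetical Bool-returning counterparts that do not appear in the slides.

```haskell
import Control.Concurrent.STM

data Item = Scroll | Wand | Banjo deriving (Eq, Ord, Show)
data Inv = Inv { invItems :: [Item], invCapacity :: Int }
           deriving (Eq, Ord, Show)

isFull :: Inv -> Bool
isFull (Inv items cap) = length items == cap

addItem1 :: Item -> TVar Inv -> STM Bool
addItem1 item inv = do
  i <- readTVar inv
  if isFull i
    then return False
    else do
      writeTVar inv i { invItems = item : invItems i }
      return True

-- Hypothetical Bool-returning counterpart of removeItem.
removeItem1 :: Item -> TVar Inv -> STM Bool
removeItem1 item inv = do
  i <- readTVar inv
  case break (== item) (invItems i) of
    (_, [])    -> return False
    (h, _ : t) -> do
      writeTVar inv i { invItems = h ++ t }
      return True

-- Every caller must now inspect results and undo partial work by hand.
giveItem1 :: Item -> TVar Inv -> TVar Inv -> STM Bool
giveItem1 item a b = do
  removed <- removeItem1 item a
  if not removed
    then return False
    else do
      added <- addItem1 item b
      if added
        then return True
        else do
          _ <- addItem1 item a   -- put the item back at the front of a
          return False
```

Three lines of real work are buried under two levels of result checking and a manual rollback.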
The Haskeller’s first loves




   We have some fondly held principles:
       Abstraction
       Composability
       Higher-order programming
   How can we apply these here?
A more abstract approach




   It turns out that the STM monad is a MonadPlus instance:
   immediately :: STM a -> STM (Maybe a)
   immediately act =
      (Just `liftM` act) `mplus` return Nothing
What does mplus do in STM?




  This combinator is defined as orElse :
   orElse :: STM a -> STM a -> STM a
  Given two transactions j and k:
      If transaction j must abort, perform transaction k instead.
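A short sketch of `orElse` on its own, using made-up account names rather than the game's inventories. If the left transaction calls `retry`, its effects are discarded and the right one runs in its place.

```haskell
import Control.Concurrent.STM
import Control.Monad (when)

-- Hypothetical example: withdraw retries on insufficient funds.
withdraw :: TVar Int -> Int -> STM ()
withdraw account amount = do
  balance <- readTVar account
  when (balance < amount) retry
  writeTVar account (balance - amount)

-- Try the first account; if that would retry, fall back to the second.
withdrawEither :: TVar Int -> TVar Int -> Int -> STM ()
withdrawEither a b amount =
  withdraw a amount `orElse` withdraw b amount
```

If both alternatives retry, the combined transaction retries as a whole and waits on the variables read by either branch.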
A complicated specification




   We now have all the pieces we need to:
       Atomically give an item from one player to another.
       Fail immediately if the giver does not have it, or the recipient
       cannot accept it.
       Convert the result to a Bool.
Compositionality for the win



   Here’s how we glue the whole lot together:
   import Data.Maybe (isJust)

   giveItemNow :: Item -> Inventory -> Inventory
               -> IO Bool
   giveItemNow item a b =
      liftM isJust . atomically . immediately $
        removeItem item a >> addItem item b
   Even better, we can do all of this as nearly a one-liner!
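For anyone who wants to try this, the fragments from the preceding slides can be assembled into one loadable module roughly as follows. The `Control.Monad` import list is an assumption, since the slides do not show where `liftM`, `mplus`, and `when` come from.

```haskell
import Control.Concurrent.STM
import Control.Monad (liftM, mplus, when)
import Data.Maybe (isJust)

data Item = Scroll | Wand | Banjo deriving (Eq, Ord, Show)

data Inv = Inv { invItems :: [Item], invCapacity :: Int }
           deriving (Eq, Ord, Show)

type Inventory = TVar Inv

newInventory :: Int -> IO Inventory
newInventory cap = newTVarIO Inv { invItems = [], invCapacity = cap }

isFull :: Inv -> Bool
isFull (Inv items cap) = length items == cap

-- Blocks (via retry) while the inventory is full.
addItem :: Item -> Inventory -> STM ()
addItem item inv = do
  i <- readTVar inv
  when (isFull i) retry
  writeTVar inv i { invItems = item : invItems i }

-- Blocks (via retry) while the item is absent.
removeItem :: Item -> Inventory -> STM ()
removeItem item inv = do
  i <- readTVar inv
  case break (== item) (invItems i) of
    (_, [])    -> retry
    (h, _ : t) -> writeTVar inv i { invItems = h ++ t }

-- Turn "would block" into Nothing.
immediately :: STM a -> STM (Maybe a)
immediately act = (Just `liftM` act) `mplus` return Nothing

giveItemNow :: Item -> Inventory -> Inventory -> IO Bool
giveItemNow item a b =
  liftM isJust . atomically . immediately $
    removeItem item a >> addItem item b
```

When the recipient is full, the removal that already happened inside the left branch is discarded along with everything else in that branch, so the giver keeps the item without any hand-written undo.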
Thank you!




  I hope you found this tutorial useful!
  Slide source available:
      http://tinyurl.com/defun08

DEFUN 2008 - Real World Haskell

  • 1. Real World Haskell Bryan O’Sullivan bos@serpentine.com 2008-09-27
  • 2. Welcome! A few things to expect about this tutorial: The pace will be rapid Stop me and ask questions—early and often I assume no prior Haskell exposure
  • 3. A little bit about Haskell Haskell is a multi-paradigm language. It chooses some unusual, but principled, defaults: Pure functions Non-strict evaluation Immutable data Static, strong typing Why default to these behaviours? We want our code to be safe, modular, and tractable.
  • 4. Pure functions Definition The result of a pure function depends only on its visible inputs: Given identical inputs, it always computes the same result. It has no other observable effects. What are some consequences of this? Modularity leads to simplified reasoning about behaviour. Straightforward testing: no need for elaborate frameworks.
  • 5. Immutable data Definition Data is immutable (or purely functional) if it is never modified after construction. To “modify” a value, we create a new value. Both new and old versions can coexist afterwards, so we get persistent, versioned data for free. Modification is often easier than with mutable data. In multithreaded code, we do away with much elaborate locking.
  • 6. Static, strong typing Definition A program is statically typed if we know the type of every expression before the program is run. Definition Code is strongly typed if the absence of certain classes of error can be proven statically.
  • 7. Safety, modularity, and tractability Safety: As few nasty surprises at runtime as possible. Static typing and eased testing give us confidence. Modularity: We can build big pieces of code from smaller components. No need to focus on the details of the smaller parts. Tractability: All of this fits in our brain comfortably... ...leaving plenty of room for the application we care about.
  • 8. GHC, the Glorious Glasgow Haskell Compiler Have you got GHC yet? Download installer for Windows, OS X, or Linux here: http://www.haskell.org/ghc/download_ghc_683.html
  • 9. What’s special about GHC? Mature, portable, optimising compiler Great tools: interactive shell and debugger time and space profilers code coverage analyser BSD-licensed, hence suitable for OSS and commercial use
  • 10. Counting lines The classic Unix wc command counts the lines in some files: $ time wc -l *.fasta 9975 1000-Rn_EST.fasta 14032 chr18.fasta 14005 chr19.fasta 13980 chr20.fasta 42017 chr_all.fasta 94009 total real 0m0.017s
  • 11. Breaking the problem down Subproblems to consider: Get our command line arguments Read a file Split it into lines Count the lines Let’s work through these in reverse order.
  • 12. Type signatures Definition A type signature describes the type of a Haskell expression: e :: Double We read :: as “left has the type right”. So “e has the type Double”. Here’s the accompanying definition: e = 2.7182818
  • 13. Type signatures are optional In Haskell, most type signatures are optional. The compiler can automatically infer types based on our usage. Why write type signatures at all, then? Mostly as useful documentation to ourselves.
  • 14. GHC’s interactive interpreter GHC includes an interactive expression evaluator, ghci. Run it from a terminal window or command prompt: $ ghci GHCi, version 6.8.3: http://www.haskell.org/ghc/ :? for help Loading package base ... linking ... done. Prelude> The Prelude> text is ghci’s prompt. Type :? at the prompt to get (terse) help.
  • 15. Basic interaction Let’s enter some expressions: Prelude> 2 + 2 4 Prelude> True && False False We can find out about types: Prelude> :type True True :: Bool
  • 16. Writing a list Here’s an empty list: Prelude> [] [] What do we need to create a longer list? A value An existing list Some glue—the : operator Prelude> 1:[] [1] Prelude> 1:2:[] [1,2]
  • 17. Syntactic sugar for lists What’s the difference between these? 1:2:[] [1,2] Nothing—the latter is purely a notational convenience.
  • 18. Characters and strings One character: Prelude> :type 'a' 'a' :: Char A string is a list of characters: Prelude> 'a' : 'b' : [] "ab" Notation: Single quotes for one Char Double quotes for a string (written [Char])
  • 19. Function application We apply a function to its arguments by juxtaposition: Prelude> length [2,4,6] 3 Prelude> take 2 [3,6,9,12] [3,6] Why refer to this as application, instead of the more familiar calling? Haskell is a non-strict language The result may not be computed immediately
  • 20. Lists are inductive Haskell lists are defined inductively. A list can be one of two things: An empty list A value in front of an existing list We call our friends [] and : value constructors: They construct values that have the type “list of something.”
  • 21. Counting lines Haskell programmers love abstraction. We won’t worry about counting lines. Instead, we’ll count the elements in any kind of list.
  • 22. The type signature of a function How do we describe a function that computes the length of a list? len :: [a] -> Integer The -> notation denotes a function. The function accepts an [a], and returns an Integer. What’s an [a]? A list, whose elements must all be of some type a.
  • 23. Counting by induction: the base case An empty list has the length zero. len [] = 0 This is our first example of pattern matching. Our function accepts one argument. If the argument is an empty list, we return zero. We call this the base case.
  • 24. Counting by induction: the inductive case Let’s see if a list value was created using the : constructor. len (x:xs) = 1 + len xs If the pattern match succeeds: The name x is bound to the head of the list. The name xs is bound to the tail of the list. The body of the definition is used as the result.
  • 25. The complete function Save this in a file named Length.hs:
        len :: [a] -> Integer
        len [] = 0
        len (x:xs) = 1 + len xs
  • 26. Load the file into ghci In the same directory, run ghci: Prelude> :load Length [1 of 1] Compiling Main ( Length.hs, interprete Ok, modules loaded: Main. *Main> The ghci prompt changes when we load files. Let’s try out our function: *Main> len [] 0 *Main> len (1:[]) 1 *Main> len [4,5,6] 3
  • 27. Generating a list from a list How might we double every other element of a list?
        double (a:b:cs) = a : b * 2 : double cs
        double cs = cs
      Save this in a file named Double.hs. Load the file into ghci. Try the following expressions: [1..10] double [1..10]
  • 28. Your turn: axpy The classic Linpack function axpy computes a × xi + yi over a scalar a and each element i of two vectors x and y . Define it over two lists of numbers in Haskell. How do we handle lists of different lengths?
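One possible answer to the axpy exercise, offered as a sketch rather than the tutorial's own solution. It leans on the Prelude's `zipWith`, which stops at the end of the shorter list; truncating like this is an assumption about how to treat lists of different lengths, not something the exercise prescribes.

```haskell
-- axpy: compute a * x_i + y_i for each pair of elements.
-- zipWith stops when either list runs out, so the result is as long
-- as the shorter input (a design choice made for this sketch).
axpy :: Num a => a -> [a] -> [a] -> [a]
axpy a xs ys = zipWith (\x y -> a * x + y) xs ys
```

In ghci, `axpy 2 [1,2,3] [10,20]` pairs up only the first two elements of each list.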
  • 29. Splitting text on line boundaries Haskell provides a large library of built-in functions, the Prelude. Here’s the Prelude’s function for splitting text by lines: lines :: String -> [String] The type String is a synonym for [Char]. A ghci experiment: *Main> lines "foo\nbar\n" ["foo","bar"] *Main> len (lines "foo\nbar\n") 2
  • 30. Reading a file To read a file, we use the Prelude’s readFile function: *Main> :type readFile readFile :: FilePath -> IO String What’s this signature mean? The FilePath type is just a synonym for String. The type IO String means here be dragons! A signature that ends in IO something can have externally visible side effects. Here, the side effect is “read the contents of a file”.
  • 31. Side effects That innocuous IO in the type is a big deal. We can tell by its type signature whether a value might have externally visible effects. If a type does not include IO, it cannot: Read files Make network connections Launch torpedoes The ideal is for most code to not have an IO type.
  • 32. Counting lines in a file If we invoke code that has side effects, our code must by implication have side effects too.
        countLines :: FilePath -> IO Integer
        countLines path = do
          contents <- readFile path
          return (len (lines contents))
      We had to add IO to our type here because we use readFile, which has side effects. Add this code to Length.hs.
  • 33. A few explanations The <- notation means “perform the action on the right, and assign the result to the name on the left.” name <- action The return function takes a pure value, and (here) adds IO to its type.
  • 34. Command line arguments We use getArgs to obtain command line arguments.
        import System.Environment (getArgs)
        main = do
          args <- getArgs
          putStrLn ("hello, args are " ++ show args)
      What’s new here? The import directive imports the name getArgs from the System.Environment module. The ++ operator concatenates two lists.
  • 35. Pattern matching in an expression We use case to pattern match inside an expression.
        -- Does list contain two or more elements?
        atLeastTwo myList = case myList of
          (a:b:cs) -> True
          _ -> False
      The expression between case and of is matched in turn against each pattern, until one matches.
  • 36. Irrefutable and wild card patterns A pattern usually matches against a value’s constructors. In other words, it inspects the structure of the value. A simple pattern, e.g. a plain name like a, contains no constructors. It thus matches any value. Definition A pattern that always matches any value is called irrefutable. The special wild card pattern _ is irrefutable, but does not bind a value to a name.
  • 37. Tuples A tuple is a fixed-size collection of values. Items in a tuple can have different types. Example: (True,"foo") This has the type (Bool,String) Contrast tuples with lists, to see why we’d want both: A list is a variable-sized collection of values. Each value in a list must have the same type. Example: [True, False]
  • 38. The zip function What does the zip function do? Adventures in function discovery, courtesy of ghci: Start by inspecting its type, using :type. Try it with one set of inputs. Then try with another.
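The discovery steps above can be sketched as follows. In ghci, `:type zip` reports `zip :: [a] -> [b] -> [(a, b)]`; the two definitions below (the names `pairs` and `flags` are made up for illustration) show the two behaviours worth noticing.

```haskell
-- zip pairs elements positionally and stops at the shorter list,
-- so the third number here has no partner and is dropped.
pairs :: [(Int, String)]
pairs = zip [1, 2, 3] ["one", "two"]

-- The two input lists may have different element types, which is
-- why the result is a list of tuples rather than a list of lists.
flags :: [(Bool, Char)]
flags = zip [True, False] "ab"
```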
  • 39. Making our program runnable Add the following code to Length.hs:
        main = do
          -- Exercise: get the command line arguments
          lengths <- mapM countLines args
          mapM_ printLength (zip args lengths)
          case args of
            (_:_:_) -> printLength ("total", sum lengths)
            _ -> return ()
      Don’t forget to add an import directive at the beginning!
  • 40. The mapM function This function applies an action to a list of arguments in turn, and returns the list of results. The mapM_ function is similar, but returns the value (), aka unit (“nothing”). The mapM_ function is useful for the effects it causes, e.g. printing every element of a list.
  • 41. Write your own printLength function Hint: we’ve seen a similar example already, with our getArgs example.
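A possible printLength, as a sketch only. It assumes the argument is the (name, count) pair produced by `zip args lengths`, and that the output should put the name before the count. The helper `formatLength` is a hypothetical name introduced here so the formatting can be checked without doing any I/O.

```haskell
-- Pure part: turn a (file name, line count) pair into one output line.
-- formatLength is a made-up helper, not part of the tutorial.
formatLength :: (String, Integer) -> String
formatLength (name, count) = name ++ " " ++ show count

-- Effectful part: print the formatted line. The IO in the type is
-- there because putStrLn has an externally visible effect.
printLength :: (String, Integer) -> IO ()
printLength pair = putStrLn (formatLength pair)
```

Keeping the formatting pure follows the tutorial's stated ideal that most code should not have an IO type.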
  • 42. Compiling your program It’s easy to compile a program with GHC: $ ghc --make Length What does the compiler do? Looks for a source file named Length.hs. Compiles it to native code. Generates an executable named Length.
  • 43. Running our program Here’s an example from my laptop: $ time ./Length *.fasta 1000-Rn_EST.fasta 9975 chr18.fasta 14032 chr19.fasta 14005 chr20.fasta 13980 chr_all.fasta 42017 total 94009 real 0m1.533s Oh, no! Look at that performance! 90 times slower than wc
  • 44. Faster file processing Lists are wonderful to work with But they exact a huge performance toll The current best-of-breed alternative for file data: ByteString
  • 45. What is a ByteString? They come in two flavours: Strict: a single packed array of bytes Lazy: a list of 64KB strict chunks Each flavour provides a list-like API.
  • 46. Retooling our word count program All we do is add an import and change one function:
        import qualified Data.ByteString.Lazy.Char8 as B
        countLines path = do
          contents <- B.readFile path
          return (length (B.lines contents))
      The “B.” prefixes make us pick up the readFile and lines functions from the bytestring package.
  • 47. What happens to performance? Haskell lists: 1.533 seconds Lazy ByteString: 0.022 seconds wc command: 0.015 seconds Given the tiny data set size, C and Haskell are in a dead heat.
  • 48. When to use ByteStrings? Any time you deal with binary data For text, only if you’re sure it’s 8-bit clean For i18n needs, fast packed Unicode is under development. Great open source libraries that use ByteStrings: binary—parsing/generation of binary data zlib and bzlib—support for popular compression/decompression formats attoparsec—parse text-based files and network protocols
  • 50. A little bit about JSON A popular interchange format for structured data: simpler than XML, and widely supported. Basic types: Number String Boolean Null Derived types: Object: unordered name/value map Array: ordered collection of values
  • 51. JSON at work: Twitter’s search API From http://search.twitter.com/search.json?q=haskell: {"text": "Why Haskell? Easiest way to be productive", "to_user_id": null, "from_user": "galoisinc", "id": 936114469, "from_user_id": 1633746, "iso_language_code": "en", "created_at":"Fri, 26 Sep 2008 19:15:35 +0000"}
  • 52. JSON in Haskell
        data JSValue = JSNull
                     | JSBool !Bool
                     | JSRational !Rational
                     | JSString JSString
                     | JSArray [JSValue]
                     | JSObject (JSObject JSValue)
  • 53. What is a JSString? We hide the underlying use of a String:
        newtype JSString = JSONString { fromJSString :: S…
        toJSString :: String -> JSString
        toJSString = JSONString
      We do the same with JSON objects:
        newtype JSObject a = JSONObject { fromJSObject :: [(…
        toJSObject :: [(String, a)] -> JSObject a
        toJSObject = JSONObject
  • 54. JSON conversion In Haskell, we capture type-dependent patterns using typeclasses: The class of types whose values can be converted to and from JSON
        data Result a = Ok a | Error String
        class JSON a where
          readJSON :: JSValue -> Result a
          showJSON :: a -> JSValue
  • 55. Why JSString, JSObject, and JSArray? Haskell typeclasses give us an open world: We can declare a type to be an instance of a class at any time In fact, we cannot declare the number of instances to be fixed If we left the String type “naked”, what could happen? Someone might declare Char to be an instance of JSON What if someone declared a JSON a => JSON [a] instance? This is the overlapping instances problem.
  • 56. Relaxing the overlapping instances restriction By default, GHC is conservative: It rejects overlapping instances outright We can get it to loosen up a bit via a pragma: {-# LANGUAGE OverlappingInstances #-} If it finds one most specific instance, it will use it, otherwise bail as before.
  • 57. Bool as JSON Here’s a simple way to declare the Bool type as an instance of the JSON class:
        instance JSON Bool where
          showJSON = JSBool
          readJSON (JSBool b) = Ok b
          readJSON _ = Error "Bool parse failed"
      This has a design problem: We’ve plumbed our Result type straight in If we want to change its implementation, it will be painful
  • 58. Hiding the plumbing A simple (but good enough!) approach to abstraction:
        success :: a -> Result a
        success k = Ok k
        failure :: String -> Result a
        failure errMsg = Error errMsg
      Functions like these are sometimes called “smart constructors”.
  • 59. Does this affect our code much? We simply replace the explicit constructors with the functions we just defined:
        instance JSON Bool where
          showJSON = JSBool
          readJSON (JSBool b) = success b
          readJSON _ = failure "Bool parse failed"
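To see the pieces from the last few slides working together, here is a self-contained sketch. It assumes a cut-down JSValue with only two constructors (the tutorial's type has more), adds `Eq` and `Show` deriving for convenience, and introduces a hypothetical `roundTrip` value to exercise the instance.

```haskell
-- A reduced JSValue, just enough to host the Bool instance.
data JSValue = JSNull | JSBool !Bool
  deriving (Eq, Show)

data Result a = Ok a | Error String
  deriving (Eq, Show)

-- Smart constructors: the instance below never mentions Ok or Error,
-- so Result's representation can change without touching it.
success :: a -> Result a
success = Ok

failure :: String -> Result a
failure = Error

class JSON a where
  readJSON :: JSValue -> Result a
  showJSON :: a -> JSValue

instance JSON Bool where
  showJSON = JSBool
  readJSON (JSBool b) = success b
  readJSON _          = failure "Bool parse failed"

-- Hypothetical check: convert a Bool to JSON and back again.
roundTrip :: Result Bool
roundTrip = readJSON (showJSON True)
```

Reading a value of the wrong shape, such as `readJSON JSNull :: Result Bool`, goes through `failure` instead of crashing.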
  • 60. JSON input and output We can now convert between normal Haskell values and our JSON representation. But... ...we still need to be able to transmit this stuff over the wire. Which is more fun to mull over? Parsing!
  • 61. A functional view of parsing Here’s a super-simple perspective: Take a piece of data (usually a sequence) Try to apply an interpretation to it How might we represent this?
  • 62. A basic type signature for parsing Take two type variables, i.e. placeholders for types that we’ll substitute later: s: the state (data) we want to parse a: the type of its interpretation We get this generic type signature: s -> a Let’s make the task more concrete: Parse a String as an Int String -> Int What’s missing?
  • 63. Parsing as state transformation After we’ve parsed one Int, we might have more data in our String that we want to parse. How to represent this? Return the transformed state and the result in a tuple. s -> (a, s) We accept an input state of type s, and return a transformed state, also of type s.
  • 64. Parsing is composable Let’s give integer parsing a name:
        parseDigit :: String -> (Int, String)
      How might we want to parse two digits?
        parseTwoDigits :: String -> ((Int, Int), String)
        parseTwoDigits s = let (i, t) = parseDigit s
                               (j, u) = parseDigit t
                           in ((i, j), u)
  • 65. Chaining parses more tidily It’s not good to represent the guts of our state explicitly using pairs: Tying ourselves to an implementation eliminates wiggle room. Here’s an alternative approach. newtype State s a = State { runState :: s -> (a, s) } A newtype declaration hides our implementation. It has no runtime cost. The runState function is a deconstructor: it exposes the underlying value.
  • 66. Chaining parses Given a function that produces a result and a new state, we can “chain up” another function that accepts its result.
        chainStates :: State s a -> (a -> State s b) -> Sta…
        chainStates m k = State chainFunc
          where chainFunc s = let (a, t) = runState m s
                              in runState (k a) t
      Notice that the result type is compatible with the input: We can chain uses of chainStates!
  • 67. Injecting a pure value We’ll often want to leave the current state untouched, but inject a normal value that we can use when chaining.
        pureState :: a -> State s a
        pureState a = State $ \s -> (a, s)
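The following sketch puts the State wrapper, chaining, and injection together and rebuilds the two-digit parser without any explicit tuple threading. The `parseDigit` here is a hypothetical stand-in, since the tutorial never shows its body; it is deliberately partial and calls `error` on bad input, which the later slides on failure improve on.

```haskell
import Data.Char (digitToInt, isDigit)

newtype State s a = State { runState :: s -> (a, s) }

-- Run m, feed its result to k, and run the resulting action on the
-- state that m left behind.
chainStates :: State s a -> (a -> State s b) -> State s b
chainStates m k = State chainFunc
  where chainFunc s = let (a, t) = runState m s
                      in runState (k a) t

-- Leave the state untouched and return the given value.
pureState :: a -> State s a
pureState a = State $ \s -> (a, s)

-- Hypothetical digit parser: consumes one leading digit.
-- Partial on purpose; it has no way to report failure yet.
parseDigit :: State String Int
parseDigit = State $ \s -> case s of
  (c:rest) | isDigit c -> (digitToInt c, rest)
  _ -> error "parseDigit: input does not start with a digit"

-- Two digits, with the state passed along by chainStates.
parseTwoDigits :: State String (Int, Int)
parseTwoDigits =
  parseDigit `chainStates` \i ->
  parseDigit `chainStates` \j ->
  pureState (i, j)
```

Running `runState parseTwoDigits "42rest"` yields the pair of digits together with the unconsumed input.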
  • 68. What about computations that might fail? Try these in ghci: Prelude> head [1,2,3] 1 Prelude> head [] What gets printed in the second case?
  • 69. One approach to potential failure The Prelude defines this handy standard type: data Maybe a = Just a | Nothing We can use it as follows:
        safeHead (x:_) = Just x
        safeHead [] = Nothing
      Save this in a source file, load it into ghci, and try it out.
  • 70. Some familiar operations We can chain Maybe values:
        chainMaybes :: Maybe a -> (a -> Maybe b) -> Maybe b
        chainMaybes Nothing k = Nothing
        chainMaybes (Just x) k = k x
      This gives us short circuiting if any computation in a chain fails: Maybe is the Ur-exception. We can also inject a pure value into a Maybe-typed computation:
        pureMaybe :: a -> Maybe a
        pureMaybe x = Just x
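A small sketch of the short-circuiting described above. The function `firstOfFirst` is a made-up example, not part of the tutorial: it takes the head of the head of a list of lists and fails cleanly if either step has nothing to return.

```haskell
safeHead :: [a] -> Maybe a
safeHead (x:_) = Just x
safeHead []    = Nothing

chainMaybes :: Maybe a -> (a -> Maybe b) -> Maybe b
chainMaybes Nothing  _ = Nothing
chainMaybes (Just x) k = k x

-- Hypothetical example: two safeHead steps chained together.
-- If the outer list is empty, the second step never runs.
firstOfFirst :: [[a]] -> Maybe a
firstOfFirst xss = safeHead xss `chainMaybes` safeHead
```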
  • 71. What do these types have in common? Chaining:
        chainMaybes :: Maybe a -> (a -> Maybe b) -> Maybe b
        chainStates :: State s a -> (a -> State s b) -> State s b
      Injection of a pure value:
        pureState :: a -> State s a
        pureMaybe :: a -> Maybe a
      Abstract away the type constructors, and these have identical types!
  • 72. Monads More type-related pattern capture, courtesy of typeclasses:
        class Monad m where
          -- chain
          (>>=) :: m a -> (a -> m b) -> m b
          -- inject a pure value
          return :: a -> m a
  • 73. Instances When a type is an instance of a typeclass, it supplies particular implementations of the typeclass’s functions:
        instance Monad Maybe where
          (>>=) = chainMaybes
          return = pureMaybe
        instance Monad (State s) where
          (>>=) = chainStates
          return = pureState
  • 74. Chaining with monads Using the methods of the Monad typeclass:
        parseThreeDigits =
          parseDigit >>= \a ->
          parseDigit >>= \b ->
          parseDigit >>= \c ->
          return (a, b, c)
      Syntactically sugared with do-notation:
        parseThreeDigits = do
          a <- parseDigit
          b <- parseDigit
          c <- parseDigit
          return (a, b, c)
      This now looks suspiciously like imperative code.
  • 75. Haven’t we forgotten something? What happens if we want to parse a digit out of a string that doesn’t contain any? We’d like to “break the chain” if a parse fails. We have this nice Maybe type for representing failure. Alas, we can’t combine the Maybe monad with the State monad. Different monads do not combine.
  • 76. But this is awful! Don’t we need lots of boilerplate? Are we condemned to a world of numerous slightly tweaked custom monads? We can adapt the behaviour of an underlying monad. newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }
  • 77. Can we inject a pure value?
        pureMaybeT :: (Monad m) => a -> MaybeT m a
        pureMaybeT a = MaybeT (return (Just a))
  • 78. Can we write a chaining function?
        chainMaybeTs :: (Monad m) => MaybeT m a -> (a -> Ma…
                     -> MaybeT m b
        x `chainMaybeTs` f = MaybeT $ do
          unwrapped <- runMaybeT x
          case unwrapped of
            Nothing -> return Nothing
            Just y -> runMaybeT (f y)
  • 79. Making a Monad instance Given an underlying monad, we can stack a MaybeT on top of it and get a new monad.
        instance (Monad m) => Monad (MaybeT m) where
          (>>=) = chainMaybeTs
          return = pureMaybeT
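The instance above targets the GHC of 2008. Since GHC 7.10, Applicative is a superclass of Monad, so a current compiler also expects Functor and Applicative instances. The sketch below is one way to fill those in; `safeDiv` and `example` are hypothetical names used only to show the short-circuiting.

```haskell
newtype MaybeT m a = MaybeT { runMaybeT :: m (Maybe a) }

-- Functor and Applicative are written in terms of the Monad instance
-- below; this is required by modern GHC, not by the original slides.
instance Monad m => Functor (MaybeT m) where
  fmap f x = x >>= (pure . f)

instance Monad m => Applicative (MaybeT m) where
  pure a = MaybeT (return (Just a))
  mf <*> mx = mf >>= \f -> mx >>= \x -> pure (f x)

instance Monad m => Monad (MaybeT m) where
  x >>= f = MaybeT $ do
    unwrapped <- runMaybeT x
    case unwrapped of
      Nothing -> return Nothing       -- stop: skip the rest of the chain
      Just y  -> runMaybeT (f y)

-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Monad m => Int -> Int -> MaybeT m Int
safeDiv _ 0 = MaybeT (return Nothing)
safeDiv a b = pure (a `div` b)

-- The second step fails, so the whole computation yields Nothing.
example :: Monad m => MaybeT m Int
example = do
  q <- safeDiv 100 5
  safeDiv q 0
```

Any monad can sit underneath; `runMaybeT example :: IO (Maybe Int)` works just as well as a pure underlying monad.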
  • 80. A custom monad in 2 lines of code A parsing type that can short-circuit:
        {-# LANGUAGE GeneralizedNewtypeDeriving #-}
        newtype MyParser a = MyP (MaybeT (State String) a)
          deriving (Monad, MonadState String)
      We use a GHC extension to automatically generate instances of non-H98 typeclasses: Monad MonadState String
  • 81. What is MonadState? The State monad is parameterised over its underlying state, as State s: It knows nothing about the state, and cannot manipulate it. Instead, it implements an interface that lets us query and modify the state ourselves:
        class (Monad m) => MonadState s m where
          -- query the current state
          get :: m s
          -- replace the state with a new one
          put :: s -> m ()
  • 82. Parsing text In essence: Get the current state, modify it, put the new state back. What do we do on failure?
        string :: String -> MyParser ()
        string str = do
          s <- get
          let (hd, tl) = splitAt (length str) s
          if str == hd
            then put tl
            else fail $ "failed to match " ++ show str
  • 83. Shipment of fail We’ve carefully hidden fail so far. Why? Many monads have a very bad definition: error. What’s the problem with error? It throws an exception that we can’t catch in pure code. It’s only safe to use in catastrophic cases.
  • 84. Non-catastrophic failure A bread-and-butter activity in parsing is lookahead: Inspect the input stream and see what to do next JSON example: An object begins with “{” An array begins with “[” We look at the next input token to figure out what to do. If we fail to match “{”, it’s not an error. We just try “[” instead.
  • 85. Giving ourselves alternatives We have two conflicting goals: We like to keep our implementation options open. Whether fail crashes depends on the underlying monad. We need a safer, abstract way to fail.
  • 86. MonadPlus A typeclass with two methods:
        class Monad m => MonadPlus m where
          -- non-fatal failure
          mzero :: m a
          -- if the first action fails,
          -- perform the second instead
          mplus :: m a -> m a -> m a
      To upgrade our code, we replace our use of fail with mzero.
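As a toy illustration of lookahead with mzero and mplus, the sketch below uses the Maybe monad, which is already a MonadPlus instance in base. The `Kind` type and the `object`, `array`, and `classify` functions are made up for this example and only inspect the first character of the input.

```haskell
import Control.Monad (mplus, mzero)

-- Hypothetical result type: which JSON container the input starts.
data Kind = Object | Array
  deriving (Eq, Show)

-- Each "parser" fails non-fatally with mzero when the first
-- character is not the one it is looking for.
object :: String -> Maybe Kind
object ('{':_) = Just Object
object _       = mzero

array :: String -> Maybe Kind
array ('[':_) = Just Array
array _       = mzero

-- Try object first; if that fails, fall back to array.
classify :: String -> Maybe Kind
classify s = object s `mplus` array s
```

Failing to match "{" is not an error here: `classify "[1,2]"` simply moves on to the second alternative.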
  • 87. Writing a MonadPlus instance We can easily make any stack of MaybeT atop another monad a MonadPlus:
        instance Monad m => MonadPlus (MaybeT m) where
          mzero = MaybeT $ return Nothing
          a `mplus` b = MaybeT $ do
            result <- runMaybeT a
            case result of
              Just k -> return (Just k)
              Nothing -> runMaybeT b
      We simply add MonadPlus to the list of typeclasses we ask GHC to automatically derive for us.
  • 88. Using MonadPlus Given functions that know how to parse bits of JSON:
        parseObject :: MyParser [(String, JSValue)]
        parseArray :: MyParser [JSValue]
      We can turn them into a coherent whole:
        parseJSON :: MyParser JSValue
        parseJSON = (parseObject >>= \o -> return (JSObject o))
            `mplus` (parseArray >>= \a -> return (JSArray a))
            `mplus` ...
  • 89. The problem of boilerplate Here’s a repeated pattern from our parser: foo >>= \x -> return (bar x) These brief uses of variables, >>=, and return are redundant and burdensome. In fact, this pattern of applying a pure function to a monadic result is ubiquitous.
  • 90. Boilerplate removal via lifting We replace this boilerplate with liftM: liftM :: Monad m => (a -> b) -> m a -> m b We refer to this as lifting a pure function into the monad.
        parseJSON = (JSObject `liftM` parseObject)
            `mplus` (JSArray `liftM` parseArray)
      This style of programming looks less imperative, and more applicative.
  • 91. The Parsec library Our motivation so far: Show you that it’s really easy to build a monadic parsing library But we must concede: Maybe you simply want to parse stuff Instead of rolling your own, use Daan Leijen’s Parsec library.
  • 92. What to expect from Parsec It has some great advantages: A complete, concise EDSL for building parsers Easy to learn Produces useful error messages But it’s not perfect: Strict, so cannot parse huge streams incrementally Based on String, hence slow Accepts, and chokes on, left-recursive grammars
  • 93. Parsing a JSON string An example of Parsec’s concision: jsonString = between (char '"') (char '"') (many jsonChar) Some parsing combinators explained: between matches its 1st argument, then its 3rd, then its 2nd many runs a parser until it fails It returns a list of parse results
  • 94. Parsing a character within a string
        jsonChar = (char '\\' >> (pesc <|> puni))
               <|> satisfy (`notElem` "\"\\")
      Between quotes, jsonChar matches a string’s body: A backslash must be followed by an escape (“\n”) or Unicode (“\u2fbe”) Any other character except “\” or “"” is okay More combinator notes: The >> combinator is like >>=, but provides only sequencing, not binding The satisfy combinator uses a pure predicate.
  • 95. Your turn! Write a parser for numbers. Here are some pieces you’ll need:
        import Numeric (readFloat, readSigned)
        import Text.ParserCombinators.Parsec
        import Control.Monad (mzero)
      Other functions you’ll need: getInput setInput The type of your parser should look like this: parseNumber :: CharParser () Rational
  • 96. Experimenting with your parser Simply load your code into ghci, and start playing: Prelude> :load MyParser *Main> parseTest parseNumber "3.14159"
  • 97. My number parser
        parseNumber = do
            s <- getInput
            case readSigned readFloat s of
              [(n, s')] -> setInput s' >> return n
              _ -> mzero
          <?> "number"
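The number parser relies on `readSigned readFloat` from the Numeric module in base, which returns a list of (value, remaining input) pairs. A sketch of that piece on its own, without Parsec, may make the case expression easier to follow; `readNumber` is a hypothetical name for this illustration.

```haskell
import Numeric (readFloat, readSigned)

-- Read a leading, optionally negative, number.
-- The result pairs the parsed value with the unconsumed input;
-- an empty list means the input did not start with a number.
readNumber :: String -> [(Rational, String)]
readNumber = readSigned readFloat
```

In the parser above, the single-element case becomes a successful parse (the leftover string is handed back with setInput), and every other outcome becomes mzero.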
  • 98. Using JSON in Haskell A good JSON package is already available from Hackage: http://tinyurl.com/hs-json The module is named Text.JSON Doesn’t use overlapping instances
  • 99. Part 3 This was going to be a concurrent web application, but I ran out of time. It’s still going to be informative and fun!
  • 100. Concurrent programming The dominant programming model: Shared-state threads Locks for synchronization Condition variables for notification
  • 101. The prehistory of threads Invented independently at least 3 times, circa 1965: Dijkstra Berkeley Timesharing System PL/I’s CALL XXX (A, B) TASK; Alas, the model has barely changed in almost half a century.
  • 102. What does threading involve? Threads are a simple extension to sequential programming. All that we lose are the following: Understandability, Predictability, and Correctness
  • 103. Concurrent Haskell Introduced in 1996, inspired by Id. Provides a forkIO action to create threads. The MVar type is the communication primitive: Atomically modifiable single-slot container Provides get and put operations An empty MVar blocks on get A full MVar blocks on put We can use MVars to build locks, semaphores, etc.
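A minimal sketch of forkIO and an MVar used as a one-shot mailbox. In the Control.Concurrent.MVar module the blocking “get” and “put” operations are named takeMVar and putMVar; the function `sumInThread` is a made-up example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Fork a worker thread that sums a list and hands the result back
-- through an MVar.
sumInThread :: [Int] -> IO Int
sumInThread xs = do
  box <- newEmptyMVar                    -- starts out empty
  _ <- forkIO (putMVar box (sum xs))     -- worker fills the slot
  takeMVar box                           -- blocks until the slot is full
```

The caller needs no explicit lock or condition variable: the empty MVar is what makes takeMVar wait.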
  • 104. What’s wrong with MVars? MVars are no safer than the concurrency primitives of other languages. Deadlocks Data corruption Race conditions Higher order programming and phantom typing can help, but only a little.
  • 105. The fundamental problem Given two correct concurrent program fragments: We cannot compose another correct concurrent fragment from them without great care.
  • 106. Message passing is no panacea It brings its own difficulties: The programming model is demanding. Deadlock avoidance is hard. Debugging is really tough. Don’t forget coherence, scaling, atomicity, ...
  • 107. Lock-free data structures A focus of much research in the 1990s. Modus operandi: find a new lock-free algorithm, earn a PhD. Tremendously difficult to get the code right. Neither a scalable nor a sustainable approach! This inspired research into hardware support, followed by: Software transactional memory
  • 108. Software transactional memory The model is loosely similar to database programming: Start a transaction. Do lots of work. Either all changes succeed atomically... ...Or they all abort, again atomically. An aborted transaction is usually restarted.
  • 109. The perils of STM STM code needs to be careful: Transactional code must not perform non-transactional actions. On abort-and-restart, there’s no way to roll back dropNukes()! In traditional languages, this is unenforceable. Programmers can innocently cause serious, hard-to-find bugs. Some hacks exist to help, e.g. tm_callable annotations.
  • 110. STM in Haskell In Haskell, the type system solves this problem for us. Recall that I/O actions have IO in their type signatures. STM actions have STM in their type signatures, but not IO. The type system statically prevents STM code from performing non-transactional actions!
  • 111. Firing up a transaction As usual, we can explore APIs in ghci. The atomically action launches a transaction: Prelude> :m +Control.Concurrent.STM Prelude Control.Concurrent.STM> :type atomically atomically :: STM a -> IO a
  • 112. Let’s build a game: World of Haskellcraft Our players love to have possessions.
        data Item = Scroll | Wand | Banjo
          deriving (Eq, Ord, Show)
        -- inventory
        data Inv = Inv {
            invItems :: [Item],
            invCapacity :: Int
          } deriving (Eq, Ord, Show)
  • 113. Inventory manipulation Here’s how we set up mutable player inventory:
        import Control.Concurrent.STM
        type Inventory = TVar Inv
        newInventory :: Int -> IO Inventory
        newInventory cap = newTVarIO Inv { invItems = [], invCapacity = cap }
      The use of curly braces is called record syntax.
  • 114. Inventory manipulation Here’s how we can add an item to a player’s inventory:
        addItem :: Item -> Inventory -> STM ()
        addItem item inv = do
          i <- readTVar inv
          writeTVar inv i { invItems = item : invItems i }
      But wait a second: What about an inventory’s capacity? We don’t want our players to have infinitely deep pockets!
  • 115. Checking capacity GHC defines a retry action that will abort and restart a transaction if it cannot succeed:
        isFull :: Inv -> Bool
        isFull (Inv items cap) = length items == cap
        addItem item inv = do
          i <- readTVar inv
          when (isFull i) retry
          writeTVar inv i { invItems = item : invItems i }
  • 116. Let’s try it out Save the code in a file, and fire up ghci: *Main> i <- newInventory 3 *Main> atomically (addItem Wand i) *Main> atomically (readTVar i) Inv {invItems = [Wand], invCapacity = 3} What happens if you repeat the addItem a few more times?
  • 117. How does retry work? In principle, all the runtime has to do is retry the transaction immediately, and spin tightly until it succeeds. This might be correct, but it’s wasteful. What happens instead? The RTS tracks each mutable variable touched during a transaction. On retry, it blocks the transaction until at least one of those variables is modified. We haven’t told GHC what variables to wait on: it does this automatically!
  • 118. Your turn! Write a function that removes an item from a player’s inventory: removeItem :: Item -> Inventory -> STM ()
  • 119. My item removal action
        removeItem item inv = do
          i <- readTVar inv
          case break (==item) (invItems i) of
            (_,[]) -> retry
            (h,(_:t)) -> writeTVar inv i { invItems = h ++ t }
  • 120. Your turn again! Write an action that lets us give an item from one player to another: giveItem :: Item -> Inventory -> Inventory -> STM ()
  • 121. My solution
        giveItem item a b = do
          removeItem item a
          addItem item b
  • 122. What about that blocking? If we’re writing a game, we don’t want to block forever if a player’s inventory is full or empty. We’d like to say “you can’t do that right now”.
  • 123. One approach to immediate failure Let’s call this the C programmer’s approach:
        addItem1 :: Item -> TVar Inv -> STM Bool
        addItem1 item inv = do
          i <- readTVar inv
          if isFull i
            then return False
            else do
              writeTVar inv i { invItems = item : invItems i }
              return True
  • 124. What is the cost of this approach? If we have to check our results everywhere: The need for checking will spread Sadness will ensue
  • 125. The Haskeller’s first loves We have some fondly held principles: Abstraction Composability Higher-order programming How can we apply these here?
  • 126. A more abstract approach It turns out that the STM monad is a MonadPlus instance:
        immediately :: STM a -> STM (Maybe a)
        immediately act = (Just `liftM` act) `mplus` return Nothing
  • 127. What does mplus do in STM? This combinator is defined as orElse: orElse :: STM a -> STM a -> STM a Given two transactions j and k: If transaction j must abort, perform transaction k instead.
  • 128. A complicated specification We now have all the pieces we need to: Atomically give an item from one player to another. Fail immediately if the giver does not have it, or the recipient cannot accept it. Convert the result to a Bool.
  • 129. Compositionality for the win Here’s how we glue the whole lot together:
        import Data.Maybe (isJust)
        giveItemNow :: Item -> Inventory -> Inventory -> IO Bool
        giveItemNow item a b =
          liftM isJust . atomically . immediately $
            removeItem item a >> addItem item b
      Even better, we can do all of this as nearly a one-liner!
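For convenience, here is the inventory code from the preceding slides gathered into one loadable sketch. It assumes the stm package is available (it provides Control.Concurrent.STM), and it swaps `mplus` and `liftM` for the equivalent `orElse` and `fmap`, which is a substitution made here rather than the slides' own wording.

```haskell
import Control.Concurrent.STM
import Control.Monad (when)
import Data.Maybe (isJust)

data Item = Scroll | Wand | Banjo
  deriving (Eq, Ord, Show)

data Inv = Inv { invItems :: [Item], invCapacity :: Int }
  deriving (Eq, Ord, Show)

type Inventory = TVar Inv

newInventory :: Int -> IO Inventory
newInventory cap = newTVarIO Inv { invItems = [], invCapacity = cap }

isFull :: Inv -> Bool
isFull (Inv items cap) = length items == cap

-- Blocks (via retry) while the inventory is full.
addItem :: Item -> Inventory -> STM ()
addItem item inv = do
  i <- readTVar inv
  when (isFull i) retry
  writeTVar inv i { invItems = item : invItems i }

-- Blocks (via retry) while the item is not present.
removeItem :: Item -> Inventory -> STM ()
removeItem item inv = do
  i <- readTVar inv
  case break (== item) (invItems i) of
    (_, [])    -> retry
    (h, _ : t) -> writeTVar inv i { invItems = h ++ t }

-- Turn "would block" into Nothing instead of waiting.
immediately :: STM a -> STM (Maybe a)
immediately act = fmap Just act `orElse` return Nothing

-- Atomically move an item, or report False without changing anything.
giveItemNow :: Item -> Inventory -> Inventory -> IO Bool
giveItemNow item a b =
  fmap isJust . atomically . immediately $
    removeItem item a >> addItem item b
```

Because removal and addition run inside one transaction, a failed addItem also undoes the removal, so an item is never lost in transit.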
  • 130. Thank you! I hope you found this tutorial useful! Slide source available: http://tinyurl.com/defun08