
Haskell

Some notes about the Haskell programming language and Functional Programming for intermediate programmers


  1. 1. Haskell programming Roberto Casadei July 20, 2016 R. Casadei Haskell July 20, 2016 1 / 49
  2. About these notes
     I am a learner, not an expert. These notes are essentially a work of synthesis and integration from many sources, such as:
     • "Real World Haskell" [O'Sullivan et al., 2008]
     • "Parallel and Concurrent Programming in Haskell" [Marlow, 2012]
     • University notes
     • Web sources: Wikipedia, blogs, etc. (references in slides)
  3. 3. Outline 1 Basic Haskell programming Basics Intermediate stuff 2 Real world Haskell 3 Articles Scrap Your Boilerplate: A Practical Design Pattern for Generic Programming R. Casadei Haskell July 20, 2016 3 / 49
  5. Haskell: Overview
     History
     • 1990: Haskell 1.0
     • 1999: Haskell 98 standard published
     • 2010: Haskell 2010 standard published; it adds the Foreign Function Interface, among other things
     Main traits
     • Purely functional programming language
     • Strong, static type system based on Hindley-Milner type inference
     Source: http://en.wikipedia.org/wiki/Haskell_(programming_language)
  6. Glasgow Haskell Compiler (GHC)
     Three main parts: ghc (compiler), ghci (interactive interpreter), runghc (for running scripts)
     Basic ghci commands
         $ ghci
         Prelude> :?                      -- Get help
         Prelude> :set prompt "ghci> "    -- Set the prompt label
         ghci> :module + Data.Ratio       -- Load a module
         ghci> :info (+)                  -- Info about functions and type classes/constructors
         ghci> :type expr                 -- Show the type of an expression
         ghci> :set +s                    -- Print the time taken to evaluate each expression
         ghci> :set +t                    -- Print the type after each evaluated expression
         ghci> 7
         ghci> it                         -- 7; "it" refers to the last evaluated expression
     Package manager
         $ ghc-pkg list                   # Show the list of packages
  7. Haskell programs
     Source files have extension .hs and consist of a set of function (and variable) definitions
         -- ex0.hs
         main = putStrLn name   -- program start point
         name = "jordan"
         myadd x y = x + y
     Compilation, linking, and execution
         $ ghc ex0.hs           # A) compilation + linking
         $ ghc --make ex0       # B) compilation + linking
         $ ./ex0
     You can also load it in ghci
         ghci> :cd /path/to/my/code
         ghci> :load ex0
  8. 8. Basic Haskell programming Basics Haskell modules: basics Reference: https://www.haskell.org/tutorial/modules.html A Haskell program consists of a collection of modules. Modules serve 2 purposes: controlling name-spaces and creating abstract data types The top level of a module contains various declarations: fixity declarations, data and type declarations, class and instance declarations, type signatures, function definitions, and pattern bindings. Import declarations must appear first; the other decls may appear in any order (top-level scope is mutually recursive) The name-space of modules is completely flat, and modules are in no way "first-class." Module names are alphanumeric and must begin with an uppercase letter A module source file (Tree.hs) must have the same filename as the module’s name (Tree) Technically speaking, a module is really just one big declaration which begins with the keyword module 1 module Tree ( Tree(Leaf,Branch), fringe ) -- Explicit exports 2 where -- The module body follows 3 4 data Tree a = Leaf a | Branch (Tree a) (Tree a) 5 6 fringe :: Tree a -> [a] 7 fringe (Leaf x) = [x] 8 fringe (Branch left right) = fringe left ++ fringe right If the export list is omitted, all of the names bound at the top level would be exported Compilation ghc -c Tree.hs where -c tells to only generate the object code (no executable), as we’ve provided no main Results of compilation: Tree.o (object file), and Tree.hi (interface file, which stores info about exports etc.) Module import 1 module Main (main) where 2 import Tree ( Tree(Leaf,Branch), fringe ) 3 import qualified Foo ( fringe ) 4 5 main = do print (fringe (Branch (Leaf 1) (Leaf 2))) 6 print (Foo.fringe (Branch (Leaf 1) (Leaf 2))) R. Casadei Haskell July 20, 2016 8 / 49
  9. 9. Basic Haskell programming Basics Hackage and Haskell packages Hackage is a repository of Haskell libraries Manual installation of packages Download and unpack PACKAGE-VERS.tar.gz cd PACKAGE_VERS 1 $ runhaskell Setup configure 2 $ runhaskell Setup build 3 $ sudo runhaskell Setup install Install with Cabal 1 $ cabal install -- Package in the current directory 2 $ cabal install foo -- Package from the Hackage server 3 $ cabal install foo-1.0 -- Specific version of a package 4 $ cabal install ’foo < 2’ -- Constrained package version 5 $ cabal install foo bar baz -- Several packages at once 6 $ cabal install foo --dry-run -- Show what would be installed R. Casadei Haskell July 20, 2016 9 / 49
  10. 10. Basic Haskell programming Basics Cabal Cabal is a system for building and packaging Haskell libraries and programs Installation 1 $ tar xzf Cabal-1.22.4.tar.gz 2 $ cd Cabal-1.22.4 3 $ ghc -threaded --make Setup 4 $ ./Setup configure 5 $ ./Setup build 6 $ sudo ./Setup install R. Casadei Haskell July 20, 2016 10 / 49
  11. 11. Basic Haskell programming Basics Cabal: create packages Creating a package (mypkg) 1 Create a file mypkg.cabal 1 Name: mypkg -- Package name must be unique 2 Version: 0.1 3 Synopsis: My nice package 4 Description: 5 This package is intended for.. 6 ..and.. 7 Author: Real World Haskell 8 Maintainer: aaa@bbb.cc 9 10 library 11 Exposed-Modules: MyModule1 12 MyModule2 13 Build-Depends: base >= 2.0 -- "base" contain core Haskell modules 2 Create a setup file Setup.hs 1 #!/usr/bin/env runhaskell 2 import Distribution.Simple 3 main = defaultMain 3 Configure, build, and install 1 $ runghc Setup configure -- CONFIGURE 2 $ runghc Setup build -- BUILD 3 $ sudo runghc Setup install -- INSTALL 4 $ ghc-pkg list | grep mypkg -- Check for presence R. Casadei Haskell July 20, 2016 11 / 49
  12. Literate programming in Haskell
      Basics
      • The idea: "The main idea is to regard a program as a communication to human beings rather than as a set of instructions to a computer" (D. Knuth)
      • In practice, we need to distinguish between code portions and non-code portions
      • A literate program has extension .lhs
      Two styles
      1. Bird style: source code lines are prefixed with >
             Here's my code:

             > add x y = x + y
             > succ = add 1

             I am a literate programmer.
      2. LaTeX style: source code is surrounded by \begin{code} and \end{code}
      Source: https://wiki.haskell.org/Literate_programming
  13. Basics: ghci, arithmetic, Boolean logic, strings, chars, lists
      • Boolean values: True, False
      • :info (+) in ghci tells you the signature of the + operator, its precedence level, and its associativity (in this case infixl, so left-associative)
            -- Arithmetic
            2^3^2        -- 512 = 2^9, since (^) is right-associative
            1^pi         -- Error: the exponent of (^) must be an integral value
            1**pi        -- 1.0
            (/) 7 3      -- 2.3333333333333335 (prefix form)

            -- Boolean logic and equivalence
            7 == 7.0                          -- True
            False || (7 /= 7.0) || not True   -- False

            -- ghci
            ghci> it                     -- False; in ghci 'it' refers to the last evaluated expr
            ghci> let e = exp 1          -- Introduces a variable
            ghci> :type 7                -- 7 :: Num a => a
            ghci> :type 7.0              -- 7.0 :: Fractional a => a
            ghci> :type pi               -- pi :: Floating a => a
            ghci> :module + Data.Ratio
            ghci> let z = 3%2 + 1        -- z = 5 % 2 :: Ratio Integer

            -- Strings/chars/lists
            "hello" == ['h','e','l','l','o']   -- True (a string is a list of chars)
            'a' : "bc"                         -- "abc" (list cons)
            "" == []                           -- True (the empty string is the empty list)
            "foo" ++ "bar"                     -- "foobar" (list concatenation)

            -- List comprehension
            [(x,y) | x <- [1..3], (mod x 2) /= 0, y <- [0..1]]   -- [(1,0),(1,1),(3,0),(3,1)]
  14. Basics: functions I
      • Currying converts a function that accepts n args into a function that accepts a single arg (and returns a function for the remaining ones)
      • When we pass to a function fewer args than it accepts, we call it a partial application of the function
      • Strict function: one that, when applied to a non-terminating expr, also fails to terminate
      • Lazy evaluation: an implementation of normal-order reduction where expressions are evaluated by need. With lazy evaluation, functions become non-strict
      • In a language that uses strict evaluation, the args to a function are evaluated before the function is applied. Haskell instead has non-strict evaluation: expressions passed to a function become thunks that are evaluated by need (lazily)
            let lst = [1,2, (error "ops"), 4,5]
            head lst      -- 1
            drop 3 lst    -- [4,5] (the error element is skipped without being evaluated)
            lst           -- [1,2,*** Exception: ops
      • Closure: function + referencing environment
      • Higher-order functions accept functions as arguments and/or return functions as values
      • Lambdas are anonymous functions
      • Partial functions are defined only for a subset of valid inputs, whereas total functions return valid results over their entire input domain
  15. Basics: functions II
            -- In ghci, to declare a function you have to use 'let'
            ghci> let fac n | n==0 = 1 | otherwise = n * fac (n-1)   -- GUARDS
            ghci> let { fac2 0 = 1; fac2 n = n * fac2 (n-1) }        -- MULTIPLE CASES

            -- Function declaration in a Haskell module
            fac 0 = 1
            fac n | n>0 = n * fac (n-1)
            fac (-5)     -- Exception: non-exhaustive patterns in function fac

            -- Currying
            add (x, y) = x + y    -- (UNCURRIED) 'add' takes ONE arg, which is a 2-elem tuple
            prod x y = x * y      -- (CURRIED)
            double = prod 2       -- partial application

            -- Higher-order functions (foldr as defined in the Prelude)
            foldr f init [] = init
            foldr f init (x:xs) = f x (foldr f init xs)
            foldr (+) 0 [1..10]   -- 55

            ones = 1 : ones       -- Non-terminating expr, but functions are NON-STRICT thanks to lazy eval
            take 3 ones           -- [1,1,1]
            numsfrom x = x : numsfrom (x+1)

            -- Anonymous functions (lambda abstractions)
            (\x y -> x + y) 5 3   -- 8
            ghci> let mean3 = \x y z -> (x+y+z)/3

            -- Closures
            ghci> 7               -- 7
            ghci> let f x = it + x
            ghci> f 10            -- 17
            ghci> f 10            -- 17 again (f captured the value 'it' had at definition time; values are immutable)
  16. Basics: functions III
            -- INFIX notation for functions accepting two args
            (>= 7) `filter` [1..10]    -- [7,8,9,10]
            -- When a function accepts more than 2 args, partially apply it to use it as an infix op
            (+) `foldl` 0 $ [1..10]    -- 55

            -- More about the use of $ to change precedence
            putStrLn show (1+1)        -- Error: 'putStrLn' is applied to two arguments
            putStrLn (show (1 + 1))    -- OK, but a lot of parentheses
            putStrLn $ show (1 + 1)
            putStrLn ( show $ 1 + 1 )
            putStrLn $ show $ 1 + 1
      • $ is infix application: f $ x = f x, with type ($) :: (a -> b) -> a -> b. It is right-associative and has the lowest precedence, which is why it can replace parentheses
      • A section is a partial application of an infix function
            -- The following identities hold
            (op e) = \x -> x op e
            (e op) = \x -> e op x

            -- Example
            filter (3 <) [1..5]                   -- [4,5]
            filter (< 3) [1..5]                   -- [1,2]
            filter (`elem` ['a'..'z']) "BooBs"    -- "oos"
      • As-patterns: x@p means "bind x to the whole value matched by the pattern p on the right side of @"
  17. 17. Basic Haskell programming Basics Basics: functions IV 1 suffixes xs@(_:xs’) = xs : suffixes xs’; 2 suffixes _ = [] 3 4 -- Compare with the version without as-pattern 5 suffixesLong (x:xs) = (x:xs) : suffixesLong xs 6 suffixesLong _ = [] 7 -- Note that (x:xs) is copied in the function body => a new node allocation Function composition 1 capCount = length . filter (Data.Char.isUpper . head) . words 2 3 capCount "Real World Haskell" -- 3 R. Casadei Haskell July 20, 2016 17 / 49
  18. 18. Basic Haskell programming Basics Lazy evaluation in Haskell I Notes taken from https://hackhands.com/lazy-evaluation-works-haskell/ Expressions 1 square x = x * x 2 square (1+2) -- We can eval it by replacing square’s LHS with its RHS while substituting x 3 -- (1+2) * (1+2) --> 3 * (1+2) --> 3 * 3 --> 9 (Evaluation trace) Note that (1 + 2) has been evaluated twice, even though we know the two (1 + 2) are actually the same expression. To avoid this unnecessary duplication, we use a method called graph reduction Every function corresponds to a reduction rule Above, circle x is a subgraph. Notice how both args of (∗) point to same subgraph. Sharing a subgraph in this way is the key to avoiding duplication Any subgraph that matches a rule is called a reducible expression (or redex for short) R. Casadei Haskell July 20, 2016 18 / 49
  19. 19. Basic Haskell programming Basics Lazy evaluation in Haskell II Whenever we have a redex, we can reduce it. Whenever an expression (graph) does not contain any redexes, there is nothing we can reduce anymore and we say it is in normal form. Actually there are other 2 requirements: the graph must be also finite and acyclic Note: also expressions created out of data constructors are in normal form, e.g., 1:2:3:[] 1 ones = 1 : ones -- Has NO redexes, but it’s NOT in normal form because of the cycle In Haskell, we usually don’t evaluate all the way down to normal form. Rather, we often stop once the graph has reached weak head normal form (WHNF). We say that a graph is in WHNF if its topmost node is a constructor Any graph that is not in WHNF is called an unevaluated expression (or thunk) The graph of ones is in WHNF See next Figure: left node is a redex (and a thunk), but the graph is in WHNF R. Casadei Haskell July 20, 2016 19 / 49
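The difference between normal form and WHNF can be observed directly. The following is a minimal sketch, assuming only the Prelude; the names whnfIsEnough and spineOnly are made up for illustration. It relies on the standard seq function, which evaluates its first argument to WHNF only.

```haskell
-- Cyclic graph: already in WHNF, because its topmost node is the (:) constructor.
ones :: [Int]
ones = 1 : ones

-- seq forces its first argument only to WHNF. The topmost node here is (:),
-- so neither 'undefined' is ever touched and the result is ().
whnfIsEnough :: ()
whnfIsEnough = (undefined : undefined) `seq` ()

-- length walks the spine of the list but never evaluates the elements,
-- so the elements may stay unevaluated thunks.
spineOnly :: Int
spineOnly = length [undefined, undefined]

-- Only the first three cells of the infinite list are demanded.
firstThree :: [Int]
firstThree = take 3 ones
```

Evaluating whnfIsEnough or spineOnly in ghci does not raise an exception, whereas printing the list [undefined, undefined] itself would, since show needs the elements.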
  20. Lazy evaluation in Haskell III
      • When an expression contains multiple redexes, we have multiple options about the reduction order
      • Eager evaluation: evaluates function args to normal form before reducing the function application itself
      • Lazy evaluation: tries to reduce the topmost function application first. For that, some function arguments may need to be evaluated, but only as far as necessary
      • In general, a normal form obtained by lazy evaluation never differs from the result obtained by eager evaluation of the same expression (when both terminate), so in that sense it doesn't matter in which order we reduce expressions. However, lazy evaluation uses fewer reduction steps, and it can deal with some cyclic (infinite) graphs that eager evaluation cannot
      Textual representation of reduction traces (using Haskell syntax)
      • We have to indicate shared expressions by giving them a name using the let keyword
            square (1+2) ==> let x = (1+2) in x*x ==> let x = 3 in x*x ==> 9
            ('H' == 'i') && ('a' == 'm') ==> False && ('a' == 'm') ==> False
            -- No shared expression in the latter evaluation
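Sharing and "only as far as necessary" can be made visible with Debug.Trace from base. This is a sketch under assumptions: the names sharedSquare and shortCircuit are illustrative, and the trace message goes to stderr as a debugging side effect, not as part of the program's result.

```haskell
import Debug.Trace (trace)

square :: Int -> Int
square x = x * x

-- The argument is passed as one thunk that both occurrences of x point to,
-- so the trace message is expected to appear once, not twice.
sharedSquare :: Int
sharedSquare = square (trace "evaluating 1+2" (1 + 2))

-- (&&) inspects its first argument first; since that is False, the second
-- argument (here 'undefined') is never demanded.
shortCircuit :: Bool
shortCircuit = ('H' == 'i') && undefined
```

In ghci, evaluating sharedSquare yields 9 and shortCircuit yields False without raising an exception.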
  21. 21. Basic Haskell programming Basics Lazy evaluation in Haskell IV Time complexity With eager evaluation, for every function application, we add the time needed to evaluate the arguments to the time needed to evaluate the function body Theorem: Lazy evaluation never performs more evaluation steps than eager evaluation. That said, the implementation of lazy evaluation does incur a certain administrative overhead Space complexity An expression uses as much memory as its graph contains nodes So, when a scenario such as ((((0 + 1) + 2) + 3) + 4) grows out of hand, we get a space leak The solution is to take control of the evaluation process and make sure that the expression is evaluated sooner (see about seq next) In the case of foldl (+) 0 [1..1000000] we have to make sure that the accumulating param is always in WHNF R. Casadei Haskell July 20, 2016 21 / 49
  22. Space leaks and strict evaluation I
      foldl vs. foldr vs. foldl' (source: http://wiki.haskell.org/Foldr_Foldl_Foldl')
            foldr (+) 0 [1..1000000]
            ==> 1 + (2 + (3 + (... + (1000000 + 0)...)))
      • The problem is that (+) is strict in both of its arguments. This means that both arguments must be fully evaluated before (+) can return a result. So to evaluate 1 + (...), 1 is pushed on the stack. Then the same happens for 2 + (...), for 3 + (...), etc., where 2, 3, etc. are pushed on the stack. Thus we may get a stack overflow
      • Issue: the chain of (+)'s can't be made smaller (reduced) until the very last moment, when it's already too late. We can't reduce it because the chain doesn't contain an expr which can be reduced (a redex, for reducible expression). If it did, we could reduce that expr before going to the next element
      • We can introduce a redex by forming the chain in another way, using foldl: in (((..(((0 + 1) + 2) + 3)...) we can calculate 0 + 1 before calculating the next sum
  23. 23. Basic Haskell programming Basics Space leaks and strict evaluation II 1 foldl (+) 0 [1..1000000] 2 3 ==> let z1 = 0 + 1 4 z2 = z1 + 2 5 z3 = z2 + 3 6 z4 = z3 + 4 7 ........... 8 z999999 = z999998 + 999999 9 in foldl (+) z999999 [1000000] --> 10 11 ==> let { z1 = 0 + 1; ...; z1000000 = z999999 + 1000000 } in z1000000 12 13 -- Now, to evaluate z1000000, a large chain of +’s will be created 14 15 ==> let z1 = 0 + 1 16 in ((((((((z1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 17 18 ==> (((((((((0 + 1) + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 19 20 -- Now we can actually start reducing 21 ==> ((((((((1 + 2) + 3) + 4) + 5) + ...) + 999997) + 999998) + 999999) + 1000000 22 ........................................................................ 23 ==> (499998500001 + 999999) + 1000000 --> 24 ==> 499999500000 + 1000000 --> 25 ==> 500000500000 You see that the redexes are created (z1, z2, ...), but instead of being directly reduced, they are allocated on the heap Note that your heap is only limited by the amount of memory in your system (RAM and swap) The problem starts when we finally evaluate z1000000: we must evaluate z1000000 = z999999 + 1000000 (a huge thunk), so 1000000 is pushed on the stack, and so on.. until a stack overflow exception But this is exactly the problem we had in the foldr case — only now the chain of (+)’s is going to the left instead of the right. So why doesn’t the chain reduce sooner than before? R. Casadei Haskell July 20, 2016 23 / 49
  24. Space leaks and strict evaluation III
      • It's because of GHC's lazy reduction strategy: expressions are reduced only when they are actually needed (i.e., topmost function applications are reduced first). As the outer-left-most redexes are reduced first, the inner z1, z2, ... redexes only get reduced when the foldl is completely gone
      • We somehow have to tell the system that the inner redex should be reduced before the outer one. Fortunately this is possible with the seq function
            seq :: a -> b -> b
      • seq is a function that, when applied to x and y, first reduces x (i.e., forces its evaluation to WHNF) and then returns y
      • The idea is that y references x, so that when y is reduced, x will not be a big unreduced chain anymore
            foldl' f z [] = z
            foldl' f z (x:xs) = let z' = z `f` x
                                in seq z' $ foldl' f z' xs

            foldl' (+) 0 [1..1000000]
            ==> let a' = 0 + 1 in seq a' (foldl' (+) a' [2..1000000])
            .............................
            ==> foldl' (+) 10 [5..1000000]
            .............................
            ==> foldl' (+) 500000500000 []
            ==> 500000500000
      foldl vs. foldl'
      • foldl' is more efficient than foldl because it doesn't build a huge thunk
      • If the combining function is lazy in the first argument, foldl may happily return a result where foldl' hits an exception
  25. Space leaks and strict evaluation IV
            let and x y = y && x
            foldl (and) True [undefined,False]              -- False
            Data.List.foldl' (and) True [undefined,False]   -- *** Exception: Prelude.undefined
      • The seq function only evaluates up to the topmost constructor. If the accumulator is a more complex object, foldl' will still build up unevaluated thunks. You can introduce a function or a strict data type which forces the values as far as you need. Failing that, the "brute force" solution is to use deepseq
      Rules of thumb for folds (also see http://stackoverflow.com/questions/20356742/foldr-foldl-haskell-explanation)
      • foldr is commonly the right fold to use, in particular when transforming foldables into lists with related elements in the same order, while foldl' conceptually reverses the order of the list
      • foldl' often has much better time and space performance than a foldr would. So you should pick foldl' when
        1. the input list is large (but definitely finite), you do not care about the implicit reversal, and you seek to improve the performance of your code
        2. you actually do want to reverse the order of the list, possibly along with other transformations
      • foldl is rarely the right choice. It gives you the implicit reversal of a left fold, but without the performance gains of foldl'. Only in rare cases may it yield better results than foldl'
      • Another reason that foldr is often the better choice is that the folding function can short-circuit, i.e., terminate early by yielding a result which does not depend on the value of the accumulating parameter. The left fold, instead, cannot short-circuit and is condemned to evaluate the entire input list
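The remark about complex accumulators can be sketched with a strict data type. This is an illustration under assumptions: the SumCount type and the mean function are made-up names, and the strictness flags (!) on the constructor fields are standard Haskell 2010. Because foldl' forces the accumulator to WHNF at each step and the fields are strict, both the running sum and the count are evaluated as the fold proceeds, instead of piling up as thunks inside a lazy pair.

```haskell
import Data.List (foldl')

-- Accumulator with strict fields: building a SumCount value forces both
-- components, so no chain of (+) thunks can hide inside it.
data SumCount = SumCount !Double !Int

-- Arithmetic mean in a single pass; Nothing for the empty list.
mean :: [Double] -> Maybe Double
mean xs =
  case foldl' step (SumCount 0 0) xs of
    SumCount _ 0 -> Nothing
    SumCount s n -> Just (s / fromIntegral n)
  where
    step (SumCount s n) x = SumCount (s + x) (n + 1)
```

For example, mean [1,2,3,4] gives Just 2.5. With a plain tuple (Double, Int) as accumulator, foldl' would only force the tuple constructor and leave the two components unevaluated.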
  26. 26. Basic Haskell programming Basics Space leaks and strict evaluation V 1 ghci> :set +s -- Set to print time for expressions 2 3 trues = True : trues 4 t = False : take 10000000 trues 5 foldl (&&) True t -- False (3.39 secs, 507955768 bytes) 6 -- (((((((T && F) && T) && T) && ...) 7 foldr (&&) True t -- False (0.00 secs, 521384 bytes) 8 -- (T && (F && XXXXXXXXXXXX)) 9 Data.List.foldl’ (&&) True t -- False (0.12 secs, 1133716 bytes) 10 11 t = take 10000000 trues ++ [False] 12 foldl (&&) True t -- False (3.36 secs, 926383156 bytes) 13 foldr (&&) True t -- False (0.58 secs, 160439692 bytes) 14 Data.List.foldl’ (&&) True t -- False (0.14 secs, 0 bytes) 3 http://wiki.haskell.org/Foldr_Foldl_Foldl’ 4 Also see http://stackoverflow.com/questions/20356742/foldr-foldl-haskell-explanation R. Casadei Haskell July 20, 2016 26 / 49
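The timings above reflect the short-circuiting behaviour of foldr with a lazy combining function. A minimal sketch of the same effect, assuming only the Prelude (anyR is a hypothetical name, equivalent in spirit to the Prelude's any):

```haskell
-- 'any' written with foldr. (||) does not look at its second argument when
-- the first is True, so the fold stops at the first matching element.
anyR :: (a -> Bool) -> [a] -> Bool
anyR p = foldr (\x acc -> p x || acc) False

-- Because of the early exit, this terminates even on an infinite list.
foundInInfinite :: Bool
foundInInfinite = anyR (> 10) [1 ..]
```

A left fold over [1 ..] would never return, since it has to reach the end of the list before producing anything.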
  27. 27. Basic Haskell programming Basics Algebraic data types I data Bool = True | False Bool is the type constructor whereas True,False are the data constructors data Point a = Point a a a is a type variable, and Point is said to be a parameterized type It is legal and normal for a value constructor to have the same name of the type constructor Point value constructor has two components of type a data List a = Nil | Cons a (List a) data Tree a = Empty | Leaf a | Branch (Tree a) (Tree a) List and Tree are recursive, data types Usage in functions and pattern matching 1 countNodes Empty = 0 2 countNodes (Leaf _) = 1 3 countNodes (Branch left right) = 1 + countNodes(left) + countNodes(right) 4 5 countNodes (Branch (Branch Empty (Leaf 1)) Empty) -- 3 6 7 let t = Leaf pi 8 9 case t of -- Pattern matching allow us to: 10 (Branch _ _) -> "branch"; -- 1) know what data constructor was used 11 (Leaf x) -> "leaf: " ++ show x; -- 2) extract components from data constructors 12 Empty -> "empty" 13 -- "leaf: 3.141592653589793" R. Casadei Haskell July 20, 2016 27 / 49
  28. Algebraic data types II
      Algebraic data types vs. tuples
      • ADTs allow us to distinguish between otherwise identical pieces of information
      • Two tuples with elements of the same type are structurally identical, so they have the same type
      Record syntax
      • Writing accessor functions for each of a data type's components can be repetitive and tedious; record syntax generates them for us
            data Customer = Customer { name :: String, age :: Int } deriving (Show)
            let c = Customer "Gigio" 22
            let c2 = Customer { age = 22, name = "Gigio" }   -- More verbose but clearer
            name c   -- "Gigio"
            age c    -- 22
      Common ADTs
      • The Prelude defines a type Maybe which can be used to represent a value that could be either present or missing
            data Maybe a = Just a | Nothing
  29. Typeclasses I
      • Typeclasses allow us to define generic interfaces for a wide variety of types
      • Typeclasses define a set of functions that can have different implementations depending on the type of data they are given
      • Consider the built-in Eq typeclass definition
            class Eq a where
              (==), (/=) :: a -> a -> Bool

              x /= y = not (x == y)
              x == y = not (x /= y)

            -- Minimal complete definition requires (==) or (/=)
      Declaration of typeclass instances
      • Types are made instances of a typeclass by implementing the functions necessary for that typeclass
            data MyData a = ...

            instance Eq a => Eq (MyData a) where
              (==) x y = ...
      Important built-in typeclasses
      • Show is used to convert values of type a (member of Show) to Strings
            show :: (Show a) => a -> String
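A complete instance declaration may help here. The sketch below is illustrative only: Pair and Describable are hypothetical names, not part of any library. It shows an instance for a parameterized type (which needs the Eq a context on the element type) and a class with a default method, in the same way that Eq defines (/=) in terms of (==).

```haskell
-- A parameterized type holding two values of the same type.
data Pair a = Pair a a

-- Pairs can be compared whenever their components can.
-- Only (==) is given; (/=) comes from the class default.
instance Eq a => Eq (Pair a) where
  Pair a b == Pair c d = a == c && b == d

-- A user-defined typeclass with a default implementation.
class Describable a where
  describe :: a -> String
  describe _ = "no description"

instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

-- Empty instance body: the default 'describe' is used.
instance Describable (Pair a)
```

With these definitions, Pair 1 2 /= Pair 1 3 is True, describe True is "yes", and describe (Pair 'a' 'b') falls back to "no description".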
  30. Typeclasses II
      • Read is the opposite of Show: it takes a String, parses it, and returns data of any type a that is a member of Read
            read :: (Read a) => String -> a
      • Eq supports ==, /=
      • Ord supports <, <=, >, >=. Instances of Ord can be sorted by Data.List.sort
      Notes
      • read and show can be used for implementing de/serialization of data structures
      • For many data types, the Haskell compiler can automatically derive instances of Read, Show, Bounded, Enum, Eq, Ord
            data Color = Red | Green | Blue
                         deriving (Read, Show, Eq, Ord)

            show Red                             -- "Red"
            (read "Red") :: Color                -- Red
            (read "[Red,Red,Blue]") :: [Color]   -- [Red,Red,Blue]
            Data.List.sort [Blue, Green, Red]    -- [Red,Green,Blue]
            Red == Red                           -- True
            Red < Blue                           -- True
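Since read raises an exception on malformed input, a serialization round trip is usually paired with a safe parser. The following is a sketch, assuming only the Prelude; safeRead is a hypothetical helper built on the standard reads function (base also offers Text.Read.readMaybe for the same purpose).

```haskell
data Color = Red | Green | Blue
  deriving (Read, Show, Eq, Ord, Enum, Bounded)

-- Parse a value, returning Nothing instead of throwing on bad input.
-- 'reads' returns a list of (value, remaining input) candidates.
safeRead :: Read a => String -> Maybe a
safeRead s =
  case reads s of
    [(v, rest)] | all (== ' ') rest -> Just v
    _                               -> Nothing

-- Derived Enum and Bounded instances let us enumerate all values.
allColors :: [Color]
allColors = [minBound .. maxBound]

-- Serializing with show and parsing back yields the original value.
roundTrip :: Bool
roundTrip = read (show allColors) == allColors
```

Here safeRead "Green" :: Maybe Color is Just Green, while safeRead "Purple" :: Maybe Color is Nothing.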
  31. 31. Basic Haskell programming Basics Haskell’s Type System Haskell has a strong, static type system and support for type inference Strongness refers to how much a type system is permissive wrt expressions Static because the compiler knows the type of every expression at compile-time Haskell’s combination of strong and static typing makes it impossible for type errors to occur at runtime In summary, strong and static type system makes Haskell safe while type inference make it concise Composite data types are constructed from other type. Examples: lists, tuples Haskell also supports polymorphic types (or parameterized types) via parametric polymorphism where you can define types based on one or more type parameters You can introduce type aliases via: type TCustomerName = String Common types Char: represents a Unicode char Bool: with values True and False Int: for signed, fixed-width integer values (range depends on the system’s longest “native” integer) Integer: signed int of unbounded size Double (typically 64-bits wide), Float (narrower): for floating-point numbers Function types, e.g.: take :: Int -> [a] -> [a] Note that -> is right-associative: take :: Int -> ([a] -> [a]) (cf. with currying) R. Casadei Haskell July 20, 2016 31 / 49
  32. 32. Basic Haskell programming Basics Lists I All standard list functions are defined in the Data.List module The Prelude re-export many of these functions (but not all) 1 1 : [] -- [1] (CONS) 2 "foo" : [1,2] -- Error: lists are homogeneous (cannot mix chars and ints) 3 4 null [] && null "" -- True ("" == []) 5 length [] -- 0 6 head [1..5] -- 1 (Exception on empty list) 7 tail [1..5] -- [2,3,4,5] (Exception on empty list) 8 last [1..5] -- 5 9 init [1..5] -- [1,2,3,4] 10 11 drop 4 [1..5] -- [5] 12 drop 100 [1..5] -- [] 13 take 3 [1..5] -- [1,2,3] 14 take 0 [1..5] -- [] 15 takeWhile (<3) [1..5] -- [1,2] 16 dropWhile (<3) [1..5] -- [3,4,5] 17 break (== 3) [1..5] -- ([1,2],[3,4,5]) "it tuples up the results of takeWhile" 18 span (/= 3) [1..5] -- ([1,2],[3,4,5]) "it tuples up the results of dropWhile" 19 20 [1..2] ++ [3..4] -- [1,2,3,4] 21 concat [ [1..2], [5], [7..8] ] -- [1,2,5,7,8] 22 reverse [1..5] -- [5,4,3,2,1] 23 24 -- For lists of Bools 25 and [True, False, True] -- False 26 or [False, False, True] -- True 27 28 all odd [1,3,5] -- True 29 any even [1..3] -- True 30 R. Casadei Haskell July 20, 2016 32 / 49
  33. Lists II
            splitAt 3 [1..5]     -- ([1,2,3],[4,5])

            3 `elem` [1..9]      -- True (check for presence)
            77 `notElem` [1..9]  -- True (check for absence)

            Data.List.isPrefixOf "foo" "foobar"     -- True
            Data.List.isInfixOf [5..8] [1..10]      -- True
            Data.List.isSuffixOf [7..10] [1..10]    -- True
            Data.List.tails "foobar"                -- ["foobar","oobar","obar","bar","ar","r",""]

            zip [1..5] ['a'..'z']             -- [(1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e')]
            zipWith (+) [3,5,0] [4,1,2,8,9]   -- [7,6,2]

            lines "foo\nbar"                  -- ["foo","bar"]
            unlines ["foo","bar"]             -- "foo\nbar\n"

            map Data.Char.toUpper "foo"       -- "FOO"
            filter odd [1..5]                 -- [1,3,5]
            foldl (+) 0 [1..10]               -- 55
            foldr (:) [] [1..10]              -- [1,2,3,4,5,6,7,8,9,10]
  34. 34. Basic Haskell programming Basics Tuples 1 fst ("a", 7) -- "a" 2 snd ("a", 7) -- 7 3 fst ("a", 7, True) -- Couldn’t match expected type ‘(a0, b0)’ with actual type ‘(t0, t1, t2)’ Note: there is a deep mathematical sense in which any nonpathological function of type (a,b) -> a must do exactly what fst does 5 5 See http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875 R. Casadei Haskell July 20, 2016 34 / 49
  35. Errors
      • error :: [Char] -> a
        It immediately aborts evaluation and prints the error message we give it
        It has result type a so that we can call it anywhere and it will always have the right type
      • A more controlled approach involves the use of Maybe or Either
            data Maybe a = Nothing | Just a
                           deriving (Eq, Ord, Read, Show)

            data Either a b = Left a | Right b    -- Convention: Left for failure, Right for success
                              deriving (Eq, Ord, Read, Show)
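As an illustration of the more controlled approach, here is a small sketch with two made-up functions, safeHead and safeDiv, which turn the partial functions head and div into total ones by reporting failure in the result type.

```haskell
-- Total version of head: Nothing signals the empty list.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Total version of integer division: Left carries an error message,
-- Right carries the successful result.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)
```

Callers then have to pattern match on the result, e.g. safeDiv 10 2 is Right 5 and safeDiv 1 0 is Left "division by zero", so the failure case cannot be silently ignored.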
  36. Local definitions with let and where
            lend amount balance = let reserve = 100
                                      newBalance = balance - amount
                                  in if balance < reserve
                                     then Nothing
                                     else Just newBalance

            lend2 amount balance = if amount < reserve * 0.5
                                   then Just newBalance
                                   else Nothing
              where reserve = 100
                    newBalance = balance - amount

            -- NESTED lets and wheres
            bar = let b = 2
                      c = True
                  in let a = b
                     in (a,c)

            foo = x
              where x = y
                      where y = 2
      • Haskell uses indentation as a cue to parse sections of code
      • If the indentation is the same as the start of the preceding item, it is treated as beginning a new item in the same block
      • You can also explicitly define blocks without relying on indentation
            foo = let { a = 1; b = 2;
                    c = 3 }
                  in a + b + c
  38. 38. Basic Haskell programming Intermediate stuff Skipped parts “Real World Haskell” Ch. 6 – instances with type synonyms, newtype declaration, restrictions on typeclasses R. Casadei Haskell July 20, 2016 38 / 49
  39. I/O: Basics I
      I/O actions
      • Actions are first-class values in Haskell and have type IO a: when performed, they may do some I/O before delivering a value of type a
      • They produce an effect when performed, but not when evaluated
      • A possible conceptual representation is type IO a = World -> (a, World)
      • Any expr may produce an action as its value, but the action won't perform I/O until it is executed inside another action
      • Function main itself is an action with type IO (). As you can only perform I/O actions from within other I/O actions, all I/O in a Haskell program is driven from the top at main (thus providing isolation from side effects)
            main = do input <- getLine
                      let label = "You said: "   -- Note: without "in"
                      putStrLn $ label ++ input
      Pure vs. Impure
      • Pure: always produces the same result for the same args; never has side effects; never alters state
      • Impure: may produce different results for the same args; may have side effects; may alter the global state of the program/system/world
      • Many bugs in programs are caused by unanticipated side effects or inconsistent behaviors given the same inputs; moreover, managing global side effects makes it more difficult to reason about programs
      • Haskell isolates side effects into I/O actions to provide a clear boundary, so that you know which parts of the system may alter state/world and which won't, thus facilitating reasoning
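That actions are ordinary values, and that evaluating them is different from performing them, can be sketched as follows. The names greetings, countActions and runAll are illustrative; sequence_ is the standard Prelude function that performs a list of actions in order.

```haskell
-- A plain list whose elements are I/O actions. Building the list
-- performs no I/O at all.
greetings :: [IO ()]
greetings = [putStrLn "hello", putStrLn "world"]

-- Pure code can inspect the list (here: count its elements) without
-- anything being printed.
countActions :: Int
countActions = length greetings

-- The effects only happen when the actions are performed from within
-- another action that is itself run (ultimately from main).
runAll :: IO ()
runAll = sequence_ greetings
```

Evaluating countActions prints nothing; running runAll prints the two lines in order.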
  40. I/O: Basics II

      Signatures of some I/O actions

          putStrLn :: String -> IO ()  -- A function: given a String, it returns an action
          getLine  :: IO String        -- A value: an action that yields a String

      Common I/O actions: getChar, putChar, getLine, putStr, putStrLn

      do blocks
      do is a convenient way to define a sequence of actions.
      The value of a do block is the value of the last action executed.
      In do blocks, you use <- to get results from I/O actions, and let to bind the results of pure code.
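A small sketch of these three points about do blocks: `<-` for results of actions, `let` for pure results, and the block's value being that of its last action. The names `shout` and `echoLoud` are hypothetical, chosen only for this example.

```haskell
import Data.Char (toUpper)

-- Pure code: no IO in the type, so it is bound with let inside do blocks
shout :: String -> String
shout s = map toUpper s ++ "!"

-- The value of this do block is the value of its last action, 'return loud'
echoLoud :: IO String
echoLoud = do
  line <- getLine          -- <- extracts the result of an I/O action
  let loud = shout line    -- let binds the result of pure code
  putStrLn loud
  return loud

main :: IO ()
main = do
  result <- echoLoud
  putStrLn ("The do block's value was: " ++ result)
```

Keeping the transformation in a pure function such as `shout` also makes it easy to test without performing any I/O.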
  41. Working with files I

      Modules: System.IO, System.Directory

      Example

          import System.IO
          import Data.Char (toUpper)

          main =
            do inh  <- openFile "in.txt" ReadMode    -- returns a Handle
               outh <- openFile "out.txt" WriteMode
               mainloop inh outh
               hClose inh
               hClose outh
            where
              mainloop inh outh = do ineof <- hIsEOF inh
                                     if ineof
                                       then return ()
                                       else do inStr <- hGetLine inh
                                               hPutStrLn outh (map toUpper inStr)
                                               mainloop inh outh
          -- Note: using hGetContents to lazily read the entire content of the file,
          -- the program would be much shorter

      Predefined Handles: stdin, stdout, stderr
      Note that the non-“h” functions are actually shortcuts for their “h” counterparts, e.g., getLine = hGetLine stdin
      Not all Handles are seekable (check with hIsSeekable), because a Handle may correspond to things other than files, such as network connections or terminals.
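The closing note in the example suggests that hGetContents would shorten the program. A possible version is sketched below; wrapping the logic in a function named `upperCopy` that takes the two paths as parameters is an assumption made here so that the code can be reused and tested.

```haskell
import Data.Char (toUpper)
import System.IO

-- Copies inPath to outPath, upper-casing the content.
-- hGetContents reads lazily, so no explicit EOF loop is needed.
upperCopy :: FilePath -> FilePath -> IO ()
upperCopy inPath outPath = do
  inh  <- openFile inPath ReadMode
  outh <- openFile outPath WriteMode
  contents <- hGetContents inh            -- lazy String for the whole file
  hPutStr outh (map toUpper contents)     -- consumes the input as it writes
  hClose inh
  hClose outh

main :: IO ()
main = upperCopy "in.txt" "out.txt"       -- assumes in.txt exists
```

Note that the input handle is closed only after hPutStr has consumed the whole string; closing it earlier would cut the lazy read short.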
  42. Working with files II

      Useful I/O file functions and data

          -- System.IO
          openFile :: FilePath -> IOMode -> IO Handle
          type FilePath = String
          data IOMode = ReadMode | WriteMode | AppendMode | ReadWriteMode

          hClose       :: Handle -> IO ()            -- Flushes pending writes and frees resources
          hGetLine     :: Handle -> IO String
          hPutStr      :: Handle -> String -> IO ()
          hGetContents :: Handle -> IO String        -- Get the entire content, lazily
          readFile     :: FilePath -> IO String
          writeFile    :: FilePath -> String -> IO ()

          hTell :: Handle -> IO Integer              -- Current position in file (byte offset)
          hSeek :: Handle -> SeekMode -> Integer -> IO ()
          data SeekMode = AbsoluteSeek | RelativeSeek | SeekFromEnd

          openTempFile, openBinaryTempFile :: FilePath -> String -> IO (FilePath, Handle)

          -- System.Directory
          renameFile :: FilePath -> FilePath -> IO ()
          removeFile :: FilePath -> IO ()
          getCurrentDirectory :: IO FilePath
          getTemporaryDirectory :: IO FilePath
          getHomeDirectory :: IO FilePath
          getDirectoryContents :: FilePath -> IO [FilePath]
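As an illustration of how several of these functions fit together, the following sketch writes to a temporary file, queries the position with hTell, seeks back with hSeek, and reads the rest of the line. The function name `seekDemo` and the file name template are hypothetical, and the byte offsets assume plain ASCII text.

```haskell
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Returns the offset after writing, and the text read back from offset 6.
seekDemo :: IO (Integer, String)
seekDemo = do
  tmpDir    <- getTemporaryDirectory
  (path, h) <- openTempFile tmpDir "seekdemo.txt"  -- handle is opened read/write
  hPutStr h "hello world"
  end  <- hTell h             -- byte offset after the write (11 for this ASCII text)
  hSeek h AbsoluteSeek 6      -- jump to the start of "world"
  rest <- hGetLine h          -- reads up to end of file: "world"
  hClose h
  removeFile path             -- clean up the temporary file
  return (end, rest)

main :: IO ()
main = seekDemo >>= print
```

This works because a regular file is seekable; on a Handle for a terminal or a network connection, hSeek would fail, which is why hIsSeekable exists.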
  43. Lazy I/O

      hGetContents yields a String representing a file's content, which is evaluated lazily.
      Data is actually read from the handle as the characters of the string (a list) are processed.
      When the characters are no longer used, Haskell's GC automatically frees the memory.
      Moreover, you are never required to consume all the data.

      Since opening a file, reading its content, processing it, writing the transformed content, and closing the file is a common process, shortcuts are provided: readFile and writeFile.

          -- The initial example is now much shorter
          import Data.Char (toUpper)
          main = do input <- readFile "in.txt"
                    writeFile "out.txt" (map toUpper input)

      readFile uses hGetContents internally; the underlying Handle is closed when the returned string is garbage-collected or when all the input has been consumed.
      writeFile closes its Handle when the supplied string has been entirely written.

      On lazy output
      The string to be written (with putStr or writeFile) is not loaded into memory all at once: these functions write data as it becomes available. In addition, data already written can be freed, as long as nothing else in the program needs it.

      Even shorter with interact :: (String -> String) -> IO ()
      interact accepts a mapping function, applies it to the content read from stdin (getContents), and sends the result to stdout.

          import Data.Char (toUpper)
          main = interact (map toUpper)
          -- To be called with: $ runghc Program.hs < in.txt > out.txt

          -- Alternative main, with filtering: keeps only the lines containing 'a'
          main = interact (unlines . filter (elem 'a') . lines)
          -- Note: here filter (elem 'a') is used at type [[Char]] -> [[Char]]
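The claim that you never have to consume all the data can be illustrated with a filter that keeps only the first few lines. This is a sketch with a hypothetical helper, `firstLines`; because evaluation is lazy, it even terminates on an infinite input string.

```haskell
-- Keep only the first n lines of the input.
-- Laziness means the rest of the input is never demanded.
firstLines :: Int -> String -> String
firstLines n = unlines . take n . lines

main :: IO ()
main = do
  -- Works on an infinite string: only the needed prefix is evaluated
  putStr (firstLines 2 (cycle "ab\n"))
  -- Same function used as a stdin-to-stdout filter
  interact (firstLines 3)
```

Run as `runghc Program.hs < in.txt`, the second part should print at most the first three lines of in.txt.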
  44. A simple command-line interpreter

          -- From Real World Haskell - Chapter 4
          -- Interact.hs

          import System.Environment (getArgs)

          interactWith fun fin fout = do
            input <- readFile fin      -- "in" is a reserved word, so it cannot name a variable
            writeFile fout (fun input)

          main = mainWith myFun
            where mainWith f = do
                    args <- getArgs
                    case args of
                      [input, output] -> interactWith f input output
                      _               -> putStrLn "Error: exactly two args needed"
                  myFun = id   -- Replace with your function
  45. Further readings

      “A tutorial on the universality and expressiveness of fold”: http://www.cs.nott.ac.uk/~pszgmh/fold.pdf
      “Theorems for free”: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875
  46. Misc

      The special value undefined will typecheck no matter where you use it. Compare:

          ghci> head [1,2,"a"]        -- Type error: No instance for (Num [Char])
          ghci> head [1,2,undefined]  -- 1
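The second GHCi line works because of lazy evaluation: undefined only fails if something actually forces it. A minimal sketch of this follows; the names `firstOfThree` and `listLength` are hypothetical, and catching the failure with `try` and `evaluate` from Control.Exception is an addition that the slides do not cover.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- head never inspects the third element, so undefined is never forced
firstOfThree :: Int
firstOfThree = head [1, 2, undefined]

-- length walks the list's spine but not its elements
listLength :: Int
listLength = length [undefined, undefined :: Int]

main :: IO ()
main = do
  print firstOfThree  -- 1
  print listLength    -- 2
  -- Forcing undefined itself raises an exception at run time
  r <- try (evaluate (undefined :: Int)) :: IO (Either SomeException Int)
  putStrLn (either (const "forcing undefined failed") show r)
```

So undefined is convenient as a placeholder while sketching code, but any path that evaluates it will crash.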
  47. Outline

      1 Basic Haskell programming
          Basics
          Intermediate stuff
      2 Real world Haskell
      3 Articles
          Scrap Your Boilerplate: A Practical Design Pattern for Generic Programming
  48. Scrap Your Boilerplate: A Practical Design Pattern for Generic Programming

      Summary
      Problem: consider programs that traverse data structures built from rich, mutually recursive data types.
      Such programs often contain a great deal of "boilerplate" code that simply walks the structure, hiding a small amount of "real" code that constitutes the reason for the traversal.
      Boilerplate code is tiresome to write and easy to get wrong. Moreover, it is vulnerable to change.
      Example: raise the salary of every person in a recursive data structure representing a company's organizational structure. If the company's organization changes, so does every algorithm that walks over it.
      Proposed solution: the paper describes a design pattern in which the traversal boilerplate is written once and for all (or generated mechanically), so that the programmer writes only the "real" code.
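The salary example can be sketched with only `Data.Data` and `Data.Typeable` from base. The company types below are hypothetical, and `mkT` and `everywhere` are simplified, hand-written versions of the combinators described in the paper (the `syb` package provides library versions), so treat this as an illustration of the idea, not as the paper's exact code.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE RankNTypes #-}

import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

-- Hypothetical company model (illustrative names)
newtype Salary  = Salary Double              deriving (Show, Eq, Data)
data Employee   = Employee String Salary     deriving (Show, Eq, Data)
data Dept       = Dept String Employee [Dept] deriving (Show, Eq, Data)
newtype Company = Company [Dept]             deriving (Show, Eq, Data)

-- Lift a type-specific function to a generic one: apply f where the
-- types match, behave as the identity everywhere else.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Apply a generic transformation to every node, bottom-up.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- The only "real" code: what to do when a Salary is found.
raise :: Double -> Company -> Company
raise k = everywhere (mkT incS)
  where incS (Salary s) = Salary (s * (1 + k))

main :: IO ()
main = print (raise 0.5 (Company [Dept "R&D" (Employee "Ann" (Salary 100)) []]))
```

If the company structure changes (say, a new layer between Company and Dept), `raise` should not need to change, since the traversal is derived from the Data instances.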
  49. References

      Marlow, S. (2012). Parallel and concurrent programming in Haskell. In Proceedings of the 4th Summer School Conference on Central European Functional Programming School, CEFP'11, pages 339–401, Berlin, Heidelberg. Springer-Verlag.
      O'Sullivan, B., Goerzen, J., and Stewart, D. (2008). Real World Haskell. O'Reilly Media, Inc., 1st edition.
