Organisation: | Copyright (C) 2021-2022 Olivier Boudeville |
---|---|
Contact: | about (dash) curry (at) esperide (dot) com |
Creation date: | Tuesday, September 7, 2021 |
Lastly updated: | Wednesday, September 7, 2022 |
Version: | 0.0.1 |
Status: | Work in progress |
Dedication: | Anyone wanting to discover the Haskell programming language. |
Abstract: | The purpose of this Curry cookbook is to help newcomers getting up to speed with functional programming as done based on the Haskell language. |
The latest version of this documentation is to be found at the official Curry website (http://curry.esperide.org).
This Curry documentation is also available in the PDF format, see Ceylan-Curry-cookbook-english.pdf.
Table of Contents
The purpose of this Curry cookbook is to help newcomers getting up to speed with functional programming when relying on the Haskell language [1] for that.
[1] | This cookbook is in some way a Haskell counterpart of what we did for Erlang, with the software stack whose first layer is Ceylan-Myriad. |
More precisely, this cookbook is to summarise the various elements that we found useful to remember when wanting to program in Haskell. We hope that it may be useful whereas either one never really practiced that art or already forgot essential elements of it.
So the goal of Curry is to be quicker to read/browse than it would be to start the learning again from scratch, and never reaching the latter parts thereof (knowing that the learning curve of Haskell is unfortunately rather steep).
Note
As this cookbook is being written as we are in the process of learning Haskell, errors, misconceptions and epic blunders are bound to occur in this text; if you detect such issues, please contact us so that we can correct this document accordingly. Thanks in advance!
The ≡ character denotes here equivalence of expressions; ex: 2 * x ≡ x + x.
For clarity, the formal characters like λ, ≫ and → are replaced by the one that you would type (namely \, >> and ->).
A programming style based on the application of functions (more information).
The (maximal) number of arguments expected by a function.
Precedence allows to define the order of the operations.
If the precedence of op1 is higher than the one of op2, then x op1 y op2 z shall be read as: (x op1 y) op2 z.
See the precedence table for all Haskell operators.
By using the :info GHCi command, the precedence levels of operators can be returned:
Prelude> :info + type Num :: * -> Constraint class Num a where (+) :: a -> a -> a ... -- Defined in ‘GHC.Num’ infixl 6 +
This determines how operators of the same precedence are grouped in the absence of parentheses.
Operators may be:
Operators are just functions that can be called:
Being central in FP, function application is symbolised just by a space.
It behaves as the operator of the highest precedence; therefore f a + b ≡ (f a) + b
(rather than f (a + b)) [2].
[2] | Another example: an expression that could be described informally as f(a,b) + c.d translates in Haskell as f a b + c*d or ((f a) b) + c*d". |
This pseudo-operator is left-associative: f g h ≡ ((f g) h).
f g x corresponds to the application of a function g and of a variable x at f (i.e. f(g,x)); if wanting to express f(g(x)), rely on f (g x) or, even better, on f.g x.
This operator, named cons (for construct), allows to define lists by appending successively elements, starting from the empty list ([]; designated as nil).
The : operator is right-associative:
x:y:z:l ≡ x:(y:(z:l))
Example:
[4,5,3] = 4:5:3:[]
It allows to specify the datatypes involved in the type definition of a function.
This operator is right-associative:
A -> B -> C -> ≡ A -> (B -> (C -> D))
Example:
-- mult :: Int -> (Int -> (Int -> Int)) mult :: Int -> Int -> Int -> Int -- mult x y z = ((mult x) y) z mult x y z = x*y*z
The . operator applies to functions, and returns the composition of two functions as a single one: (f . g) x = f (g x).
(.) :: (b -> c) -> (a -> b) -> (a -> c) f . g = \x -> f (g x)
Composition is associative: f . (g . h) = (f . g) . h.
Compositions allow to think in terms of functions and abstract out values.
An example to express the composition of a list of functions:
compose :: [a -> a] -> (a -> a) compose = foldr (.) id
The $ operator is another way, besides the usual function application (denoted by a space) , of applying arguments to a function.
This is the operator of least precedence. It has been introduced in order to further avoid the use of parentheses:
a $ b op c ≡ a (b op c)
When a $ is encountered, the expression on its right is applied as the argument to the function on its left, as if a virtual parenthesis was opened there, and a closing one added at the end of the expression.
This operator is right-associative: f $ g $ h ≡ (f $ (g $ h)).
So this operator is the "opposite" of the base function application in terms of precedence and associativity - but otherwise is the same in terms of definition (including typing):
($) :: (a -> b) -> a -> b f $ x = f x
An association from a set of arguments to a set of corresponding results.
double x = 2*x sum :: Num a => [a] -> a sum [] = 0 sum (n:ns) = n + sum ns
A function may not be defined for all (well-typed) combinations of its arguments.
For example head [] is to throw an exception.
A function is a value like the other datatypes (first-class citizen).
Thanks to the algebraic datatypes such as lists and tuples, a function may take and return an arbitrary number of values (thus including functions).
A function may be only partially applied: its last argument(s) may not specified in a given call. Then the value of the overall expression is a function of these remaining, unspecified arguments.
For example, if f :: (Int,Float) -> Bool then f 3 :: Float -> Bool.
By default functions are polymorphic, insofar as they will accept all types determined as compatible. For example zip :: [a] -> [b] -> [(a,b)] can handle any types for a and b.
Functions are defined from expressions and based on pattern matching, which operates on few possible patterns that are examined in turn, from top to bottom:
(&&) :: Bool -> Bool -> Bool True && b = b False && _ = False
fst :: (a,b) -> a fst (x,_) = x
tail :: [a] -> [a] tail (_:xs) = xs
The ‘_’ character acts as a "wildcard" (it matches any value).
Note that a non-empty list is (a bit surprisingly) to be pattern-matched with (x:xs) and not [x:xs]. For example:
product :: Num a => [a] -> a product [] = 1 product (n:ns) = n * product ns
The same name cannot be specified for two arguments to match when they are equal; a guard must be used for that. Finally pattern matching can operate on few possible patterns: function application, tuple and list ones, that's it.
Functions can be defined using guarded equations, to select which clause applies based on the first one from the top to evaluate to True. As Haskell can guarantee that these functions are pure, guards can be user-defined (rather than be taken in a limited selection of built-in guards).
For example:
abs | n >= 0 = n | otherwise = -n
Guards are also logical expressions to filter the values produced by earlier generators.
For example:
primes :: Int -> [Int] primes n = [ x | x <- [2..n], prime x]
A function taking its arguments one by one (one at a time, from left to right), each time returning a function of a decremented arity, is said to be curried.
Such 1-arity functions are those directly modelled by the lambda calculus, an essential base of the functional languages.
As the function arrow operator (->) is right associative, the type of functions is preferably written as:
f :: TArg1 -> TArg2 -> ... -> TArgn -> TResult
This corresponds to the following actual type:
f :: TArg1 -> ( TArg2 -> (... -> (TArgn -> TResult))...)
Consequently, as their arguments are to be applied one by one from left to right, the `` `` operator (function application is represented by the space character) is right-associative:
f a1 a2 ... an ≡ (((f a1) a2) ... an)
Functions can thus all be seen as curried ones, and can also be directly transposed as lambda ones; for example the following definitions define the same function:
add :: Int -> Int -> Int add x y = x + y add :: Int -> (Int -> Int) add = \x -> (\y -> x + y)
In this latter form:
[3] | This is certainly clearer yet, if no type specification is given, the arity of the function is only implicit then. |
They correspond to all the consequences that the evaluation of a code (program, function) incurs besides its returned value.
Example:
These impure events are difficult to manage yet are generally necessary, as the purpose of a program is to trigger "interesting side-effects"; a strictly pure program would most probably have no actual use (except using time, processing resources and adding the possibility of failure).
A type is a set of associated values.
e :: T means that expression e is of type T.
Haskell infers types at compilation (static typing), which prevents to discover many problems [4] at runtime.
[4] | Of course other problems may occur; for example 1 `div` 0 is well-typed yet its execution will fail. |
Some expressions that could be successfully evaluated can nevertheless be rejected on type grounds (ex: if True then 1 else False), but in practice it is hardly a problem.
Ex: type Pos = (Int,Int)
This does not introduce a new type; these are synonyms for Haskell.
Type declarations can have any number of parameters and be parametrised by other types, like in type Assoc k v = [(k,v)].
This introduces a new type; ex: data Bool = False | True.
| shall be read "or", and the values specified for the new type are called constructors. They start with an uppercase letter and a given constructor may not be used in more than one type.
Constructors may have arguments, in which case they are constructor functions:
data Shape = Circle Float | Rect Float Float
Here Circle :: Float -> Shape.
Such functions have no definition equations, and Circle 1.0 is fully evaluated and cannot be further simplified.
The ordering on the constructors of a type is determined by their position in its declaration (ex: False < True; constructors with arguments are ordered lexicographically regarding these argments.
They may also be:
For example all sorts of trees can be defined with it:
data Tree a = Leaf a | Node (Tree a) (Tree a)
data Tree a = Leaf | Node (Tree a) a (Tree a)
data Tree a b = Leaf a | Node (Tree a b) b (Tree a b)
data Tree a = Leaf a | Node a [Tree a]
Introduces a new type provided that it has a single constructor with a single argument.
Ex: newtype Nat = N Int.
By comparison:
Why data is not automatically transformed by Haskell internally in newtype whenever applicable is unsure (for us).
They are defined based on unions of values.
For example, in:
data MyFirstType = FirstValue | SecondValue | FirstConstructor T
MyFirstType is the name of the type. FirstValue and SecondValue are (constructor) values.
FirstConstructor is a constructor, and T is a type name (not a constructor).
Type and constructor names can be the same, as no ambiguity can occur.
Polymorphic types can be defined:
data MySecondType a = ThirdValue | SecondConstructor a Int
Example: `a`.
A list is an arbitrary-sized, homogeneous container.
Example:
l1 = [1.0, 7.0, 2.0] l2 = [] l3 = [100..] l4 = 1 : 2 : 3 : []
Like the map, the various folds encapsulate a classical recursion pattern.
foldr (r for right fold) evaluates from right to left, uses a right-associative operator and is directly recursive:
foldr :: (a -> b -> b) -> b -> [a] -> b foldr f v [] = v foldr f v (x:xs) = f x (foldr f v xs)
foldl (l for left fold) evaluates from left to right, uses a left-associative operator and relies on an accumulator:
foldl :: (a -> b -> a) -> a -> [b] -> a foldl f v [] = v foldl f v (x:xs) = foldl f (f v x) xs
A string is nothing but a list of (Unicode) characters: String ≡ [Char].
Example: "Hello!" or `a`:`b`:`c`:[]
A string cannot spread over multiple lines directly; the backslash (\) character is needed for that:
s = "This is a unique \ \line."
(any text between the two \ is ignored)
A newline is designated in a string by \n.
A type class is a set of types that support a set of functions called methods.
For example the Eq class is to gather all types that can be compared for egality (or inegality), based on the following two methods:
(==) :: a -> a -> Bool (/=) :: a -> a -> Bool
A type may belong to one or more classes.
Class constraints can be specified when typing a function. For example:
(+) :: Num a => a -> a -> a
The type variable a must be here an instance of class Num.
A type respecting at least one of such constraints is said overloaded, as is the corresponding expression.
Well-known classes are: Eq, Ord, Show, Read, Num, Integral, Fractional.
Various intuitive descriptions of a monad apply:
A monad M is a (type, return function, bind function) triplet with:
[5] | A monadic function is a function returning a monadic value. |
So return allows to convert a pure expression (not involving side-effects) of type t into an action (i.e. a transformation that may be executed) M t that is impure (with side effects) and deals with a value of type t. No opposite operation exists (that would transform a value of type type M t into one of type t, in the sense that side-effects cannot be rolled-back. A goal is thus to separate the pure parts of a program from its impure ones, which shall be minimised and isolated.
Mt >>= f (i.e. "bind Mt f") allows to apply the f function to the value of type t encapsulated in Mt; >>= drives the chaining of monadic functions (as such it handles functions, not values).
By composing >>= with return, any function g :: t -> t can be applied to a monad of type M t (here the types u and t in the definition of bind are the same).
These functions (f et g) are only aware of the value (t) encapsulated in the monad, not the monad (M t) itself.
So the developer composes a sequence of function calls (a pipeline) by chaining binds in an expression. Functions transform the values that they receive ; then the bind operator controls the returned monadic values (ex: it can enrich them outside of the view of these chained functions) and the next calls (ex: it can make them conditional).
So a monad of type M t is an algebraic datatype that derives from type t.
The elements of the monadic triplet shall respect following 3 axioms, return behaving like a neutral element from >>=:
A sequence of actions (transformations that may be executed) may be combined into a single composite action thanks to the do notation.
Let's consider types Vk for k in 1..n and R, and a function f of arity n and type V1 -> V2 -> ... -> Vn -> R.
do v1 <- a1 v2 <- a2 ... vn <- an return (f v1 v2 ... vn)
The <- operator in v1 <- a1, for v1 a value of type V1, means that the action (transformation function) a1 of type M V1 (i.e., if M is IO, of type WorldState -> (V1,WorldState)) shall be executed, and that its result (of type V1) shall be stored in variable v1. Then the (pure) function f can be evaluated with these resulting vk arguments, and its returned value of type R is returned as an action of type M R (ex: IO Int).
As a consequence, the do notation returns a new action, i.e. a single composite transformation that may be executed.
Refer to this article for more information.
A monad M defines a type that represents a processing and is associated to two operators:
The IO monad is probably the most well-known monad. IO t represents an imperative program taking no parameter and returning a value of type t.
For example:
A second example is the one of Maybe, as taken from this article
The Maybe-type (so here M = Maybe) is: Maybe t :: Just t | Nothing
The Maybe-return is:
return :: t -> Maybe t return undefined = Nothing return O -> Just O
The Maybe-bind is:
(>>=) :: Maybe t -> ( t -> Maybe u ) -> Maybe u Nothing ```>>=``` f = Nothing (Maybe a) ```>>=``` f = Just( f a )
The Maybe monad allows to handle errors: as soon as a function call fails, the next binds short-circuit the processing rather than letting the next functions be evaluated in turn.
A third example is seqn, to transform a list of actions with side-effects (ex: IO) returning a result of type a into a single of such action:
seqn :: Monad m => [m a] -> m [a] seqn [] = return [] seqn (act:acts) = do x <- act -- *performs* this 'act' action xn <- seqn acts return (x:xs)
See also this section about monad applications.
Most languages perform strict, eager evaluation: to evaluate a function call, first each of the supplied argument is fully evaluated, then the function itself is evaluated, based on the values jst computed for its arguments.
On the contrary, lazy evaluation strives to defer as much as possibly evaluations - possibly delaying up to the point of having never to perform them.
Lazy evaluation is convenient to express higher-level programs (ex: handling infinite lists such as [1,...]), yet makes it harder to predict the behaviour of programs at least in terms of resource consumption (time, memory, etc.).
For an increased control, strict evaluation can nevertheless be forced.
An arrow (a.k.a. "bolt") is a type class representing computations in a pure way; this monad generalisation allows to express the relationships between the logical steps of a processing.
Refer to this article for more information.
The lambda calculus is a simple yet powerful theory of functions. Its basis is to consider that all program elements are functions. An expression may include functions that are not yet defined and that are then considered as variables.
This is a formal system designed by Alonzo Church in the 1930s to define the concepts of function and (function) application. It relies on λ-expressions, where λ denotes the binding of a variable. For example, if M is a λ-expression, then λx.M is also one, and represents the function that to the x variable associates M (i.e. \x -> M x here).
The λ-calculus has been the first formalism defining and characterising the recursive functions, and as such is essential to the theory of computing.
It is used as theoretical programming language and also as metalanguage for formal proof.
λ-calculus may or may not be typed.
Refer to this article for more information.
They cannot be used to name functions or variables:
case, class, data, deriving, do, else, if, import, in, infix, infixl, infixr, instance, let, of, module, newtype, then, type, where
A single-line comment starts with -- and extends to the end of the line.
Multi-line comments start with {- and extend to -}.
Example:
size = 3 -- This is a constant here. {- Comments are essential for understanding. Consider writing them. -}
Comments can be nested.
See also the comment conventions for the Haddock documentation generator.
Literate programming (where text is by default a comment, unless being specifically designated as code) is possible in an Haskell script, whose extension shall then be .lhs (for Literate Haskell).
They include a compiler (GHC) and an interpreter (GHCi).
Haddock generates documentation from annotated Haskell source code (typically libraries).
So that they are taken into account by Haddock, comments above function definitions should start with {- |, and those next to parameter types with -- ^.
Example of use:
-- |The 'square' function squares an integer. -- It takes one argument, of type 'Int'. square :: Int -> Int square x = x * x {-| The 'cube' function cubes an integer. It takes one argument, of type 'Int'. -} cube x = x * square x data T a b = C1 a b -- ^ This is the documentation for the 'C1' constructor | C2 a b -- ^ This is the documentation for the 'C2' constructor
See this page for more information regarding markup.
One should avoid tabulations (i.e. prefer spaces).
Generally the least number of spaces is preferred; like in: sum $ map sqrt [1..10].
The names of functions and arguments start with a lowercase character and are followed by any number of: numbers, letters (of any case), underscores and single quotes.
The names of types and constructors start with an uppercase letter.
Most Haskell developers seem to strongly dislike typing, often resulting in cryptic names for functions (ex: fst) and variable names (often reduced to a single character whose meaning is never disclosed).
A lot of efforts went especially to save keystrokes and more precisely to remove as much as possible the need for parentheses (function application being denoted as a space, the . and $ operators being introduced, etc.). This allows very compact definitions of functions out of functions (ellipsing values as such), and an increased expressivity, sometimes at the expense of clarity. Extra documentation may alleviate this problem.
Single-letter variable names often denote their type (ex: c for a Char variable), and are suffixed by s to indicate this is a list thereof (ex: bs for a variable of type [Bool], css for a [[Char]] one).
Indentation matters, as spaces denote scopes:
a = b + c where b = 1 c = 2 d = a * 2
An alternative layout based on curly braces and semi-colons exists, yet its use is discouraged.
A function body shall be indented of at least one space compared to the function name (if indented at all).
As any where or let use must be, indentation-wise, between the name and the body of a function, we prefer, compared to the function name:
For example:
square x = sq where sq = x * x
On Arch Linux (see this page for more information): pacman -Sy ghc cabal-install stack.
See also our corresponding script for continuous integration.
Except for performances, programs can be tested directly with ghci, like in:
$ ghci Foobar.hs
The very essential GHCi commands are:
For example:
$ ghci GHCi, version 8.10.5: https://www.haskell.org/ghc/ :? for help Prelude> :type not not :: Bool -> Bool
Functions can be directly redefined:
Prelude> fac n = product [1..n] Prelude> fac 5 120 Prelude> fac _ = 0 Prelude> fac 5 0
One may rely on the Haskell transposition of our simple, usual make-based build system (refer to the GNUmake* files; see this section of our Erlang-based Myriad counterpart for more details).
We would certainly recommend browsing the pleasant Learn You a Haskell for Great Good! website or, even better, buying their book; another worthwhile book is Programming in Haskell, whose author is Graham Hutton, that we found interesting and well-written as well.
Of course the official Haskell website is also of interest.
Bugs, questions, remarks, patches, requests for enhancements, etc. are to be reported to the project interface (typically issues) or directly at the email address mentioned at the beginning of this cookbook.
If you have information more detailed or more recent than those presented in this document, if you noticed errors, neglects or points insufficiently discussed, drop us a line! (for that, follow the Support guidelines).