Read Thinking Functionally with Haskell Online
Authors: Richard Bird
If we like we can declare a precedence level and an order of association for
(+++)
, but we won’t spell out how.
Sections and lambda expressions
It is a matter of style, but in the main we prefer to write scripts in which all the little helper functions are named explicitly. Thus if we need a function that adds 1 to a number, or doubles a number, then we might choose to name such functions explicitly:
succ, double :: Integer -> Integer
succ
n = n+1
double n = 2*n
However, Haskell provides alternative ways of naming these two functions, namely
(+1)
and
(2*)
. The device is called a
section
. In a section one of the arguments of an operator is included along with the operator. Thus
(+1) n = n+1
(0<) n = 0
(<0) n = n<0
(1/) x = 1/x
Sections are certainly attractive ways of naming simple helper functions and we henceforth accept them onto our list of Good Things to Use in Moderation.
There is one important caveat about sections: although
(+1)
is the section that adds 1 to a number,
(-1)
is
not
the section that subtracts 1. Instead
(-1)
is just the number −1. Haskell uses the minus sign both as the binary operation of subtraction and as a prefix to denote negative numbers.
Now suppose we want a function that doubles a number and then adds 1 to the answer. This function is captured by the composition
(+1) . (*2)
of two sections. But the result is unsatisfying because it looks a little abstruse; anyone reading it would have to pause for a moment to see what it meant. The alternative seems to be to give the function a name, but what would be a suitable name? Nothing helpful really comes to mind.
The alternative is to use a
lambda expression
\n -> 2*n+1
. It is called a lambda expression because mathematically the function would be written as λ
n.
2∗
n
+1. Read the expression as ‘that function of
n
which returns 2∗
n
+1’. For example,
ghci> map (\n -> 2*n+1) [1..5]
[3,5,7,9,11]
Once in a while a lambda expression seems the best way to describe some function, but only once in a while and we will take them out of the box only on rare occasions.
Haskell evaluates an expression by reducing it to its simplest possible form and printing the result. For example, suppose we have defined
sqr :: Integer -> Integer
sqr x = x*x
There are basically two ways to reduce the expression
sqr (3+4)
to its simplest possible form, namely
49
. Either we can evaluate
3+4
first, or else apply the definition of
sqr
first:
sqr (3+4) | sqr (3+4) |
= sqr 7 | = let x = 3+4 in x*x |
= let x = 7 in x*x | = let x = 7 in x*x |
= 7*7 | = 7*7 |
= 49 | = 49 |
The number of reduction steps is the same in each case, but the order of the reduction steps is slightly different. The method on the left is called
innermost reduction
and also
eager evaluation
; the one on the right is called
outermost reduction
or
lazy evaluation
. With eager evaluation arguments are always evaluated before a function is applied. With lazy evaluation the definition of a function is installed at once and only when they are needed are the arguments to the function evaluated.
Doesn’t seem much of a difference, does it? But consider the following (slightly abbreviated) evaluation sequences concerning the function
fst
that returns the first element of a pair, so
fst (x,y) = x
:
fst (sqr 1,sqr 2) | fst (sqr 1,sqr 2) |
= fst (1*1,sqr 2) | = let p = (sqr 1,sqr 2) |
= fst (1,sqr 2) | in fst p |
= fst (1,2*2) | = sqr 1 |
= fst (1,4) | = 1*1 |
= 1 | = 1 |
The point here is that under eager evaluation the value
sqr 2
is computed, while under lazy evaluation that value is not needed and is not computed.
Now suppose we add the definitions
infinity :: Integer
infinity = 1 + infinity
three :: Integer -> Integer
three x = 3
Evaluating
infinity
will cause GHCi to go into a long, silent think trying to compute
1 + (1 + (1 + (1 + (1 + ....
until eventually it runs out of space and returns an error message. The value of
infinity
is ⊥.
Again there are two ways to evaluate
three infinity
:
three infinity | three infinity |
= three (1+infinity) | = let x = infinity in 3 |
= three (1+(1+infinity)) | = 3 |
= ... |
Here eager evaluation gets stuck in a loop trying to evaluate
infinity
, while lazy evaluation returns the answer 3 at once. We don’t need to evaluate the argument of
three
in order to return
3
.
One more definition, a version of the factorial function:
factorial :: Integer -> Integer
factorial n = fact (n,1)
fact :: (Integer,Integer) -> Integer
fact (x,y) = if x==0 then y else fact (x-1,x*y)
This is another example of a
recursive definition
(the definition of
infinity
was also recursive, and so was the function
song
in the previous chapter). Expressions involving recursive functions are evaluated like any other definition.
Here the two evaluation schemes result in the following sequence of reduction steps (we hide the steps involving simplification of the conditional expression to make another point):
factorial 3 | factorial 3 |
= fact (3,1) | = fact (3,1) |
= fact (3-1,3*1) | = fact (3-1,3*1) |
= fact (2,3) | = fact (2-1,2*(3*1)) |
= fact (2-1,2*3) | = fact (1-1,1*(2*(3*1))) |
= fact (1,6) | = 1*(2*(3*1)) |
= fact (1-1,1*6) | = 1*(2*3) |
= fact (0,6) | = 1*6 |
= 6 | = 6 |
The point to appreciate is that, while the number of reduction steps is basically the same, lazy evaluation requires much more space to achieve the answer. The expression
1*(2*(3*1))
is built up in memory before being evaluated.
The pros and cons of lazy evaluation are briefly as follows. On the plus side, lazy evaluation terminates whenever
any
reduction order terminates; it never takes more steps than eager evaluation, and sometimes infinitely fewer. On the minus side, it can require a lot more space and it is more difficult to understand the precise order in which things happen.
Haskell uses lazy evaluation. ML (another popular functional language) uses eager evaluation. Exercise D explores why lazy evaluation is a Good Thing. Lazy evaluation is considered further in
Chapter 7
.
A Haskell function
f
is said to be
strict
if
f undefined = undefined
, and
nonstrict
otherwise. The function
three
is nonstrict, while
(+)
is strict in both arguments. Because Haskell uses lazy evaluation we can define nonstrict functions. That is why Haskell is referred to as a
nonstrict
functional language.
Haskell has built-in (or primitive) types such as
Int
,
Float
and
Char
. The type
Bool
of boolean values is defined in the standard prelude:
data Bool = False | True
This is an example of a
data declaration
. The type
Bool
is declared to have two data
constructors
,
False
and
True
. The type
Bool
has three values, not two:
False
,
True
and
undefined :: Bool
. Why do we need that last value? Well, consider the function
to :: Bool -> Bool
to b = not (to b)
The prelude definition of
not
is
not :: Bool -> Bool
not True
= False
not False = True
The definition of
to
is perfectly wellformed, but evaluating
to True
causes GHCi to go into an infinite loop, so its value is ⊥ of type
Bool
. We will have much more to say about data declarations in future chapters.
Haskell has built-in compound types, such as
[Int]
a list of elements, all of type
Int
(Int,Char)
a pair consisting of an
Int
and a
Char
(Int,Char,Bool)
a triple
()
an empty tuple
Int -> Int
a function from
Int
to
Int
The sole inhabitant of the type
()
is also denoted by
()
. Actually, there is a second member of
()
, namely
undefined :: ()
. Now we can appreciate that there is a value ⊥ for every type.
As we have already said, when defining values or functions it is always a good idea to include the type signature as part of the definition.
Consider next the function
take n
that takes the first
n
elements of a list. This function made its appearance in the previous chapter. For example,
take 3 [1,2,3,4,5] | = [1,2,3] |
take 3 "category" | = "cat" |
take 3 [sin,cos] | = [sin,cos] |
What type should we assign to
take
? It doesn’t matter what the type of the elements of the list is, so
take
is what is called a
polymorphic
function and we denote its type by
take :: Int -> [a] -> [a]
The
a
is a
type variable
. Type variables begin with a lowercase letter. Type variables can be instantiated to any type.
Similarly,
(++) :: [a] -> [a] -> [a]
map
:: (a -> b) -> [a] -> [b]
(.)
:: (b -> c) -> (a -> b) -> (a -> c)
The last line declares the polymorphic type of functional composition.
Next, what is the type of
(+)
? Here are some suggestions:
(+) :: Int -> Int -> Int
(+) :: Float -> Float -> Float
(+) :: a -> a -> a
The first two types seem too specific, while the last seems too general: we can’t add two functions or two characters or two booleans, at least not in any obvious way.
The answer is to introduce
type classes
:
(+) :: Num a => a -> a -> a
This declaration asserts that
(+)
is of type
a -> a -> a
for any
number type
a
. A type class, such as
Num
, has a collection of named methods, such as
(+)
, which can be defined differently for each instance of the type class. Type classes therefore provide for
overloaded
functions, functions with the same name but different definitions. Overloading is another kind of polymorphism.
Numbers are rather complicated, and are explained in more detail in the following chapter, so we illustrate type classes with a simpler type class
class Eq a where
(==),(/=) :: a -> a -> Bool
x /= y = not (x == y)
This introduces the Equality type class, members of which can use one and the same equality test
(==)
and inequality test
(/=)
. There is a
default
definition of
(/=)
as part of the class, so we only have to provide a definition of
(==)
.
To become a member of the
Eq
club we have to define an
instance
. For example,
instance Eq Bool where
x == y = if x then y else not y
instance Eq Person where
x == y = (pin x == pin y)