BOOK: Thinking Functionally with Haskell

This definition of ones creates an infinite, not a cyclic, list. We could define

repeat x = xs where xs = x:xs

Now the function repeat is defined in terms of a cyclic list. The second definition (call it repeat2) is faster to evaluate than the first (call it repeat1) because there is less overhead:
ghci> last $ take 10000000 $ repeat1 1
1
(2.95 secs, 800443676 bytes)
ghci> last $ take 10000000 $ repeat2 1
1
(0.11 secs, 280465164 bytes)
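For reference, both definitions can be written out as a complete program. The definition of repeat1 is not shown in this excerpt; the version below is a sketch assuming it is the standard recursive one:

```haskell
-- repeat1: presumed standard recursive definition; every cons
-- cell of the result is freshly allocated
repeat1 :: a -> [a]
repeat1 x = x : repeat1 x

-- repeat2: the tail of the list is the list itself, so only one
-- cons cell is ever built (a cyclic list)
repeat2 :: a -> [a]
repeat2 x = xs where xs = x : xs

main :: IO ()
main = do
  print (take 5 (repeat1 1))  -- [1,1,1,1,1]
  print (take 5 (repeat2 1))  -- [1,1,1,1,1]
```

The observable behaviour is identical; only the allocation pattern differs, which is what the timings above measure.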

As another example, consider the following three definitions of the standard prelude function iterate:

iterate1 f x = x:iterate1 f (f x)
iterate2 f x = xs where xs = x:map f xs
iterate3 f x = x:map f (iterate3 f x)

All three functions have type (a -> a) -> a -> [a] and produce an infinite list of the iterates of f applied to x. The three functions are equal, but the induction principle reviewed earlier doesn’t seem to be applicable in proving this assertion because there is no obvious argument on which to perform the induction. More on this later. The first definition is the one used in the standard prelude, but it does not create a cyclic list. The second definition does, and the third is obtained from the second by eliminating the where clause. Assuming f x can be computed in constant time, the first definition takes Θ(n) steps to compute the first n elements of the result, but the third takes Θ(n²) steps:
iterate3 (2*) 1
= 1:map (2*) (iterate3 (2*) 1)
= 1:2:map (2*) (map (2*) (iterate3 (2*) 1))
= 1:2:4:map (2*) (map (2*) (map (2*) (iterate3 (2*) 1)))

Evaluating the nth element requires n applications of (2*), so it takes Θ(n²) steps to produce the first n elements.

That leaves the second definition. Does it take linear or quadratic time? The evaluation of iterate2 (2*) 1 proceeds as follows:

xs where xs = 1:map (2*) xs
= 1:ys where ys = map (2*) (1:ys)
= 1:2:zs where zs = map (2*) (2:zs)
= 1:2:4:ts where ts = map (2*) (4:ts)

Each element of the result is produced in constant time, so iterate2 (2*) 1 takes Θ(n) steps to produce n elements.
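As a quick sanity check (not a proof of their equality), the three definitions can be compared on a finite prefix:

```haskell
iterate1, iterate2, iterate3 :: (a -> a) -> a -> [a]
iterate1 f x = x : iterate1 f (f x)        -- no cyclic list; linear time
iterate2 f x = xs where xs = x : map f xs  -- cyclic list; linear time
iterate3 f x = x : map f (iterate3 f x)    -- no cyclic list; quadratic time

main :: IO ()
main = print (take 6 (iterate2 (2*) 1))  -- [1,2,4,8,16,32]
```

All three agree on every finite prefix; the difference is purely one of evaluation cost.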

Let us now develop a cyclic list to generate an infinite list of all the primes. To start with we define

primes = [2..] \\ composites
composites = mergeAll multiples
multiples = [map (n*) [n..] | n <- [2..]]

where (\\) subtracts one strictly increasing list from another:

(x:xs) \\ (y:ys) | x<y = x:(xs \\ (y:ys))
                 | x==y = xs \\ ys
                 | x>y = (x:xs) \\ ys

Here, multiples consists of the list of all multiples of 2 from 4 onwards, all multiples of 3 from 9 onwards, all multiples of 4 from 16 onwards, and so on. Merging these lists gives the infinite list of all the composite numbers, and taking its complement with respect to [2..] gives the primes. We saw the definition of mergeAll in the previous section.
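Assembled into a complete program, this first version looks as follows. The definitions of merge, xmerge and mergeAll come from the previous section, which is not part of this excerpt, so the versions below are a plausible reconstruction: xmerge assumes its first list is nonempty and emits its head before inspecting the second list, which is what makes folding over an infinite list of lists productive.

```haskell
primes :: [Integer]
primes = [2..] \\ composites

composites :: [Integer]
composites = mergeAll multiples

multiples :: [[Integer]]
multiples = [map (n*) [n..] | n <- [2..]]

-- subtract one strictly increasing list from another
(\\) :: Ord a => [a] -> [a] -> [a]
(x:xs) \\ (y:ys)
  | x <  y = x : (xs \\ (y:ys))
  | x == y = xs \\ ys
  | x >  y = (x:xs) \\ ys

-- merge two strictly increasing lists, discarding duplicates
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x <  y = x : merge xs (y:ys)
  | x == y = x : merge xs ys
  | x >  y = y : merge (x:xs) ys

-- like merge, but emits the head of the first list without
-- inspecting the second list at all
xmerge :: Ord a => [a] -> [a] -> [a]
xmerge (x:xs) ys = x : merge xs ys

mergeAll :: Ord a => [[a]] -> [a]
mergeAll = foldr1 xmerge

main :: IO ()
main = print (take 10 primes)  -- [2,3,5,7,11,13,17,19,23,29]
```

The fold never terminates, but laziness saves us: each xmerge produces the head of its first list at once, and the head of the list of multiples of n is n², so only finitely many lists are ever inspected to produce any given prefix of composites.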

So far, so good. But the algorithm can be made many times faster by observing that too many multiples are being merged. For instance, having constructed the multiples of 2 there is no need to construct the multiples of 4, or of 6, and so on. What we really would like to do is construct only the multiples of the primes. That leads to the idea of ‘tying the recursive knot’ and defining

primes = [2..] \\ composites
  where
  composites = mergeAll [map (p*) [p..] | p <- primes]

What we have here is a cyclic definition of primes. It looks great, but does it work? Unfortunately, it doesn’t: primes produces the undefined list. In order to determine the first element of primes the computation requires the first element of composites, which in turn requires the first element of primes. The computation gets stuck in an infinite loop. To solve the problem we have to pump-prime (!) the computation by giving it the first prime explicitly. We have to rewrite the definition as

primes = 2:([3..] \\ composites)
  where
  composites = mergeAll [map (p*) [p..] | p <- primes]

But this still doesn’t produce the primes! The reason is a subtle one and is quite hard to spot. It has to do with the definition

mergeAll = foldr1 xmerge

The culprit is the function foldr1. Recall the Haskell definition:

foldr1 :: (a -> a -> a) -> [a] -> a
foldr1 f [x] = x
foldr1 f (x:xs) = f x (foldr1 f xs)

The order of the two defining equations is significant. In particular,

foldr1 f (x:undefined) = undefined

because the list argument is first matched against x:[], causing the result to be undefined. That means

mergeAll [map (p*) [p..] | p <- 2:undefined] = undefined

What we wanted was

mergeAll [map (p*) [p..] | p <- 2:undefined] = 4:undefined

To effect this change we have to define mergeAll differently:

mergeAll (xs:xss) = xmerge xs (mergeAll xss)

Now we have

mergeAll [map (p*) [p..] | p <- 2:undefined]
= xmerge (map (2*) [2..]) undefined
= xmerge (4:map (2*) [3..]) undefined
= 4:merge (map (2*) [3..]) undefined
= 4:undefined

This version of mergeAll behaves differently on finite lists from the previous one. Why?

With this final change we claim that primes does indeed get into gear and produces the primes. But how can the claim be proved? To answer this question we need to know something about the semantics of recursively defined functions and other values in Haskell, and how infinite lists are defined as limits of their partial approximations.
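As an empirical check (though not a proof), the final cyclic program can be assembled and run. As before, merge and xmerge are plausible reconstructions of the previous section's definitions, which are not shown in this excerpt:

```haskell
-- cyclic definition: composites is built from primes itself,
-- pump-primed with the explicit first prime 2
primes :: [Integer]
primes = 2 : ([3..] \\ composites)
  where
  composites = mergeAll [map (p*) [p..] | p <- primes]

-- subtract one strictly increasing list from another
(\\) :: Ord a => [a] -> [a] -> [a]
(x:xs) \\ (y:ys)
  | x <  y = x : (xs \\ (y:ys))
  | x == y = xs \\ ys
  | x >  y = (x:xs) \\ ys

-- merge two strictly increasing lists, discarding duplicates
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x <  y = x : merge xs (y:ys)
  | x == y = x : merge xs ys
  | x >  y = y : merge (x:xs) ys

-- emit the head of the first list without inspecting the second
xmerge :: Ord a => [a] -> [a] -> [a]
xmerge (x:xs) ys = x : merge xs ys

-- no foldr1: the single equation never pattern-matches the
-- tail of the list of lists before producing output
mergeAll :: Ord a => [[a]] -> [a]
mergeAll (xs:xss) = xmerge xs (mergeAll xss)

main :: IO ()
main = print (take 10 primes)  -- [2,3,5,7,11,13,17,19,23,29]
```

Each element of composites is produced from an already-computed prefix of primes, so the knot unwinds one prime at a time instead of getting stuck.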

9.3 Infinite lists as limits

In mathematics, certain values are defined as limits of infinite sequences of approximations of simpler values. For example, the irrational number

π = 3.14159265358979323846···

can be defined as the limit of the infinite sequence of rational approximations

3, 3.1, 3.14, 3.141, 3.1415, ...

The first element of the sequence, 3, is a fairly crude approximation to π. The next element, 3.1, is a little better; 3.14 is better still, and so on.

Similarly, an infinite list can also be regarded as the limit of a sequence of approximations. For example, the infinite list [1..] is the limit of the infinite sequence of partial lists

⊥, 1:⊥, 1:2:⊥, 1:2:3:⊥, ...

Again, the sequence consists of better and better approximations to the intended limit. The first term, ⊥, is the undefined element, and thus a very crude approximation: it tells us nothing about the limit. The next term, 1 :⊥, is a slightly better approximation: it tells us that the limit is a list whose first element is 1, but says nothing about the rest of the list. The following term, 1 : 2 :⊥, is a little better still, and so on. Each successively better approximation is derived by replacing ⊥ with a more defined value, and thus gives more information about the limit.

Here is another sequence of approximations whose limit is [1..]:

⊥, 1 : 2 :⊥, 1 : 2 : 3 : 4 :⊥, 1 : 2 : 3 : 4 : 5 : 6 :⊥, . . .

This sequence is a subsequence of the one above but it converges to the same limit.

Here is a sequence of approximations that does not converge to a limit:

⊥, 1 :⊥, 2 : 1 :⊥, 3 : 2 : 1 :⊥, . . .

The problem with this sequence is that it gives conflicting information: the second term says that the limit begins with 1. However, the third term says that the limit begins with 2, and the fourth term says that it begins with 3, and so on. No approximation tells us anything about the intended limit and the sequence does not converge.

It should not be thought that the limit of a sequence of lists is necessarily infinite. For example, the sequence

⊥, 1:⊥, 1:[], 1:[], ...

in which every element after the first two is [1], is a perfectly valid sequence with limit [1]. Similarly,

⊥, 1:⊥, 1:2:⊥, 1:2:⊥, ...

is a sequence with limit 1:2:⊥. Finite and partial lists are limits of sequences possessing only a finite number of distinct elements.

The way to formalise the property that an infinite sequence of partial lists converges to a limit is to introduce the notion of an approximation ordering ⊑ on the elements of each type. The assertion x ⊑ y means that x is an approximation to y. The ordering ⊑ will be reflexive (x ⊑ x), transitive (x ⊑ y and y ⊑ z implies x ⊑ z), and anti-symmetric (x ⊑ y and y ⊑ x implies x = y). However, it is not the case that every pair of elements has to be comparable by ⊑. Thus ⊑ is what is known as a partial ordering. Note that ⊑ is a mathematical operator (like =), and not a Haskell operator returning boolean results.

The approximation ordering for numbers, booleans, characters and any other enumerated type is defined by

x ⊑ y ≡ (x = ⊥) ∨ (x = y).

The first clause says that ⊥ is an approximation to everything. In other words, ⊥ is the bottom element of the ordering. This explains why ⊥ is pronounced ‘bottom’. The value ⊥ is the bottom element of ⊑ for every type. The above ordering is flat. With a flat ordering one either knows everything there is to know about a value, or one knows absolutely nothing.

The approximation ordering on the type (a,b) is defined by ⊥ ⊑ (x,y) and

(x,y) ⊑ (x′,y′) ≡ (x ⊑ x′) ∧ (y ⊑ y′).

The occurrences of ⊑ on the right refer to the orderings on the types a and b, respectively. The ordering ⊑ on (a,b) is not flat, even when the component orderings are. For example, in (Bool,Bool) we have the following chain of distinct elements:

⊥ ⊑ (⊥,⊥) ⊑ (⊥,False) ⊑ (True,False).

Note that in Haskell the pair (⊥,⊥) is distinct from ⊥:

ghci> let f (a,b) = 1
ghci> f (undefined,undefined)
1
ghci> f undefined
*** Exception: Prelude.undefined

The ordering ⊑ on [a] is defined by ⊥ ⊑ xs and

[] ⊑ xs ≡ xs = [],
(x:xs) ⊑ (y:ys) ≡ (x ⊑ y) ∧ (xs ⊑ ys).

These equations should be read as an inductive definition of a mathematical assertion, not as a Haskell definition. The second condition says that [] approximates only itself, and the third condition says that (x:xs) is an approximation to (y:ys) if and only if x is an approximation to y and xs is an approximation to ys. The first occurrence of ⊑ on the right-hand side refers to the approximation ordering on the type a.

As two examples, we have

[1,⊥,3] ⊑ [1,2,3] and 1:2:⊥ ⊑ [1,2,3].

However, 1:2:⊥ and [1,⊥,3] are not related by ⊑.
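Partial lists can be written directly in Haskell using undefined for ⊥, and lazy evaluation lets a program observe exactly the defined prefix of an approximation. This small illustration is not part of the book's text:

```haskell
import Control.Exception (SomeException, evaluate, try)

-- the partial list 1 : 2 : ⊥
partial :: [Int]
partial = 1 : 2 : undefined

main :: IO ()
main = do
  print (take 2 partial)  -- the defined prefix: [1,2]
  -- demanding anything beyond the prefix hits ⊥
  r <- try (evaluate (partial !! 2)) :: IO (Either SomeException Int)
  case r of
    Left _  -> putStrLn "third element is undefined"
    Right v -> print v
```

In the approximation ordering, partial ⊑ [1,2,3] and partial ⊑ [1,2] ++ [3..] both hold: any list extending the prefix 1:2: is above it.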

The approximation ordering for each type T is assumed to have another property in addition to those described above: each chain of approximations x₀ ⊑ x₁ ⊑ ··· has to possess a limit which is also a member of T. The limit, which we denote by limₙ→∞ xₙ, is defined by two conditions:

1. xₙ ⊑ limₙ→∞ xₙ for all n. This condition states that the limit is an upper bound on the sequence of approximations.

2. If xₙ ⊑ y for all n, then limₙ→∞ xₙ ⊑ y. This condition states that the limit is the least upper bound.

The definition of the limit of a chain of approximations applies to every type. Partial orderings possessing this property are called complete, and every Haskell type is a complete partial ordering (CPO for short). In particular, the property, introduced in Chapter 6, of a mathematical assertion P being chain complete can now be formalised as
