Thinking Functionally with Haskell (20 page)

BOOK: Thinking Functionally with Haskell
12.99Mb size Format: txt, pdf, ePub

f undefined
undefined

So the first condition is that
f
is a strict function.

Case
[]

f (foldr g a [])
foldr h b []

=
{
foldr.1
}
=
{
foldr.1
}

f a
b

The second condition is that
f a = b
.

Case
x:xs

f (foldr g a (x:xs))
foldr h b (x:xs)
=
{
foldr.2
}
=
{
foldr.2
}

f (g x (foldr g a xs))
h x (foldr h b xs)
=
{induction}

h x (f (foldr g a xs))

The third condition is met by
f (g x y) = h x (f y)
for all
x
and
y
.

Let us apply the fusion law to show that

foldr f a . map g = foldr h a

Recall that
map g = foldr ((:) . g) []
. Looking at the conditions of the fusion law we have that
foldr f a undefined = undefined

foldr f a []
= a

So the first two fusion conditions are satisfied. The third one is

foldr f a (g x:xs) = h x (foldr f a xs)

The left-hand side simplifies to

f (g x) (foldr f a xs)

so we can define
h x y = f (g x) y
. More briefly,
h = f . g
. Hence we have the useful rule:
foldr f a . map g = foldr (f . g) a

In particular,

double . sum = sum . map double

= foldr ((+) . double) 0

 

length . concat = sum . map length

= foldr ((+) . length) 0

Other simple consequences of the fusion law are explored in the exercises.

A variant

Sometimes having the empty list around is a pain. For example, what is the minimum element in an empty list? For this reason, Haskell provides a variant on
foldr
, called
foldr1
, restricted to nonempty lists. The Haskell definition of this function is
foldr1 :: (a -> a -> a) -> [a] -> a

foldr1 f [x]
= x

foldr1 f (x:xs) = f x (foldr1 f xs)

So we can define

minimum, maximum :: Ord a => [a] -> a

minimum = foldr1 min

maximum = foldr1 max

and avoid two other explicit recursions. Actually the Haskell definition of
foldr1
is not as general as it should be, but we will leave that discussion to an exercise.

6.4 The function
foldl

Recall that

foldr (@) e [w,x,y,z] = w @ (x @ (y @ (z @ e)))

Sometimes a more convenient pattern for the right-hand side is

(((e @ w) @ x) @ y) @ z

This pattern is encapsulated by a function
foldl
(fold from the left):
foldl :: (b -> a -> b) -> b -> [a] -> b

foldl f e [] = e

foldl f e (x:xs) = foldl f (f e x) xs

As an example, suppose we are given a string, such as
1234.567
, representing a real number and we want to compute its integer part and fractional part. We could define
ipart :: String -> Integer

ipart xs = read (takeWhile (/= '.') xs) :: Integer

 

fpart :: String -> Float

fpart xs = read ('0':dropWhile (/= '.' xs) :: Float

This uses the function
read
of the type class
Read
. Note by the way that
.567
is not a well-formed literal in Haskell. It is necessary to include at least one digit before and after the decimal point to ensure that the decimal point cannot be mistaken for functional composition. For example,
ghci> :t 3 . 4

3 . 4 :: (Num (b -> c), Num (a -> b)) => a -> c
As an alternative, we can define

parts :: String -> (Integer,Float)
parts ds
= (ipart es,fpart fs)

where (es,d:fs) = break (== '.') ds

ipart
= foldl shiftl 0 . map toDigit
where shiftl n d = n*10 + d

fpart
= foldr shiftr 0 . map toDigit
where shiftr d x = (d + x)/10

toDigit d = fromIntegral (fromEnum d - fromEnum '0')

We have

1234
= 1*1000 + 2*100 + 3*10 + 4

= (((0*10 + 1)*10 + 2)*10 + 3)*10 + 4

0.567
= 5/10 + 6/100 + 7/1000

= (5 + (6 + (7 + 0)/10)/10)/10

so use of
foldl
for the integer part and
foldr
for the fractional part are both indicated.

Here is another example. The function
reverse
was defined above by the equations
reverse []
= []

reverse (x:xs) = reverse xs ++ [x]

We are wiser now and would now write

reverse = foldr snoc []

where snoc x xs = xs ++ [x]

But a little learning is a dangerous thing: both definitions of
reverse
are terrible because they take of the order of
n
2
steps to reverse a list of length
n
. Much better is to define
reverse = foldl (flip (:)) []

where
flip f x y = f y x
. The new version reverses a list in linear time:
foldl (flip (:)) [] [1,2,3]

= foldl (flip (:)) (1:[]) [2,3]

= foldl (flip (:)) (2:1:[]) [3]

= foldl (flip (:)) (3:2:1:[]) []

= 3:2:1:[]

That seems a bit of a trick, but there is a sound principle at work behind this new definition that we will take up in the following chapter.

As this example suggests, there are the following relationships between
foldr
and
foldl
: for all finite lists
xs
we have
foldl f e xs = foldr (flip f) e (reverse xs)

foldr f e xs = foldl (flip f) e (reverse xs)

Proofs are left as an exercise. Note the restriction to finite lists, even though both sides reduce to ⊥ when
xs
is ⊥. That means the proofs have to rely on a subsidiary result that is true only for finite lists.

Here is another relationship between the two folds:

foldl (@) e xs = foldr (<>) e xs

for all finite lists
xs
, provided that

(x <> y) @ z = x <> (y @ z)

e @ x = x <> e

Again, the proof is left as an exercise. As one instructive application of this law, suppose
(<>) = (@)
and
(@)
is associative with identity
e
. Then the two provisos are satisfied and we can conclude that
foldr (@) e xs = foldl (@) e xs

for all finite lists
xs
whenever
(@)
is associative with identity
e
. In particular,
concat xss = foldr (++) [] xss = foldl (++) [] xss

for all finite lists
xss
. The two definitions are
not
the same if
xss
is an infinite list:
ghci> foldl (++) [] [[i] | i <- [1..]]

Interrupted.

ghci> foldr (++) [] [[i] | i <- [1..]]

[1,2,3,4,{Interrupted}

In response to the first expression, GHCi went into a long silence that was interrupted by pressing the ‘Stop program execution’ button. In response to the second, GHCi started printing an infinite list.

OK, so the definition in terms of
foldr
works on infinite lists, but the other one doesn’t. But maybe the definition of
concat
in terms of
foldl
leads to a more efficient computation when all the lists are finite? To answer this question, observe that
foldr (++) [] [xs,ys,us,vs]

= xs ++ (ys ++ (us ++ (vs ++ [])))

foldl (++) [] [xs,ys,us,vs]

= (((([] ++ xs) ++ ys) ++ us) ++ vs)

Let all the component lists have length
n
. The first expression on the right takes 4
n
steps to perform all the concatenations, while the second takes 0 +
n
+(
n
+
n
) + (
n
+
n
+
n
) =6
n
steps. Enough said, at least for now.

6.5 The function
scanl

The function
scanl f e
applies
foldl f e
to each initial segment of a list. For example
ghci> scanl (+) 0 [1..10]

[0,1,3,6,10,15,21,28,36,45,55]

The expression computes the
running sums
of the first ten positive numbers:
[0, 0+1, (0+1)+2, ((0+1)+2)+3, (((0+1)+2)+3)+4, ...]

The specification of
scanl
is

scanl :: (b -> a -> b) -> b -> [a] -> [b]

scanl f e = map (foldl f e) . inits

 

inits :: [a] -> [[a]]

inits [] = [[]]

inits (x:xs) = [] : map (x:) (inits xs)

For example

ghci> inits "barbara"

["","b","ba","bar","barb","barba","barbar","barbara"]

The function
inits
is in the library
Data.List
.

But this definition of
scanl f
involves evaluating
f
a total of 0 + 1 + 2 + · · · +
n
=
n
(
n
+ 1)/2

times on a list of length
n
. Can we do better?

Yes, we can calculate a better definition by doing a kind of induction proof, except that we don’t know what it is we are proving!

Case
[]

scanl f e []

=
{definition}

map (foldl f e) (inits [])

=
{
inits.1
}

map (foldl f e) [[]]

=
{
map.1
and
map.2
}

[foldl f e []]

=
{
foldl.1
}

[e]

Hence we have shown that
scanl f e [] = [e]

Case
x:xs

scanl f e (x:xs)

=
{definition}

map (foldl f e) (inits (x:xs))

=
{
inits.2
}

map (foldl f e) ([]:map (x:) (inits xs))

=
{
map.1
and
map.2
}

foldl f e []:map (foldl f e . (x:)) (inits xs)

=
{
foldl.1
}

e:map (foldl f e . (x:)) (inits xs)

=
{claim:
foldl f e . (x:) = foldl f (f e x)
}

e:map (foldl f (f e x)) (inits xs)

=
{definition of
scanl
}

e:scanl f (f e x)

The claim is an easy consequence of the definition of
foldl
. Hence, in summary, we have shown
scanl f e []
= [e]

scanl f e (x:xs) = e:scanl f (f e x) xs

This definition evaluates
f
only a linear number of times.

What we have just done is an example of optimising a function by
program calculation
. One of the exciting things about Haskell is that you can do this without fuss. There is no need to bring in a totally different logical language to reason about programs.

However, the prelude definition of
scanl
is a little different:
scanl f e xs = e : (case xs of

[] -> []

x:xs -> scanl f (f e x) xs)

Whereas for our version
scanl f e undefined = undefined
, the prelude version has
scanl f e undefined = e:undefined.

The reason is that the right-hand sides of the two clauses defining
scanl
are both lists that begin with
e
. We do not have to know anything about the left-hand sides to determine this fact, and laziness dictates that we don’t ask.

The prelude version also uses a
case
expression. We won’t go into details since such expressions are used rarely in this book. Haskell allows us many ways to say the same thing.

6.6 The maximum segment sum

Here is another example of program calculation. The
maximum segment sum
problem is a famous one and its history is described in J. Bentley’s
Programming Pearls
(1987). Given is a sequence of integers and it is required to compute the maximum of the sums of all
segments
in the sequence. A segment is also called a
contiguous subsequence
. For example, the sequence
[-1,2,-3,5,-2,1,3,-2,-2,-3,6]

has maximum sum 7, the sum of the segment
[5,-2,1,3]
. On the other hand, the sequence
[-1,-2,-3]
has a maximum segment sum of zero, since the empty sequence is a segment of every list and its sum is zero. It follows that the maximum segment sum is always nonnegative.

Our problem is specified by

mss :: [Int] -> Int

mss = maximum . map sum . segments

where
segments
returns a list of all segments of a list. This function can be defined in a number of ways, including
segments = concat . map inits . tails

Other books

Family Affair by Saxon Bennett
Jane and His Lordship's Legacy by Stephanie Barron
The Stolen Valentine by Emrick, K.J.
A Companion for Life by Cari Hislop
Commanded by Stacey Kennedy
Graven Image by Williams, Charlie