The two facts have a name: they are called the functor laws of map. The name is borrowed from a branch of mathematics called Category Theory. In fact, Haskell provides a type class Functor, whose definition is
class Functor f where
    fmap :: (a -> b) -> f a -> f b
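Spelled out, the two laws assert that fmap preserves the identity function and function composition:

fmap id      = id
fmap (f . g) = fmap f . fmap g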
The method fmap is expected to satisfy exactly the same laws as map. The reason for this type class is that the idea of mapping a function over a list can be generalised to one of mapping a function over an arbitrary data structure, such as trees of various kinds. For example, consider the type
data Tree a = Tip a | Fork (Tree a) (Tree a)
of binary trees with labels in their tips. Tree-structured data arise in a number of places, for example with the syntax of expressions of various kinds. We can define a mapping function over trees, but rather than calling it mapTree we can call it fmap by making trees a member of the Functor class:
instance Functor Tree where
    fmap f (Tip x)    = Tip (f x)
    fmap f (Fork u v) = Fork (fmap f u) (fmap f v)
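For example, if the declaration of Tree is extended with deriving Show so that trees can be displayed, we would have:

ghci> fmap even (Fork (Tip 1) (Tip 2))
Fork (Tip True) (Tip False)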
In fact map is just a synonym for the list instance of fmap:
ghci> fmap (+1) [2,3,4]
[3,4,5]
We mention the Functor type class here primarily to show that if ever you think some function on lists can be usefully generalised to other kinds of data structure, the chances are good that the designers of Haskell have already spotted it and introduced an appropriate type class. As we will see later on, and especially in Chapter 12, the functor laws of map appear in many calculations.
There is another group of laws that involve map, all of which have a common theme. Consider the equations

f . head       = head . map f
map f . tail   = tail . map f
map f . concat = concat . map (map f)
The first equation holds only if f is a strict function, but the others hold for arbitrary f. If we apply both sides of the first equation to the empty list, we get

f (head []) = head (map f []) = head []

Since the head of an empty list is undefined, we require f to be strict to make the equation true.
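The other two laws hold without qualification; for instance, we can check the tail law at the GHCi prompt:

ghci> map (+1) (tail [2,3,4]) == tail (map (+1) [2,3,4])
True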
Each of the laws has a simple interpretation. In each case you can apply the operation (head, tail, and so on) to a list and then change each element, or you can change each element first and then apply the operation. The common theme lies in the types of the operations involved:

head   :: [a] -> a
tail   :: [a] -> [a]
concat :: [[a]] -> [a]
The point about the operations is that they do not depend in any way on the nature of the list elements; they are simply functions that shuffle, discard or extract elements from lists. That is why they have polymorphic types. And functions with polymorphic types all satisfy some law that says you can change values before or after applying the function. In mathematics such functions are called natural transformations and the associated laws, naturality laws.
As another example, since

reverse :: [a] -> [a]

we would expect that

map f . reverse = reverse . map f

Indeed this is the case. Of course, this naturality law still has to be proved. Another law is

concat . map concat = concat . concat

The two sides assert that two ways of concatenating a list of lists of lists (either do the inner concatenations first, or do the outer concatenations first) give the same result.
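We can try both sides on a small example:

ghci> concat (map concat [[[1],[2,3]],[[4]]])
[1,2,3,4]
ghci> concat (concat [[[1],[2,3]],[[4]]])
[1,2,3,4]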
Finally, here is just one property of filter:

filter p . map f = map f . filter (p . f)
We can prove this law by simple equational reasoning. Recall the second definition of filter, namely filter p = concat . map (test p), where test p x = if p x then [x] else []:

filter p . map f
=   {second definition of filter}
concat . map (test p) . map f
=   {functor property of map}
concat . map (test p . f)
=   {since test p . f = map f . test (p . f)}
concat . map (map f . test (p . f))
=   {functor property of map}
concat . map (map f) . map (test (p . f))
=   {naturality of concat}
map f . concat . map (test (p . f))
=   {second definition of filter}
map f . filter (p . f)
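As a quick sanity check of the law at one instance:

ghci> filter even (map (+1) [1,2,3]) == map (+1) (filter (even . (+1)) [1,2,3])
True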
Laws like those above are not just of academic interest, but are deployed in finding new and better ways of expressing definitions. That’s why functional programming is the best thing since sliced bread.
Finally, to complete a simple toolbox of useful operations, we consider the functions zip and zipWith. The definitions in the standard prelude are:

zip :: [a] -> [b] -> [(a,b)]
zip (x:xs) (y:ys) = (x,y) : zip xs ys
zip _ _           = []

zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (x:xs) (y:ys) = f x y : zipWith f xs ys
zipWith f _ _           = []
A caring programmer (one who doesn’t like ‘don’t care’ patterns) would have written

zip []     ys     = []
zip (x:xs) []     = []
zip (x:xs) (y:ys) = (x,y) : zip xs ys
Both definitions use pattern matching on both arguments. You have to know that pattern matching is applied from top to bottom and from left to right. Thus
zip [] undefined = []
zip undefined [] = undefined
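We can confirm this at the GHCi prompt (the exception report is abbreviated):

ghci> zip [] undefined
[]
ghci> zip undefined []
*** Exception: Prelude.undefined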
The definition of zip can be given another way:

zip = zipWith (,)

The operation (,) is a constructor for pairs: (,) a b = (a,b).
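So, for example:

ghci> zipWith (,) [0,1,2] "ab"
[(0,'a'),(1,'b')]

Note that the result stops at the end of the shorter argument.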
Here is one example of the use of zipWith. Suppose we want to determine whether a list is in nondecreasing order. A direct definition would have:

nondec :: (Ord a) => [a] -> Bool
nondec []       = True
nondec [x]      = True
nondec (x:y:xs) = (x <= y) && nondec (y:xs)

But another, equivalent and shorter definition is

nondec xs = and (zipWith (<=) xs (tail xs))
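For instance:

ghci> nondec [1,1,2,5]
True
ghci> nondec [1,3,2]
False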
The function and is yet another useful function in the standard prelude. It takes a list of booleans and returns True if all the elements are True, and False otherwise:

and :: [Bool] -> Bool
and []     = True
and (x:xs) = x && and xs
One final example. Consider the task of building a function position that takes a value x and a finite list xs and returns the first position in xs (counting positions from 0) at which x occurs. If x does not occur in the list, then −1 is returned. We can define

position :: (Eq a) => a -> [a] -> Int
position x xs = head ([j | (j,y) <- zip [0..] xs, y == x] ++ [-1])
The expression zip [0..] xs pairs each element of xs with its position in xs. Although the first argument of zip is an infinite list, the result is a finite list whenever xs is. Observe that the problem is solved by first computing the list of all positions at which x is found, and then taking the first element. Under lazy evaluation it is not necessary to construct the value of every element of the list in order to calculate the head of the list, so there is no great loss of efficiency in solving the problem this way. And there is a great deal of simplicity in defining one search result in terms of all search results.
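For example:

ghci> position 'l' "hello"
2
ghci> position 'z' "hello"
-1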
Let’s now return to Section 1.3 and complete the definition of commonWords. Recall that we finished with

commonWords :: Int -> [Char] -> [Char]
commonWords n = concat . map showRun . take n .
                sortRuns . countRuns . sortWords .
                words . map toLower

The only functions we have still to give definitions for are

showRun, countRuns, sortRuns, sortWords

All the others, including words, are provided in the standard Haskell libraries.
The first one is easy:
showRun :: (Int,Word) -> [Char]
showRun (n,w) = w ++ ": " ++ show n ++ "\n"
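For instance:

ghci> showRun (2,"the")
"the: 2\n"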
The second one can be defined by

countRuns :: [Word] -> [(Int,Word)]
countRuns []     = []
countRuns (w:ws) = (1 + length us, w) : countRuns vs
                   where (us,vs) = span (==w) ws
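For example:

ghci> countRuns ["a","a","b","a"]
[(2,"a"),(1,"b"),(1,"a")]

Equal words are counted in a single run only when they are adjacent, which is why the words are sorted first.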
The prelude function span p splits a list into two, the first being the longest prefix of the list all of whose elements satisfy the test p, and the second being the suffix that remains. Here is the definition:

span :: (a -> Bool) -> [a] -> ([a],[a])
span p []     = ([],[])
span p (x:xs) = if p x then (x:ys,zs) else ([],x:xs)
                where (ys,zs) = span p xs
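For example:

ghci> span even [2,4,1,6]
([2,4],[1,6])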
That leaves sortRuns and sortWords. We can import the function sort from Data.List by the command

import Data.List (sort)
Since

sort :: (Ord a) => [a] -> [a]

we can then define

sortWords :: [Word] -> [Word]
sortWords = sort

sortRuns :: [(Int,Word)] -> [(Int,Word)]
sortRuns = reverse . sort
To understand the second definition you have to know that Haskell automatically defines the comparison operation (<=) on pairs by

(x1,y1) <= (x2,y2) = (x1 < x2) || (x1 == x2 && y1 <= y2)

You also have to know that sort sorts into ascending order. Since we want the runs in descending order of count, we just sort into ascending order and reverse the result. That, by the way, is why we defined frequency counts by having the count before the word rather than afterwards.
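For example:

ghci> sortRuns [(1,"the"),(3,"a"),(3,"be")]
[(3,"be"),(3,"a"),(1,"the")]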
Instead of relying on the library function for sorting, let us end by programming a sorting function ourselves. One good way to sort is to use a divide and conquer strategy: if the list has length at most one then it is already sorted; otherwise we can divide the list into two equal halves, sort each half by using the sorting algorithm recursively, and then merge the two sorted halves together. That leads to

sort :: (Ord a) => [a] -> [a]
sort []  = []
sort [x] = [x]
sort xs  = merge (sort ys) (sort zs)
           where (ys,zs) = halve xs

halve xs = (take n xs, drop n xs)
           where n = length xs `div` 2
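For example:

ghci> halve [1,2,3,4,5]
([1,2],[3,4,5])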
That leaves us with the definition of merge, which merges two sorted lists together into one sorted list:

merge :: (Ord a) => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys
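For example:

ghci> merge [1,4,5] [2,3,6]
[1,2,3,4,5,6]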
In fact, many Haskell programmers wouldn’t write the last clause of merge in quite this way. Instead they would write

merge xs'@(x:xs) ys'@(y:ys)
  | x <= y    = x : merge xs ys'
  | otherwise = y : merge xs' ys

This definition uses an as-pattern. You can see the point: rather than deconstructing a list and then reconstructing it again (a cheap but not free operation), it is better to reuse the value that we matched with. True, but it does obscure a simple mathematical equation, and we will use such patterns only very sparingly in this book.
Both sort and merge are defined recursively and it is worthwhile pointing out why the two recursions terminate. In the case of merge you have to see that one or other of the two arguments of merge decreases in size at each recursive call. Hence one of the base cases will eventually be reached. In the case of sort the critical observation is that if xs has length at least two, then both ys and zs have length strictly less than that of xs, and the same argument applies. But see what happens if we had omitted the clause sort [x] = [x]. Since 1 `div` 2 = 0 we would have

sort [x] = merge (sort []) (sort [x])
That means evaluation of sort [x] requires evaluation of sort [x], and the whole definition of sort spins off into an infinite loop for nonempty arguments. Checking that you have all the necessary base cases is one of the most important parts of constructing a recursive function.