The definition of T requires some amplification. Firstly, T(f) refers to the complexity of a given definition of f. Time complexity is a property of an expression, not of the value of that expression.
Secondly, the number of reduction steps does not correspond exactly to the elapsed time between submitting an expression for evaluation and waiting for the answer. No account is taken of the time to find the next subexpression to be reduced in a possibly large and complicated expression. For this reason the statistics facility of GHCi does not count reduction steps, but produces a measure of elapsed time.
Thirdly, we do not formalise the notion of size, since different measures are appropriate in different situations. For example, the cost of evaluating xs++ys is best measured in terms of (m, n), a pair describing the lengths of the two lists. In the case of concat xss we could take the length of concat xss as a measure of size, but if xss is a list of length m consisting of lists all of length n, then (m, n) might be a more suitable measure.
The fourth and crucial remark is that T(f)(n) is determined under an eager evaluation model of reduction. The reason is simply that estimating the number of reduction steps under lazy evaluation is difficult. To illustrate, consider the definition minimum = head . sort. Under eager evaluation, the time to evaluate minimum on a list of length n under this definition is given by

T(minimum)(n) = T(sort)(n) + T(head)(n).

In other words, we first have to sort a list of length n completely and then take the head of the result (presumably a constant-time operation). This equation does not hold under lazy evaluation, since finding the head of sort xs requires only that sort xs be reduced to head normal form, and how long that takes depends on the precise algorithm used for sort. Timing analysis under eager reduction is simpler because it is compositional. Since lazy evaluation never requires more reduction steps than eager evaluation, any upper bound for T(f)(n) will also be an upper bound under lazy evaluation. Furthermore, in many cases of interest, a lower bound will also be a lower bound under lazy evaluation.
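To make the point concrete, here is a small sketch, not from the text, in which the sorting algorithm is taken to be insertion sort. Under eager evaluation minimum' pays for the whole sort, which costs up to Θ(n²) comparisons with this algorithm; under lazy evaluation head forces only enough of isort xs to reach head normal form, and each insert needs just one comparison before it can produce its first element, so the minimum is found in Θ(n) steps.

isort :: Ord a => [a] -> [a]
isort = foldr insert []
  where
    insert x []     = [x]
    insert x (y:ys)
      | x <= y      = x : y : ys       -- first element decided after one comparison
      | otherwise   = y : insert x ys

minimum' :: Ord a => [a] -> a
minimum' = head . isort                -- same shape as minimum = head . sort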
In order to give some examples of timing analyses we have to introduce a little order notation. So far, we have used the awkward phrase ‘taking a number of steps proportional to’ whenever efficiency is discussed. It is time to replace it by something shorter. Given two functions f and g on the natural numbers, we say that f is of order g, and write f = Θ(g), if there are positive constants C₁ and C₂ and a natural number n₀ such that C₁g(n) ≤ f(n) ≤ C₂g(n) for all n > n₀. In other words, f is bounded above and below by some constant times g for all sufficiently large arguments.
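For instance (a worked example, not from the text), if f(n) = 3n² + 5n then f = Θ(n²): taking C₁ = 3, C₂ = 4 and n₀ = 5 gives 3n² ≤ 3n² + 5n ≤ 4n² for all n > 5.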
The notation is abused to the extent that one conventionally writes, for example, f(n) = Θ(n²) rather than the more correct f = Θ(λn. n²). Similarly, one writes f(n) = Θ(n) rather than f = Θ(id). The main use of Θ-notation is to hide constants; for example, we can write

T(++)(n, m) = Θ(n)

without bothering about the exact constants involved. When Θ(g) appears in a formula it stands for some unnamed function f satisfying f = Θ(g). In particular, Θ(1) denotes an anonymous constant.
With that behind us, we give three examples of how to analyse the running time of a computation. Consider first the following two definitions of concat:

concat xss  = foldr (++) [] xss
concat' xss = foldl (++) [] xss
The two definitions are equivalent provided xss is a finite list. Suppose xss is a list of length m of lists all of length n. Then the first definition gives
T(concat)(m, n) = T(foldr (++) [])(m, n),
T(foldr (++) [])(0, n) = Θ(1),
T(foldr (++) [])(m+1, n) = T(++)(n, mn) + T(foldr (++) [])(m, n).
The estimate T(++)(n, mn) arises because a list of length n is concatenated with a list of length mn. Since T(++)(n, m) = Θ(n), we obtain

T(concat)(m, n) = Θ(mn).
For the second definition of concat we have

T(concat')(m, n) = T(foldl (++))(0, m, n),
T(foldl (++))(k, 0, n) = Θ(1),
T(foldl (++))(k, m+1, n) = T(++)(k, n) + T(foldl (++))(k+n, m, n).
The additional argument k refers to the length of the accumulated list in the second argument of foldl. This time we obtain

T(foldl (++))(0, m, n) = Θ(n + 2n + ··· + (m−1)n) = Θ(m²n),

since the accumulated list grows by n with each step. Hence T(concat')(m, n) = Θ(m²n). The conclusion, which was anticipated in the previous chapter, is that using foldr rather than foldl in the definition of concat leads to an asymptotically faster program.
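To see where the extra factor of m comes from, it may help to write out how the two folds bracket the concatenations. The following is an informal sketch (the bracketings are ours, not from the text) for xss = [as, bs, cs]:

concat  [as,bs,cs] = as ++ (bs ++ (cs ++ []))     -- each inner list is copied once
concat' [as,bs,cs] = ((([] ++ as) ++ bs) ++ cs)   -- the growing accumulated list is copied at every step

Since xs ++ ys takes time proportional to the length of xs, the right-nested version does Θ(mn) work in total, while the left-nested version repeatedly recopies a list that is already up to (m−1)n elements long.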
For the second example let us time the two programs for subseqs discussed in Section 7.1, where we had either of the following two possibilities:

subseqs (x:xs)  = subseqs xs ++ map (x:) (subseqs xs)
subseqs' (x:xs) = xss ++ map (x:) xss
  where xss = subseqs' xs
Bearing in mind that (i) if xs has length n, then subseqs xs has length 2ⁿ; and (ii) the time for both the concatenation and for applying map (x:) is therefore Θ(2ⁿ), the two timing analyses give

T(subseqs)(n+1) = 2T(subseqs)(n) + Θ(2ⁿ),
T(subseqs')(n+1) = T(subseqs')(n) + Θ(2ⁿ),

together with T(subseqs)(0) = Θ(1). We will just state the two solutions (which can be proved by a simple induction argument):
T(subseqs)(n) = Θ(n2ⁿ),
T(subseqs')(n) = Θ(2ⁿ).

The latter is therefore asymptotically faster than the former by a logarithmic factor.
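For reference, here is a self-contained version of the two definitions, assuming (as in Section 7.1) that the only subsequence of the empty list is the empty list. The factor-of-n difference in the solutions is visible in the shape of the code: subseqs writes the recursive call twice, so under eager evaluation it is performed twice, while subseqs' names it once and shares it.

subseqs, subseqs' :: [a] -> [[a]]

subseqs []      = [[]]
subseqs (x:xs)  = subseqs xs ++ map (x:) (subseqs xs)   -- two recursive calls: 2T(subseqs)(n)

subseqs' []     = [[]]
subseqs' (x:xs) = xss ++ map (x:) xss                    -- one shared recursive call: T(subseqs')(n)
  where xss = subseqs' xs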
For the third example, let us time the two programs for cp discussed at the beginning of this section. The first one was

cp []       = [[]]
cp (xs:xss) = [x:ys | x <- xs, ys <- cp xss]

Suppose once again that xss is a list of length m of lists all of length n. Then the length of cp xss is nᵐ, and we have
T(cp)(0, n) = Θ(1),
T(cp)(m+1, n) = nT(cp)(m, n) + Θ(nᵐ),

because it takes Θ(nᵐ) steps to apply (x:) to every element of cp xss. The solution is

T(cp)(m, n) = Θ(mnᵐ).
On the other hand, the definition of cp in terms of foldr gives

T(cp)(0, n) = Θ(1),
T(cp)(m+1, n) = T(cp)(m, n) + Θ(nᵐ),

with solution T(cp)(m, n) = Θ(nᵐ). The second version is therefore asymptotically faster, again by a logarithmic factor.
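The foldr form referred to here is the one given at the beginning of the section; a definition along those lines (our reconstruction, with a name of our choosing, rather than a quotation of the earlier code) is:

cp' :: [[a]] -> [[a]]
cp' = foldr op [[]]
  where
    -- yss is the product of the remaining lists, computed once per level
    -- and shared by every x; this sharing is what removes the factor of m
    op xs yss = [x:ys | x <- xs, ys <- yss]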
Sometimes we can improve the running time of a computation by adding an extra argument, called an accumulating parameter, to a function. The canonical example is the function reverse:

reverse []     = []
reverse (x:xs) = reverse xs ++ [x]

With this definition we have T(reverse)(n) = Θ(n²). In search of a linear-time program, suppose we define

revcat :: [a] -> [a] -> [a]
revcat xs ys = reverse xs ++ ys
It is clear that reverse xs = revcat xs [], so if we can obtain an efficient version of revcat we can obtain an efficient version of reverse. To this end we calculate a recursive definition of revcat. The base case revcat [] ys = ys is left as an exercise, and the inductive case is as follows:
revcat (x:xs) ys