Thinking Functionally with Haskell
Richard Bird
The claim is that flattenl . nestl i = flattenl.
Answer to Exercise G
We can define
lines xs = if null zs then [ys]
           else ys : lines (tail zs)
  where (ys,zs) = break (== '\n') xs
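As a quick sanity check, here is the definition in a runnable sketch, renamed lines' to avoid the clash with Prelude.lines. Note that, unlike the Prelude version, it yields a trailing empty string when the input ends in a newline:

```haskell
-- Exercise G's definition, renamed to avoid clashing with Prelude.lines.
lines' :: String -> [String]
lines' xs = if null zs then [ys]
            else ys : lines' (tail zs)
  where (ys, zs) = break (== '\n') xs

main :: IO ()
main = print (lines' "one\ntwo\nthree")  -- ["one","two","three"]
```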
The function groupy is defined by
groupy :: [[Int]] -> [[Int]]
groupy (xs:xss) = [sum xs + length xs - 1]:xs:xss
The function nesty is defined by
nesty :: Int -> [[Int]] -> [[Int]]
nesty i = map (add i)
  where add i (x:xs) = x : [i + x | x <- xs]
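A runnable sketch of the two functions above (the example inputs, lists of line widths, are illustrative):

```haskell
-- Lists of line widths: groupy prepends the width of the flattened
-- layout; nesty adds the indentation i to every width but the first.
groupy :: [[Int]] -> [[Int]]
groupy (xs:xss) = [sum xs + length xs - 1] : xs : xss

nesty :: Int -> [[Int]] -> [[Int]]
nesty i = map (add i)
  where add i (x:xs) = x : [i + x | x <- xs]

main :: IO ()
main = do
  print (groupy [[3,1],[4]])   -- [[5],[3,1],[4]]
  print (nesty 2 [[3,1],[4]])  -- [[3,3],[4]]
```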
The function (<+>) is defined by
(<+>) :: [[Int]] -> [[Int]] -> [[Int]]
xss <+> yss = [glue xs ys | xs <- xss, ys <- yss]
  where glue xs ys = init xs ++ [last xs + head ys] ++ tail ys
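Here glue joins the last line of one layout onto the first line of the next, adding their widths. A small sketch with illustrative inputs:

```haskell
(<+>) :: [[Int]] -> [[Int]] -> [[Int]]
xss <+> yss = [glue xs ys | xs <- xss, ys <- yss]
  -- glue fuses the last width of xs with the first width of ys
  where glue xs ys = init xs ++ [last xs + head ys] ++ tail ys

main :: IO ()
main = print ([[0,2],[5]] <+> [[3],[1,1]])
-- [[0,5],[0,3,1],[8],[6,1]]
```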
Answer to Exercise H
One possibility, which no doubt can be improved on:
doc :: Doc -> Doc
doc Nil        = text "Nil"
doc Line       = text "Line"
doc (Text s)   = text ("Text " ++ show s)
doc (Nest i x) = text ("Nest " ++ show i) <>
                 group (nest 2 (line <> paren (doc x)))
doc (x :<>: y) = doc x <> text " :<>:" <>
                 group (line <> nest 3 (doc y))
doc (Group x)  = text "Group " <>
                 group (nest 2 (line <> paren (doc x)))

paren x = text "(" <> nest 1 x <> text ")"
Answer to Exercise I
No. Consider a paragraph whose longest word is one character longer than the line width. In this case, prettybad will lay out each word on a line of its own, while pretty will still fill lines with groups of words, provided they fit. For example:
ghci> putStrLn $ pretty 11 $ para pg4
A lost and
lonely
hippopotamus
went into a
bar.
Answer to Exercise J
First we show layouts x = layr [(0,x)]:

  layr [(0,x)]
=   {definition of layr}
  layouts (toDoc [(0,x)])
=   {definition of toDoc}
  layouts (Nest 0 x :<>: Nil)
=   {laws of Doc}
  layouts x
It remains to give a recursive definition of layr. We will just give two clauses:

  toDoc ((i,Nest j x):ids)
=   {definition of toDoc}
  Nest i (Nest j x) :<>: toDoc ids
=   {laws}
  Nest (i+j) x :<>: toDoc ids
=   {definition of toDoc}
  toDoc ((i+j,x):ids)

Hence layr ((i,Nest j x):ids) = layr ((i+j,x):ids). Next:
  toDoc ((i,x:<>:y):ids)
=   {definition of toDoc}
  Nest i (x :<>: y) :<>: toDoc ids
=   {laws}
  Nest i x :<>: Nest i y :<>: toDoc ids
=   {definition of toDoc}
  toDoc ((i,x):(i,y):ids)

Hence layr ((i,x:<>:y):ids) = layr ((i,x):(i,y):ids).
Answer to Exercise K
ghci> layout $ pretty 5 $ para pg
Hello
World1*** Exception: Prelude.undefined
ghci> layout $ pretty 10 $ cexpr ce
if happy
then great
else *** Exception: Prelude.undefined
Answer to Exercise L
The definition is
prettyl :: Int -> Doc -> Layout
prettyl w x = best w [(0,x)]
  where
  best r []                 = Empty
  best r ((i,Nil):ids)      = best r ids
  best r ((i,Line):ids)     = Break i (best (w-i) ids)
  best r ((i,Text s):ids)   = String s (best (r - length s) ids)
  best r ((i,Nest j x):ids) = best r ((i+j,x):ids)
  best r ((i,x :<>: y):ids) = best r ((i,x):(i,y):ids)
  best r ((i,Group x):ids)  = better r
                                (best r ((i,flatten x):ids))
                                (best r ((i,x):ids))
where better is changed to read

better r lx ly = if fits r (layout lx) then lx else ly
The number of steps required to evaluate better r is proportional to r and thus at most w. Now, prettyl takes linear time if best does. The second argument of best is a list of indentation-document pairs, and we can define the size of this list by

isize ids = sum [size x | (i,x) <- ids]
For each of the five inner clauses in the definition of best, the size decreases by 1. For instance

  isize ((i,x :<>: y):ids)
= size (x :<>: y) + isize ids
= 1 + size x + size y + isize ids
= 1 + isize ((i,x):(i,y):ids)
It follows that if we let T(s) denote the running time of best r on an input of size s, then T(0) = Θ(1) from the first clause of best, T(s+1) = Θ(1) + T(s) for each of the five inner clauses, and

T(s+1) = Θ(w) + maximum [T(k) + T(s−k) | k ← [1 .. s−1]]

for the last clause. And now we can deduce that T(s) = Θ(ws).
In conclusion, our algorithm for pretty is linear, though not independently of w.
We referred to pretty-printing as a library, but another name for it is an embedded domain-specific language (EDSL). It is a language for pretty-printing documents embedded in the host language Haskell. Many people believe that the growing success of Haskell is due to its ability to host a variety of EDSLs without fuss.
The detailed material in this chapter has been based closely on work by Philip Wadler; see ‘A prettier printer’, Chapter 11 of The Fun of Programming, in the Cornerstones of Computing series (Palgrave Macmillan, 2003). The main difference is that Wadler used an explicit alternation operator in the term representation of Doc (though it was hidden from the user) rather than the constructor Group. Jeremy Gibbons suggested that the latter was a better fit with the idea of a deep embedding.
An earlier functional pretty-printing library based on a different set of combinators was described by John Hughes, ‘The design of a pretty-printer library’, in Johan Jeuring and Erik Meijer, editors, Advanced Functional Programming, volume 925 of LNCS, Springer, 1995. Hughes’ library was later reworked by Simon Peyton Jones and installed as the Haskell library Text.PrettyPrint.HughesPJ. Another pretty-printing library, in an imperative rather than functional style, was constructed 30 years ago by Derek Oppen, ‘Pretty-printing’, ACM Transactions on Programming Languages and Systems 2(4), 465–483, 1980, and is widely used as the basis of pretty-printing facilities in a number of languages. More recently, efficient pretty-printing algorithms in a functional style have been described by Olaf Chitil, ‘Pretty printing with lazy dequeues’, ACM Transactions on Programming Languages and Systems 27(1), 163–184, 2005, and by Olaf Chitil and Doaitse Swierstra, ‘Linear, bounded, functional pretty-printing’, Journal of Functional Programming 19(1), 1–16, 2009. These algorithms are considerably more complicated than the one described in the text.
We have already met infinite lists in Chapter 4 and even given an induction principle for reasoning about them in Chapter 6. But we haven’t really appreciated what can be done with them. In this chapter we want to explain in more detail exactly what infinite lists are, and how they can be represented by cyclic structures. We also describe another useful method for reasoning about infinite lists, and discuss a number of intriguing examples in which infinite and cyclic lists can be used to good effect.
Recall that [m..] denotes the infinite list of all integers from m onwards:
ghci> [1..]
[1,2,3,4,5,6,7,{Interrupted}
ghci> zip [1..] "hallo"
[(1,'h'),(2,'a'),(3,'l'),(4,'l'),(5,'o')]
It would take forever to print [1..], so we interrupt the first computation. The second example illustrates a simple but typical use of infinite lists in finite computations.
In Haskell, the arithmetic expression [m..] is translated into enumFrom m, where enumFrom is a method in the Enum class, defined by
enumFrom :: Integer -> [Integer]
enumFrom m = m:enumFrom (m+1)
Thus [m..] is defined as an instance of a recursively defined function. The computation makes progress because (:) is non-strict in its second argument.
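A minimal sketch of this point (the function is renamed enumFrom' to avoid the Enum method): laziness lets us take a finite prefix of the recursively defined list without ever building the whole thing.

```haskell
-- A recursively defined infinite list; (:) being non-strict in its
-- second argument means take can stop the recursion after 5 steps.
enumFrom' :: Integer -> [Integer]
enumFrom' m = m : enumFrom' (m+1)

main :: IO ()
main = print (take 5 (enumFrom' 3))  -- [3,4,5,6,7]
```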
It is important to bear in mind that infinite lists in computing do not have the same properties as infinite sets do in mathematics. For example, in set theory

{x | x ∈ {1, 2, 3, ...}, x² < 10}

denotes the set {1, 2, 3}, but
ghci> [x | x <- [1..], x*x < 10]
[1,2,3
After printing the first three values the computer gets stuck in an infinite loop looking for the next number after 3 whose square is less than 10. The value of the expression above is the partial list 1:2:3:undefined.
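When we happen to know the predicate fails permanently beyond some point, as here where the squares are increasing, one standard way out is takeWhile, which stops at the first failure and so terminates:

```haskell
-- takeWhile stops at the first element whose square reaches 10,
-- so the computation terminates even though [1..] is infinite.
main :: IO ()
main = print (takeWhile (\x -> x*x < 10) [1..])  -- [1,2,3]
```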
It is possible to have an infinite list of infinite lists. For example,
multiples = [map (n*) [1..] | n <- [2..]]
defines an infinite list of infinite lists of numbers, the first three being
[2,4,6,8,...] [3,6,9,12,...] [4,8,12,16,...]
Suppose we ask whether the above list of lists can be merged back into a single list, namely [2..]. We can certainly merge two infinite lists:
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x<y  = x : merge xs (y:ys)
  | x==y = x : merge xs ys
  | x>y  = y : merge (x:xs) ys
This version of merge removes duplicates: if the two arguments are in strictly increasing order, so is the result. Note the absence of any clauses of merge mentioning the empty list. Now it seems that, if we define

mergeAll = foldr1 merge

then mergeAll multiples will return the infinite list [2..]. But it doesn’t. What happens is that the computer gets stuck in an infinite loop attempting to compute the first element of the result, namely

minimum (map head multiples)
It is simply not possible to compute the minimum element in an infinite list. Instead we have to make use of the fact that map head multiples is in strictly increasing order, and define

mergeAll = foldr1 xmerge
xmerge (x:xs) ys = x : merge xs ys

With this definition, mergeAll multiples does indeed return [2..].
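Putting the pieces above together as one self-contained sketch: xmerge is productive because it emits the head of its first argument before inspecting the second, so foldr1 xmerge can work down an infinite list of lists.

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x < y  = x : merge xs (y:ys)
  | x == y = x : merge xs ys      -- drop duplicates
  | x > y  = y : merge (x:xs) ys

-- Emits x before looking at ys at all, which keeps mergeAll productive.
xmerge :: Ord a => [a] -> [a] -> [a]
xmerge (x:xs) ys = x : merge xs ys

mergeAll :: Ord a => [[a]] -> [a]
mergeAll = foldr1 xmerge

multiples :: [[Integer]]
multiples = [map (n*) [1..] | n <- [2..]]

main :: IO ()
main = print (take 10 (mergeAll multiples))  -- [2,3,4,5,6,7,8,9,10,11]
```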
Finally, recall the induction principle described in Chapter 6 for proving facts about infinite lists. Provided P is a chain-complete assertion, we can prove that P(xs) holds for all infinite lists xs by showing that: (i) P(undefined) holds; and (ii) P(xs) implies P(x:xs) for all x and xs. Using this principle, we proved in Chapter 6 that xs ++ ys = xs for all infinite lists xs. But it’s not immediately clear how induction can be used to prove, for example,

map fact [0..] = scanl (*) 1 [1..]

The obvious result to prove is

map fact [0..n] = scanl (*) 1 [1..n]

for all n, but can one then assert that the first identity holds?
Data structures, like functions, can be defined recursively. For instance
ones :: [Int]
ones = 1:ones
This is an example of a cyclic list, a list whose definition is recursive. Contrast this definition with ones = repeat 1, where

repeat x = x:repeat x
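The two definitions denote the same infinite list, but they differ operationally: the cyclic ones is a single cons cell pointing back at itself, whereas repeat builds a fresh cell for each element demanded. A sketch (repeat renamed repeat' to avoid the Prelude version):

```haskell
ones :: [Int]
ones = 1 : ones            -- cyclic: the tail is the list itself

repeat' :: a -> [a]
repeat' x = x : repeat' x  -- a new cons cell per element demanded

main :: IO ()
main = do
  print (take 5 ones)        -- [1,1,1,1,1]
  print (take 5 (repeat' 1)) -- [1,1,1,1,1]
```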