Thinking Functionally with Haskell

guess w ws}}

Finally we program match:

match w' w = map check w
  where check x = if x `elem` w' then x else '-'
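
For instance, with this definition every letter of the second argument that also occurs in the first is kept, and the rest are masked, so we would expect:

ghci> match "star" "haskell"
"-as----"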

Answer to Exercise K

The following program is correct but doesn’t run in constant space:

fib n = fst $ runST (fibST n)

 

fibST :: Int -> ST s (Integer,Integer)
fibST n = do {ab <- newSTRef (0,1);
              repeatFor n
                (do {(a,b) <- readSTRef ab;
                     writeSTRef ab $! (b,a+b)});
              readSTRef ab}

The reason is that (b,a+b) is already in head-normal form, so the strict-apply operator ($!) has no effect. The penultimate line needs to be changed to

b `seq` (a+b) `seq` writeSTRef ab (b,a+b)

in order to force evaluation of the components.
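
Putting the pieces together, a corrected and self-contained version might look like the sketch below; the definition of repeatFor given here is only a stand-in assumption (the book supplies its own), but the strict writes are the point:

import Control.Monad.ST
import Data.STRef

-- Assumed helper: run a monadic action n times, discarding the results.
repeatFor :: Monad m => Int -> m a -> m ()
repeatFor n action = foldr (>>) (return ()) (replicate n action)

fib :: Int -> Integer
fib n = fst $ runST (fibST n)

fibST :: Int -> ST s (Integer,Integer)
fibST n =
  do ab <- newSTRef (0,1)
     repeatFor n $
       do (a,b) <- readSTRef ab
          -- force both components before writing the new pair
          b `seq` (a+b) `seq` writeSTRef ab (b,a+b)
     readSTRef ab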

Answer to Exercise L

The version that uses the State monad:

gcd (x,y) = fst $ runState loop (x,y)

 

loop :: State (Int,Int) Int
loop = do {(x,y) <- get;
           if x == y
             then return x
             else if x < y
                    then do {put (x,y-x); loop}
                    else do {put (x-y,y); loop}}

The version that uses the ST monad:

gcd (x,y) = runST $
  do {a <- newSTRef x;
      b <- newSTRef y;
      loop a b}

 

loop :: STRef s Int -> STRef s Int -> ST s Int
loop a b = do {x <- readSTRef a;
               y <- readSTRef b;
               if x == y
                 then return x
                 else if x < y
                        then do {writeSTRef b (y-x); loop a b}
                        else do {writeSTRef a (x-y); loop a b}}
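
To try these definitions in GHCi they need the usual imports, and since the two versions give loop different types they are best kept in separate modules. A sketch of the imports, assuming the State monad comes from the mtl package:

import Prelude hiding (gcd)
import Control.Monad.State (State, get, put, runState)  -- State version
import Control.Monad.ST (ST, runST)                     -- ST version
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

With either version loaded, gcd (12,18) should evaluate to 6.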

Answer to Exercise M

There are, of course, many possible answers. The one I chose was to represent the array of tiles by a list of nine digits [0 .. 8], with zero representing the space. To avoid recalculation, a position is represented by a pair (j,ks) with j the position of the zero in ks, where ks is some permutation of [0 .. 8]. Thus:

type Position = (Int,[Int])

data Move = Up | Down | Left | Right

 

encode :: Position -> Integer

encode (j,ks) = foldl op 0 ks

  where op x d = 10*x + fromIntegral d

 

start :: Position

start = (0,[0..8])
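
Note that encode packs the tile list into a single integer key (the j component is ignored, since it is determined by ks), so we would expect:

ghci> encode start
12345678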

The function moves can be defined by

moves :: Position -> [Move]
moves (j,ks) = [Up    | j `notElem` [6,7,8]] ++
               [Down  | j `notElem` [0,1,2]] ++
               [Left  | j `notElem` [2,5,8]] ++
               [Right | j `notElem` [0,3,6]]

Up moves are allowed except for a blank in the bottom row; down moves except for a blank in the top row; left moves except for a blank in the rightmost column; and right moves except for a blank in the leftmost column.

The function move can be defined by:

move :: Position -> Move -> Position
move (j,ks) Up    = (j+3,swap (j,j+3) ks)
move (j,ks) Down  = (j-3,swap (j-3,j) ks)
move (j,ks) Left  = (j+1,swap (j,j+1) ks)
move (j,ks) Right = (j-1,swap (j-1,j) ks)

swap (j,k) ks = ks1 ++ y:ks3 ++ x:ks4
  where (ks1,x:ks2) = splitAt j ks
        (ks3,y:ks4) = splitAt (k-j-1) ks2
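
Here swap (j,k), with j < k, exchanges the elements at positions j and k; for example we would expect:

ghci> swap (2,5) [0..8]
[0,1,5,3,4,2,6,7,8]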

Finally,

solved :: Position -> Bool

solved p = p == (8,[1,2,3,4,5,6,7,8,0])

My computer produced:

ghci> solve start

Just [Left,Up,Right,Up,Left,Left,Down,

Right,Right,Up,Left,Down,Down,Left,

Up,Up,Right,Right,Down,Left,Left,Up]

(4.84 secs, 599740496 bytes)
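
The solve function used above is assumed from the chapter itself; the answer only supplies the puzzle-specific pieces. Purely as an illustration of how those pieces fit together, here is a plain breadth-first search with the same interface. It is a sketch of my own, not the chapter's implementation; it uses Data.Set and encode to record visited positions. (In a real module the constructors Left and Right also clash with the Prelude's Either constructors, so one would add import Prelude hiding (Left, Right).)

import qualified Data.Set as Set

-- A sketch only: breadth-first search over positions, keeping a set
-- of encodings of positions seen so far.  The names frontier and seen
-- are my own, not the book's.
solve :: Position -> Maybe [Move]
solve p = bfs (Set.singleton (encode p)) [([],p)]
  where
    bfs _ [] = Nothing
    bfs seen ((ms,q):frontier)
      | solved q  = Just (reverse ms)
      | otherwise = bfs seen' (frontier ++ fresh)
      where
        fresh = [ (m:ms, move q m)
                | m <- moves q
                , encode (move q m) `Set.notMember` seen ]
        seen' = foldr (Set.insert . encode . snd) seen fresh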

10.9 Chapter notes

Read The History of Haskell to see how monads came to be an integral part of Haskell, and why this idea has been mainly responsible for the increasing use of Haskell in the real world. Monads are used to structure GHC, which itself is written in Haskell. Each phase of the compiler uses a monad for book-keeping information. For instance, the type checker uses a monad that combines state (to maintain a current substitution), a name supply (for fresh type variable names) and exceptions.

Use of do-notation in preference to (>>=) was suggested by John Launchbury in 1993 and was first implemented by Mark Jones in Gofer.

The number of tutorials on monads has increased steadily over the years; see haskell.org/haskellwiki/Monad_tutorials for a reasonably comprehensive list.

The example (in Exercise F) of monadic equational reasoning can be found in the paper ‘Unifying theories of programming with monads’ (UTP Symposium, August 2012) by Jeremy Gibbons. For additional material on reasoning equationally with monads, read ‘Just do it: simple monadic equational reasoning’ by Jeremy Gibbons and Ralf Hinze, which appeared in the proceedings of the 2011 International Conference on Functional Programming. Both papers can be found at www.cs.ox.ac.uk/people/jeremy.gibbons/publications/

1 Actually there is, and it’s called unsafePerformIO, but it is a very unsafe function.

Chapter 11
Parsing

A parser is a function that analyses a piece of text to determine its logical structure. The text is a string of characters describing some value of interest, such as an arithmetic expression, a poem or a spreadsheet. The output of a parser is a representation of the value, such as a tree of some kind for an arithmetic expression, a list of verses for a poem, or something more complicated for a spreadsheet. Most programming tasks involve decoding the input in some way, so parsing is a pervasive component of computer programming. In this chapter we will describe a monadic approach to parsing, mainly designing simple parsers for expressions of various kinds. We will also say a little more about the converse process of encoding the output as a string; in other words, more about the type class Show. This material will be used in the final chapter.

11.1 Parsers as monads

Parsers return different values of interest, so as a first cut we can think of a parser as a function that takes a string and returns a value:

type Parser a = String -> a

This type is basically the same as that of the standard prelude function

read :: Read a => String -> a

Indeed, read is a parser, though not a very flexible one. One reason is that all the input must be consumed. Thus:

ghci> read "123" :: Int
123
ghci> read "123+51" :: Int
*** Exception: Prelude.read: no parse

With read there is no obvious way of reading two or more things in sequence. For example, in a parser for arithmetic expressions we may want to look in the input stream for a numeral, then an operator and then another numeral. The first parser for a numeral will consume some prefix of the input, the parser for an operator some prefix of the remaining input, and the third parser yet more input. A better idea is to define a parser as a function that consumes a prefix of the input and returns both a value of interest and the unconsumed suffix:

type Parser a = String -> (a,String)

We are not quite there yet. It can happen that a parser may fail on some input. It is not a mistake to construct parsers that can fail. For example, in a parser for arithmetic expressions, we may want to look for either a numeral or an opening parenthesis. One or the other of these subsidiary parsers will certainly fail. Failure should not be thought of as an error that terminates the parsing process; rather it acts like an identity element for an operation that chooses between alternatives. More generally, a parser may find a number of different ways that some prefix of the input can be structured. Failure then corresponds to the particular case of the empty sequence of parses. In order to handle these various possibilities, we change our definition yet again and define

type Parser a = String -> [(a,String)]

The standard prelude provides exactly this type synonym, except that it is called ReadS, not Parser. And it also provides a function

reads :: Read a => ReadS a

as a subsidiary method in the type class Read. For example,
ghci> reads "-123+51" :: [(Int,String)]
[(-123,"+51")]
ghci> reads "+51" :: [(Int,String)]
[]

As with the function read you have to tell reads the type you are expecting. The second example fails, returning no parses, because a Haskell integer can be preceded by an optional minus sign but not by an optional plus sign. By definition, a parser is deterministic if it returns an empty or singleton list of parses in all possible cases. In particular, instances of reads ought to be deterministic parsers.

There is one further change we have to make to the definition of Parser. We would like to install this type as an instance of the Monad class, but that is not possible. The reason is that Parser is declared as a type synonym, and type synonyms cannot be made members of any type class: they inherit whatever instances are declared for the underlying type. A type synonym is there simply to improve readability in type declarations; no new types are involved and we cannot construct two different type class instances for what is essentially the same type.

One way to construct a new type is by a data declaration:

data Parser a = Parser (String -> [(a,String)])

The identifier Parser on the right is a constructor, while on the left it is the name of a new type. Most people are happy with the pun; others would rename the constructor as something like MkParser or just P.

There is a better way to create a new type for Parser and that is to use a newtype declaration:

newtype Parser a = Parser (String -> [(a,String)])

We have not needed newtype declarations up to now, so let us digress a little to explain them. The price paid for using a data declaration for Parser is that operations to examine parsers have to be constantly unwrapped and rewrapped with the constructor Parser, and this adds to the running time of parser operations. In addition there is an unwanted element of Parser, namely Parser undefined. In other words, Parser a and String -> [(a,String)] are not isomorphic types. Recognising this, Haskell allows a newtype declaration for types defined with a single constructor taking a single argument. It differs from a type synonym in that it creates a genuinely new type whose values must be expressed using the Parser wrapper. But these coercions, though they have to appear in the program text, do not add to the execution time of the program because the Haskell compiler eliminates them before evaluation begins. The values of the new type are systematically replaced by the values in the underlying type. Consequently, Parser a and String -> [(a,String)] describe isomorphic types, and Parser undefined and undefined are isomorphic values sharing the same representation. New types, as distinct from synonym types, can be made members of type classes in different ways from the underlying type.
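
To see the semantic difference in isolation, consider a tiny example (the types N and D below are hypothetical, not from the book): pattern matching on a newtype constructor is free and never forces the value, whereas matching on a data constructor does.

newtype N = N Int
data    D = D Int

-- Matching on N is a no-op, so this evaluates to ():
okay :: ()
okay = case (undefined :: N) of N _ -> ()

-- Matching on D forces the constructor, so this is undefined:
boom :: ()
boom = case (undefined :: D) of D _ -> ()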

With either kind of declaration we have to provide some way of applying the parsing function, so we define

apply :: Parser a -> String -> [(a,String)]
apply (Parser p) s = p s

The functions apply and Parser are mutual inverses and witness the isomorphism.

We also define

parse :: Parser a -> String -> a

parse p = fst . head . apply p

The function parse p returns the first object of the first parse, causing an error if the parser p fails. This is the only place an error might occur.

Now we can define

instance Monad Parser where
  return x = Parser (\s -> [(x,s)])
  p >>= q  = Parser (\s -> [(y,s'')
                           | (x,s') <- apply p s,
                             (y,s'') <- apply (q x) s'])

In the definition of p >>= q the parser p is applied to an input string, producing a list of possible parses, each of which is paired with the corresponding unconsumed portion of the input. The parser q is then applied to each parse to produce a list of results whose concatenation provides the final answer. One should also show that the three monad laws hold, a task we will leave as an exercise.
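
One practical note beyond the book's text: in GHC 7.10 and later, every Monad instance must also have Functor and Applicative instances. A minimal sketch that reuses the monadic machinery:

import Control.Monad (ap, liftM)

instance Functor Parser where
  fmap = liftM

instance Applicative Parser where
  pure x = Parser (\s -> [(x,s)])
  (<*>)  = ap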

11.2 Basic parsers

Perhaps the simplest basic parser is

getc :: Parser Char
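
A definition consistent with this type, one that reads a single character and fails on empty input, would be the following sketch:

getc :: Parser Char
getc = Parser f
  where f []     = []
        f (c:cs) = [(c,cs)]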
