The Language Instinct: How the Mind Creates Language
Authors: Steven Pinker
Onion sentences show that a grammar and a parser are different things. A person can implicitly “know” constructions that he or she can never understand, in the same way that Alice knew addition despite the Red Queen’s judgment:
“Can you do addition?” the White Queen asked. “What’s one and one and one and one and one and one and one and one and one and one?”
“I don’t know,” said Alice. “I lost count.”
“She can’t do Addition,” the Red Queen interrupted.
Why does the human parser seem to lose count? Is there not enough room in short-term memory to hold more than one or two dangling phrases at a time? The problem must be more subtle. Some three-layer onion sentences are a little hard because of the memory load but are not nearly as opaque as the has has has sentence:
The cheese that some rats I saw were trying to eat turned out to be rancid.
The policies that the students I know object to most strenuously are those pertaining to smoking.
The guy who is sitting between the table that I like and the empty chair just winked.
The woman who the janitor we just hired hit on is very pretty.
What boggles the human parser is not the amount of memory needed but the kind of memory: keeping a particular kind of phrase in memory, intending to get back to it, at the same time as it is analyzing another example of that very same kind of phrase. Examples of these “recursive” structures include a relative clause in the middle of the same kind of relative clause, or an if…then sentence inside another if…then sentence. It is as if the human sentence parser keeps track of where it is in a sentence not by writing down a list of currently incomplete phrases in the order in which they must be completed, but by writing a number in a slot next to each phrase type on a master checklist. When a type of phrase has to be remembered more than once, so that both it (the cat that…) and the identical type of phrase it is inside of (the rat that…) can be completed in order, there is not enough room on the checklist for both numbers to fit, and the phrases cannot be completed properly.
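The contrast between the two bookkeeping schemes can be sketched in a few lines of Python. This is only a toy illustration, not a model taken from the text: the event format, the names `stack_tracker` and `checklist_tracker`, and the one-slot-per-phrase-type limit are all assumptions made for the example.

```python
# Hypothetical event format (an assumption for this sketch):
#   ("open", phrase_type, label)  starts a phrase that must be finished later
#   ("close", phrase_type)        finishes a pending phrase of that type

def stack_tracker(events):
    """Ordered list of incomplete phrases: the most recently opened is finished first."""
    stack, completed = [], []
    for event in events:
        if event[0] == "open":
            stack.append(event[2])
        else:
            completed.append(stack.pop())
    return completed

def checklist_tracker(events):
    """One slot per phrase type: a second phrase of the same type overwrites the first."""
    slots, completed = {}, []
    for event in events:
        if event[0] == "open":
            slots[event[1]] = event[2]          # earlier entry of this type is lost
        else:
            completed.append(slots.pop(event[1], None))  # None means "lost track"
    return completed

# A relative clause inside a relative clause (same phrase type twice).
NESTED_SAME = [
    ("open", "relative clause", "the rat that…"),
    ("open", "relative clause", "the cat that…"),
    ("close", "relative clause"),
    ("close", "relative clause"),
]

# A relative clause inside an if…then sentence (two different phrase types).
NESTED_MIXED = [
    ("open", "if-then", "if…then"),
    ("open", "relative clause", "the cat that…"),
    ("close", "relative clause"),
    ("close", "if-then"),
]

print(stack_tracker(NESTED_SAME))      # both phrases completed, innermost first
print(checklist_tracker(NESTED_SAME))  # the outer phrase has been overwritten
print(checklist_tracker(NESTED_MIXED)) # different types: nothing is lost
```

Under these assumptions the checklist handles nesting of different phrase types without trouble and fails only when the same type must be remembered twice, which is the pattern the onion sentences show.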
Unlike memory, which people are bad at and computers are good at, decision-making is something that people are good at and computers are bad at. I contrived the toy grammar and the baby sentence we have just walked through so that every word had a single dictionary entry (that is, was at the right-hand side of only one rule). But all you have to do is open up a dictionary, and you will see that many nouns have a secondary entry as a verb, and vice versa. For example, dog is listed a second time, as a verb, for sentences like Scandals dogged the administration all year. Similarly, in real life hot dog is not only a noun but a verb, meaning “to show off.” And each of the verbs in the toy grammar should also be listed as nouns, because English speakers can talk of cheap eats, his likes and dislikes, and taking a few bites. Even the determiner one, as in one dog, can have a second life as a noun, as in Nixon’s the one.
These local ambiguities present a parser with a bewildering number of forks at every step along the road. When it comes across, say, the word one at the beginning of a sentence, it cannot simply build a branch that treats it as a determiner but must also keep in mind a branch that treats it as a noun. Similarly, it has to jot down two rival branches when it comes across dog, one in case it is a noun, the other in case it is a verb. To handle one dog, it would need to check four possibilities: determiner-noun, determiner-verb, noun-noun, and noun-verb. Of course determiner-verb can be eliminated because no rule of grammar allows it, but it still must be checked.
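The four-way check for one dog can be written out as a small Python sketch. The mini-lexicon and the `DISALLOWED_PAIRS` filter are hypothetical stand-ins for a real dictionary and a real grammar; only the category labels and the determiner-verb exclusion come from the passage.

```python
from itertools import product

# Hypothetical mini-lexicon: each word lists every category it can belong to.
LEXICON = {
    "one": ["determiner", "noun"],
    "dog": ["noun", "verb"],
}

# Stand-in for "no rule of grammar allows it" (an assumption for this sketch).
DISALLOWED_PAIRS = {("determiner", "verb")}

def category_sequences(words):
    """Every combination of categories the parser has to consider."""
    return list(product(*(LEXICON[word] for word in words)))

def surviving_sequences(words):
    """Check each candidate and drop those containing a disallowed adjacent pair."""
    return [
        seq for seq in category_sequences(words)
        if not any(pair in DISALLOWED_PAIRS for pair in zip(seq, seq[1:]))
    ]

for seq in category_sequences(["one", "dog"]):
    print("-".join(seq))
```

The number of candidates is the product of the number of entries per word, so every additional ambiguous word multiplies the work, even though candidates such as determiner-verb are discarded as soon as they are checked.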
It gets even worse when the words are grouped into phrases, because phrases can fit inside larger phrases in many different ways. Even in our toy grammar, a prepositional phrase (PP) can go inside either a noun phrase or a verb phrase, as in the ambiguous discuss sex with Dick Cavett, where the writer intended the PP with Dick Cavett to go inside the verb phrase (discuss it with him) but readers can interpret it as going inside the noun phrase (sex with him). These ambiguities are the rule, not the exception; there can be dozens or hundreds of possibilities to check at every point in a sentence. For example, after processing The plastic pencil marks…, the parser has to keep several options open: it can be a four-word noun phrase, as in The plastic pencil marks were ugly, or a three-word noun phrase plus a verb, as in The plastic pencil marks easily. In fact, even the first two words, The plastic…, are temporarily ambiguous: compare The plastic rose fell with The plastic rose and fell.
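The two attachments of the prepositional phrase can be made concrete with a small exhaustive parser. The rules below are an assumed fragment written for this sketch (the toy grammar itself is not reproduced in this section), "Dick Cavett" is treated as a single token for simplicity, and the function name `parses` is hypothetical.

```python
from functools import lru_cache

# Assumed rule fragment: a PP may attach inside a VP or inside an NP.
BINARY_RULES = {
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("NP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "discuss": {"V"},
    "sex": {"NP"},
    "with": {"P"},
    "Dick Cavett": {"NP"},
}

def parses(words, symbol):
    """Return every bracketed tree that analyzes all of `words` as `symbol`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(sym, start, end):
        trees = []
        # A single word can be labeled directly from the lexicon.
        if end - start == 1 and sym in LEXICON.get(words[start], ()):
            trees.append(f"[{sym} {words[start]}]")
        # Otherwise try every rule and every split point of the span.
        for left, right in BINARY_RULES.get(sym, ()):
            for mid in range(start + 1, end):
                for left_tree in build(left, start, mid):
                    for right_tree in build(right, mid, end):
                        trees.append(f"[{sym} {left_tree} {right_tree}]")
        return tuple(trees)

    return list(build(symbol, 0, len(words)))

for tree in parses(["discuss", "sex", "with", "Dick Cavett"], "VP"):
    print(tree)
```

With this fragment the parser finds one tree in which the PP sits inside the noun phrase and one in which it sits inside the verb phrase. Nothing in the rules themselves prefers the reading the writer intended.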
If it were just a matter of keeping track of all the possibilities at each point, a computer would have little trouble. It might churn away for minutes on a simple sentence, or use up so much short-term memory that the printout would spill halfway across the room, but eventually most of the possibilities at each decision point would be contradicted by later information in the sentence. If so, a single tree and its associated meaning should pop out at the end of the sentence, as in the toy example. When the local ambiguities fail to cancel each other out and two consistent trees are found for the same sentence, we should have a sentence that people find ambiguous, like
Ingres enjoyed painting his models nude.
My son has grown another foot.
Visiting relatives can be boring.
Vegetarians don’t know how good meat tastes.
I saw the man with the binoculars.
But here is the problem. Computer parsers are too meticulous for their own good. They find ambiguities that are quite legitimate, as far as English grammar is concerned, but that would never occur to a sane person. One of the first computer parsers, developed at Harvard in the 1960s, provides a famous example. The sentence Time flies like an arrow is surely unambiguous if there ever was an unambiguous sentence (ignoring the difference between literal and metaphorical meanings, which have nothing to do with syntax). But to the surprise of the programmers, the sharp-eyed computer found it to have five different trees!
Time proceeds as quickly as an arrow proceeds (the intended reading).
Measure the speed of flies in the same way that you measure the speed of an arrow.
Measure the speed of flies in the same way that an arrow measures the speed of flies.
Measure the speed of flies that resemble an arrow.
Flies of a particular kind, time-flies, are fond of an arrow.
Among computer scientists the discovery has been summed up in the aphorism “Time flies like an arrow; fruit flies like a banana.” Or consider the song line Mary had a little lamb. Unambiguous? Imagine that the second line was: With mint sauce. Or: And the doctors were surprised. Or: The tramp!
There is even structure in seemingly nonsensical lists of words. For example, this fiendish string devised by my student Annie Senghas is a grammatical sentence:
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
American bison are called buffalo. A kind of bison that comes from Buffalo, New York, could be called a Buffalo buffalo. Recall that there is a verb to buffalo that means “to overwhelm, to intimidate.” Imagine that New York State bison intimidate one another: (The) Buffalo buffalo (that) Buffalo buffalo (often) buffalo (in turn) buffalo (other) Buffalo buffalo. The psycholinguist and philosopher Jerry Fodor has observed that a Yale University football cheer, Bulldogs Bulldogs Bulldogs Fight Fight Fight!, is a grammatical sentence, albeit a triply center-embedded one.
How do people home in on the sensible analysis of a sentence, without tarrying over all the grammatically legitimate but bizarre alternatives? There are two possibilities. One is that our brains are like computer parsers, computing dozens of doomed tree fragments in the background, and the unlikely ones are somehow filtered out before they reach consciousness. The other is that the human parser somehow gambles at each step about the alternative most likely to be true and then plows ahead with that single interpretation as far as possible. Computer scientists call these alternatives “breadth-first search” and “depth-first search.”
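The difference between the two strategies can be sketched in Python using the plastic rose examples from earlier. Everything here is an assumption made for illustration: the lexicon, the ordering of categories by supposed likelihood, and the `ACCEPTED` set, which stands in for a real grammar by simply listing the full category sequences the toy allows.

```python
# Hypothetical lexicon: categories listed from most to least likely (assumed ordering).
LEXICON = {
    "the": ["det"],
    "plastic": ["adj", "noun"],
    "rose": ["noun", "verb"],
    "and": ["conj"],
    "fell": ["verb"],
}

# Stand-in for a grammar: the only complete category sequences this toy accepts.
ACCEPTED = {
    ("det", "adj", "noun", "verb"),
    ("det", "noun", "verb", "conj", "verb"),
}

def viable(partial):
    """A partial analysis survives if some accepted sequence begins with it."""
    return any(full[:len(partial)] == partial for full in ACCEPTED)

def breadth_first(words):
    """Carry every viable partial analysis forward in parallel, word by word."""
    frontier = [()]
    for word in words:
        frontier = [
            partial + (cat,)
            for partial in frontier
            for cat in LEXICON[word]
            if viable(partial + (cat,))
        ]
    return [analysis for analysis in frontier if analysis in ACCEPTED]

def depth_first(words, partial=()):
    """Gamble on the likeliest category first and back up only when it fails."""
    if len(partial) == len(words):
        return partial if partial in ACCEPTED else None
    for cat in LEXICON[words[len(partial)]]:
        candidate = partial + (cat,)
        if viable(candidate):
            result = depth_first(words, candidate)
            if result is not None:
                return result
    return None

print(breadth_first("the plastic rose fell".split()))
print(depth_first("the plastic rose and fell".split()))
```

In this sketch the breadth-first version holds both readings of plastic until a later word rules one out, while the depth-first version commits to the adjective reading and has to back up when it reaches and.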
At the level of individual words, it looks as if the brain does a breadth-first search, entertaining, however briefly, several entries for an ambiguous word, even unlikely ones. In an ingenious experiment, the psycholinguist David Swinney had people listen over headphones to passages like the following:
Rumor had it that, for years, the government building had been plagued with problems. The man was not surprised when he found several spiders, roaches, and other bugs in the corner of his room.
Did you notice that the last sentence contains an ambiguous word, bug, which can mean either “insect” or “surveillance device”? Probably not; the second meaning is more obscure and makes no sense in context. But psycholinguists are interested in mental processes that last only milliseconds and need a more subtle technique than just asking people. As soon as the word bug had been read from the tape, a computer flashed a word on a screen, and the person had to press a button as soon as he or she had recognized it. (Another button was available for nonwords like blick.) It is well known that when a person hears one word, any word related to it is easier to recognize, as if the mental dictionary is organized like a thesaurus, so that when one word is found, others similar in meaning are more readily available. As expected, people pressed the button faster when recognizing ant, which is related to bug, than when recognizing sew, which is unrelated. Surprisingly, people were just as primed to recognize the word spy, which is, of course, related to bug, but only to the meaning that makes no sense in the context. It suggests that the brain knee-jerkingly activates both entries for bug, even though one of them could sensibly be ruled out beforehand. The irrelevant meaning is not around long: if the test word appeared on the screen three syllables after bugs instead of right after it, then only ant was recognized quickly; spy was no longer any faster than sew. Presumably that is why people deny that they even entertain the inappropriate meaning.
The psychologists Mark Seidenberg and Michael Tanenhaus showed the same effect for words that were ambiguous as to part-of-speech category, like tires, which we encountered in the ambiguous headline Stud Tires Out. Regardless of whether the word appeared in a noun position, like The tires…, or in a verb position, like He tires…, the word primed both wheels, which is related to the noun meaning, and fatigue, which is related to the verb meaning. Mental dictionary lookup, then, is quick and thorough but not very bright; it retrieves nonsensical entries that must be weeded out later.