The Language Instinct: How the Mind Creates Language
Authors: Steven Pinker
Note that if the subjects had been manipulating something resembling verbal descriptions of the letters, such as “an upright spine with one horizontal segment that extends rightwards from the top and another horizontal segment that extends rightwards from the middle,” the results would have been very different. Among all the topsy-turvy letters, the upside-down versions (180 degrees) should be fastest: one simply switches all the “top”s to “bottom”s and vice versa, and the “left”s to “right”s and vice versa, and one has a new description of the shape as it would appear right-side up, suitable for matching against memory. Sideways letters (90 degrees) should be slower, because “top” gets changed either to “right” or to “left,” depending on whether it lies clockwise (+90 degrees) or counterclockwise (-90 degrees) from the upright. Diagonal letters (45 and 135 degrees) should be slowest, because every word in the description has to be replaced: “top” has to be replaced with either “top right” or “top left,” and so on. So the order of difficulty should be 0, 180, 90, 45, 135, not the majestic rotation of 0, 45, 90, 135, 180 that Cooper and Shepard saw in the data. Many other experiments have corroborated the idea that visual thinking uses not language but a mental graphics system, with operations that rotate, scan, zoom, pan, displace, and fill in patterns of contours.
What sense, then, can we make of the suggestion that images, numbers, kinship relations, or logic can be represented in the brain without being couched in words? In the first half of this century, philosophers had an answer: none. Reifying thoughts as things in the head was a logical error, they said. A picture or family tree or number in the head would require a little man, a homunculus, to look at it. And what would be inside his head: even smaller pictures, with an even smaller man looking at them? But the argument was unsound. It took Alan Turing, the brilliant British mathematician and philosopher, to make the idea of a mental representation scientifically respectable. Turing described a hypothetical machine that could be said to engage in reasoning. In fact this simple device, named a Turing machine in his honor, is powerful enough to solve any problem that any computer, past, present, or future, can solve. And it clearly uses an internal symbolic representation (a kind of mentalese) without requiring a little man or any occult processes. By looking at how a Turing machine works, we can get a grasp of what it would mean for a human mind to think in mentalese as opposed to English.
In essence, to reason is to deduce new pieces of knowledge from old ones. A simple example is the old chestnut from introductory logic: if you know that Socrates is a man and that all men are mortal, you can figure out that Socrates is mortal. But how could a hunk of matter like a brain accomplish this feat? The first key idea is a representation: a physical object whose parts and arrangement correspond piece for piece to some set of ideas or facts. For example, the pattern of ink on this page
Socrates isa man
is a representation of the idea that Socrates is a man. The shape of one group of ink marks, Socrates, is a symbol that stands for the concept of Socrates. The shape of another set of ink marks, isa, stands for the concept of being an instance of, and the shape of the third, man, stands for the concept of man. Now, it is crucial to keep one thing in mind. I have put these ink marks in the shape of English words as a courtesy to you, the reader, so that you can keep them straight as we work through the example. But all that really matters is that they have different shapes. I could have used a star of David, a smiley face, and the Mercedes Benz logo, as long as I used them consistently.
Similarly, the fact that the Socrates ink marks are to the left of the isa ink marks on the page, and the man ink marks are to the right, stands for the idea that Socrates is a man. If I change any part of the representation, like replacing isa with isasonofa, or flipping the positions of Socrates and man, we would have a representation of a different idea. Again, the left-to-right English order is just a mnemonic device for your convenience. I could have done it right-to-left or up-and-down, as long as I used that order consistently.
Keeping these conventions in mind, now imagine that the page has a second set of ink marks, representing the proposition that every man is mortal:
Socrates isa man
Every man ismortal
To get reasoning to happen, we now need a processor. A processor is not a little man (so one needn’t worry about an infinite regress of homunculi inside homunculi) but something much stupider: a gadget with a fixed number of reflexes. A processor can react to different pieces of a representation and do something in response, including altering the representation or making new ones. For example, imagine a machine that can move around on a printed page. It has a cutout in the shape of the letter sequence isa, and a light sensor that can tell when the cutout is superimposed on a set of ink marks in the exact shape of the cutout. The sensor is hooked up to a little pocket copier, which can duplicate any set of ink marks, either by printing identical ink marks somewhere else on the page or by burning them into a new cutout.
Now imagine that this sensor-copier-creeper machine is wired up with four reflexes. First, it rolls down the page, and whenever it detects some isa ink marks, it moves to the left, and copies the ink marks it finds there onto the bottom left corner of the page. Let loose on our page, it would create the following:
Socrates isa man
Every man ismortal
Socrates
Its second reflex, also in response to finding an isa, is to get itself to the right of that isa and copy any ink marks it finds there into the holes of a new cutout. In our case, this forces the processor to make a cutout in the shape of man. Its third reflex is to scan down the page checking for ink marks shaped like Every, and if it finds some, seeing if the ink marks to the right align with its new cutout. In our example, it finds one: the man in the middle of the second line. Its fourth reflex, upon finding such a match, is to move to the right and copy the ink marks it finds there onto the bottom center of the page. In our example, those are the ink marks ismortal. If you are following me, you’ll see that our page now looks like this:
Socrates isa man
Every man ismortal
Socrates ismortal
A primitive kind of reasoning has taken place. Crucially, although the gadget and the page it sits on collectively display a kind of intelligence, there is nothing in either of them that is itself intelligent. Gadget and page are just a bunch of ink marks, cutouts, photocells, lasers, and wires. What makes the whole device smart is the exact correspondence between the logician’s rule “If X is a Y and all Y’s are Z, then X is Z” and the way the device scans, moves, and prints. Logically speaking, “X is a Y” means that what is true of Y is also true of X, and mechanically speaking, X isa Y causes what is printed next to the Y to be also printed next to the X. The machine, blindly following the laws of physics, just responds to the shape of the ink marks isa (without understanding what it means to us) and copies other ink marks in a way that ends up mimicking the operation of the logical rule. What makes it “intelligent” is that the sequence of sensing and moving and copying results in its printing a representation of a conclusion that is true if and only if the page contains representations of premises that are true. If one gives the device as much paper as it needs, Turing showed, the machine can do anything that any computer can do, and perhaps, he conjectured, anything that any physically embodied mind can do.
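The four reflexes can be sketched in a few lines of Python. This is only an illustrative toy, not anything from the text itself: the function name run_processor and the choice to model the page as lists of tokens are assumptions, and the sketch simplifies the first reflex by holding the copied subject until a match is found rather than printing it at once. The point it tries to preserve is that the code only compares shapes (token equality) and never interprets them.

```python
# A toy model of the sensor-copier-creeper. The page is a list of lines and
# each line is a list of opaque "ink mark" tokens. The processor tests tokens
# only for sameness of shape; it attaches no meaning to them.
# run_processor and this page layout are hypothetical names for illustration.

def run_processor(page):
    """Apply the four reflexes to each line containing isa; return new lines."""
    printed = []
    for line in page:
        if "isa" not in line:
            continue
        i = line.index("isa")
        # Reflex 1: copy the ink marks to the left of isa.
        subject = line[:i]
        # Reflex 2: burn the ink marks to the right of isa into a new cutout.
        cutout = line[i + 1:]
        if not cutout:
            continue  # nothing to the right of isa, so nothing to match
        # Reflex 3: scan for lines shaped like Every followed by marks
        # that align with the new cutout.
        for other in page:
            if other[:1] == ["Every"] and other[1:1 + len(cutout)] == cutout:
                # Reflex 4: copy whatever lies to the right of the match.
                predicate = other[1 + len(cutout):]
                printed.append(subject + predicate)
    return printed


page = [
    ["Socrates", "isa", "man"],
    ["Every", "man", "ismortal"],
]
for new_line in run_processor(page):
    print(" ".join(new_line))  # prints "Socrates ismortal"
```

Swapping the tokens for arbitrary shapes, for example X, Y, and Z in place of Socrates, man, and ismortal, gives the corresponding result, which is the sense in which only consistent use of the shapes matters.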
Now, this example uses ink marks on paper as its representation and a copying-creeping-sensing machine as its processor. But the representation can be in any physical medium at all, as long as the patterns are used consistently. In the brain, there might be three groups of neurons, one used to represent the individual that the proposition is about (Socrates, Aristotle, Rod Stewart, and so on), one to represent the logical relationship in the proposition (is a, is not, is like, and so on), and one to represent the class or type that the individual is being categorized as (men, dogs, chickens, and so on). Each concept would correspond to the firing of a particular neuron; for example, in the first group of neurons, the fifth neuron might fire to represent Socrates and the seventeenth might fire to represent Aristotle; in the third group, the eighth neuron might fire to represent men, the twelfth neuron might fire to represent dogs. The processor might be a network of other neurons feeding into these groups, connected together in such a way that it reproduces the firing pattern in one group of neurons in some other group (for example, if the eighth neuron is firing in group 3, the processor network would turn on the eighth neuron in some fourth group, elsewhere in the brain). Or the whole thing could be done in silicon chips. But in all three cases the principles are the same. The way the elements in the processor are wired up would cause them to sense and copy pieces of a representation, and to produce new representations, in a way that mimics the rules of reasoning. With many thousands of representations and a set of somewhat more sophisticated processors (perhaps different kinds of representations and processors for different kinds of thinking), you might have a genuinely intelligent brain or computer. 
Add an eye that can detect certain contours in the world and turn on representations that symbolize them, and muscles that can act on the world whenever certain representations symbolizing goals are turned on, and you have a behaving organism (or add a TV camera and set of levers and wheels, and you have a robot).
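The neural version of the same idea can be sketched as well. This is a deliberately crude illustration under stated assumptions: the group size of 20 and the helper names fire and copy_pattern are invented here, and only the neuron positions (fifth for Socrates, eighth for men) come from the example above. It shows a processor that reproduces one group's firing pattern in another group without reading what the pattern stands for.

```python
# Each group is a list of booleans, one per neuron; True means firing.
# GROUP_SIZE, fire, and copy_pattern are hypothetical, for illustration only.
GROUP_SIZE = 20

def fire(index):
    """Return a group in which only the neuron at 1-based position index fires."""
    return [i == index for i in range(1, GROUP_SIZE + 1)]

def copy_pattern(source_group):
    """Processor network: reproduce the source firing pattern in a new group."""
    return [neuron for neuron in source_group]


group1 = fire(5)   # individual: the fifth neuron stands for Socrates
group3 = fire(8)   # class: the eighth neuron stands for men
group4 = copy_pattern(group3)  # some fourth group, elsewhere in the brain

firing = [i for i, on in enumerate(group4, start=1) if on]
print(firing)  # prints [8]
```

Nothing in copy_pattern depends on which concept the eighth neuron represents, just as the paper machine never depends on what the ink marks mean.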
This, in a nutshell, is the theory of thinking called “the physical symbol system hypothesis” or the “computational” or “representational” theory of mind. It is as fundamental to cognitive science as the cell doctrine is to biology and plate tectonics is to geology. Cognitive psychologists and neuroscientists are trying to figure out what kinds of representations and processors the brain has. But there are ground rules that must be followed at all times: no little men inside, and no peeking. The representations that one posits in the mind have to be arrangements of symbols, and the processor has to be a device with a fixed set of reflexes, period. The combination, acting all by itself, has to produce the intelligent conclusions. The theorist is forbidden to peer inside and “read” the symbols, “make sense” of them, and poke around to nudge the device in smart directions like some deus ex machina.
Now we are in a position to pose the Whorfian question in a precise way. Remember that a representation does not have to look like English or any other language; it just has to use symbols to represent concepts, and arrangements of symbols to represent the logical relations among them, according to some consistent scheme. But though internal representations in an English speaker’s mind don’t have to look like English, they could, in principle, look like English, or like whatever language the person happens to speak. So here is the question: Do they in fact? For example, if we know that Socrates is a man, is it because we have neural patterns that correspond one-to-one to the English words Socrates, is, a, and man, and groups of neurons in the brain that correspond to the subject of an English sentence, the verb, and the object, laid out in that order? Or do we use some other code for representing concepts and their relations in our heads, a language of thought or mentalese that is not the same as any of the world’s languages? We can answer this question by seeing whether English sentences embody the information that a processor would need to perform valid sequences of reasoning, without requiring any fully intelligent homunculus inside doing the “understanding.”
The answer is a clear no. English (or any other language people speak) is hopelessly unsuited to serve as our internal medium of computation. Consider some of the problems.
The first is ambiguity. These headlines actually appeared in newspapers:
Child’s Stool Great for Use in Garden
Stud Tires Out
Stiff Opposition Expected to Casketless Funeral Plan
Drunk Gets Nine Months in Violin Case
Iraqi Head Seeks Arms
Queen Mary Having Bottom Scraped
Columnist Gets Urologist in Trouble with His Peers
Each headline contains a word that is ambiguous. But surely the thought underlying the word is not ambiguous; the writers of the headlines surely knew which of the two senses of the words stool, stud, and stiff they themselves had in mind. And if there can be two thoughts corresponding to one word, thoughts can’t be words.
The second problem with English is its lack of logical explicitness. Consider the following example, devised by the computer scientist Drew McDermott:
Ralph is an elephant.
Elephants live in Africa.
Elephants have tusks.
Our inference-making device, with some minor modifications to handle the English grammar of the sentences, would deduce “Ralph lives in Africa” and “Ralph has tusks.” This sounds fine but isn’t. Intelligent you, the reader, knows that the Africa that Ralph lives in is the same Africa that all the other elephants live in, but that Ralph’s tusks are his own. But the symbol-copier-creeper-sensor that is supposed to be a model of you doesn’t know that, because the distinction is nowhere to be found in any of the statements. If you object that this is just common sense, you would be right, but it’s common sense that we’re trying to account for, and English sentences do not embody the information that a processor needs to carry out common sense.
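A small sketch makes the gap concrete. It is an assumption-laden toy, not McDermott’s formulation: the function naive_inferences and the tuple encoding are invented for illustration, and the singular and plural forms have been reconciled by hand, standing in for the “minor modifications” to handle English grammar. The copier attaches each predicate stated about elephants to Ralph, and the two conclusions come out with exactly the same form.

```python
# A naive copier over English-like statements. naive_inferences and the
# tuple encoding below are hypothetical, chosen only to illustrate the point.

def naive_inferences(facts, generalizations):
    """Copy every predicate stated about a class onto each instance of it."""
    conclusions = []
    for individual, kind in facts:
        for subject, predicate in generalizations:
            if subject == kind:
                conclusions.append((individual, predicate))
    return conclusions


facts = [("Ralph", "elephant")]
generalizations = [
    ("elephant", "lives in Africa"),
    ("elephant", "has tusks"),
]
for individual, predicate in naive_inferences(facts, generalizations):
    print(individual, predicate)
# prints:
# Ralph lives in Africa
# Ralph has tusks
```

Nothing in either conclusion records that Africa is one place shared by all elephants while the tusks belong to Ralph alone; that information would have to be carried by a richer representation than the sentences supply.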