It Began with Babbage (50 page)

Read It Began with Babbage Online

Authors: Subrata Dasgupta

BOOK: It Began with Babbage
2.86Mb size Format: txt, pdf, ePub
V

Simon would later commemorate December 15, 1955, as the day of the birth of computer-based heuristic reasoning.
35
He had begun to analyze the proof of one of the theorems in the
Principia
and, by early December, some ideas were emerging about the nature of the heuristics Whitehead and Russell had used.
36
On December 15, he simulated successfully by hand the execution of a program to generate a proof of one of the theorems in the
Principia
. So the basic strategies used in LT were discovered by “actual pencil-and-paper work.”
37

Principia Mathematica
began with fundamental (“atomic”) concepts such as variables (denoted by
p, q,
… and so forth), a set of connectives (denoted by ¬ [not], + [or], —> [implies], and => [entails]) that link variables to form expressions, definitions, a set of five axioms, and three rules of inference. For our purposes here, to illustrate how LT uses heuristic reasoning, it is sufficient just to state one definition, one of the axioms, and two of the rules of inference (
Table 14.1
).

Suppose LT is given the expression

E:
p
→
¬p => ¬p

and asked to obtain a proof for it.
38
As it happens, a proof for
E
would involve the following steps:

TABLE
14.1 A Tiny Axiomatic System

Definition
D
:
p → q
def
¬p v q

Axiom
A
:
r
v
r
=>
r

Inference Rules:

Rule of Substitution
: If
A(p)
is any true expression containing the variable
p
and
B
is any expression, then
A(B)
is also a true expression.

Rule of Replacement
: Any expression may be replaced by its definition.

1. By axiom
A
:
p
v
p
=>
p

2. Substituting ¬
p
for
p
: ¬
p
v ¬
p
=> ¬
p

3. Replacement of left subexpression:
p
→ ¬
p
=> ¬
p,
which is the expression
E
.

The question is: How can LT search for a proof of expression
E
without considering
all
the valid substitutions involving the five axioms?
39
The heuristic Newell and Simon used was to “work backward” from the expression to be proved—from the characteristics of the expression, “cues” are extracted about the most promising path to pursue.
40
Newell and Simon called this heuristic the
method of substitution
—later, in the literature of heuristic programming, it would viewed as an instance of what came to be called
backward chaining
. In the case of LT, it entailed the search for an axiom or theorem similar to the expression to be proved. If a similarity is found, then the program tries to match the expression to the similar axiom or theorem by appropriate transformations of the expression; if successful, the expression is proved and the chain of transformations
is
the proof. Otherwise, if no similar axiom or theorem is found, the strategy fails.
41

By “similarity” Newell and Simon meant a structural matching between the expression to be proved and one or more of the axioms or theorems already proved. Precise “measures” of similarity were defined. Informally, these measures would identify the expressions

p
→ ¬
p
=> ¬
p

p
v
p
=>
p

as similar because they have the same structure—pattern of variables and connectives—whereas

p
=>
q
v
p

p
v
p
=>
p

would be dissimilar because their structures do not match.

To illustrate the method of substitution heuristic, we can follow how LT produced a proof of the expression

E:
p
→ ¬
p
=> ¬
p

Because of their structural similarity, LT finds a match between
E
and the axiom

A:
r
v
r
=>
r

However, on the left hand side of =>, there is a mismatch between the connectives → in
E
and v in
A
. To make them match, the v in
A
must change to →. This can be done by applying the definition
D
. But, before that can be done, a ¬ must be placed before
r
in
A
—that is, by substituting a variable ¬
t
for
r
in
A
according to the rule of substitution. This produces

A
′
:
¬t
v ¬
t
=> ¬
t

Now definition
D
can be applied to change ¬
t
v ¬
t
to
t
→
¬t
. The new situation is

E:
p
→
¬p
=> ¬
p

A
′′
:
t
→ ¬
t
=> ¬
t

The only difference now between
E
and
A
″ is that there is a variable
p
in the former and a variable
t
in the latter. Applying the rule of substitution,
p
can be substituted for
t
in
A
″ resulting in

A*:
p
→ ¬
p
=> ¬
p

There is now a complete match between
E
and
A*
; a proof of
E
has been obtained. In this process, the method of substitution relied on cues gleaned from the features of the theorem to be proved. Such cues reduced the amount of search but, of course, there was no prior guarantee of success. The cues may have led down a blind alley. Ultimately, the criterion of success for a heuristic is whether it actually works.

In fact, the method of substitution did not work all the time. Of 67 theorems in the second chapter of the
Principia
, only 20 could be proved using this heuristic. Other heuristic principles would be needed for the other theorems.
42

VI

LT established a bridge between human and machine cognition, so much so that Simon would write in October 1956 to Bertrand Russell of this program and what it had achieved. To which Russell
replied dryly that had he and Whitehead known of this possibility, they would not have wasted the years “doing it by hand.”
43

LT was a means to an end. Newell and Simon were interested in universalizing the insights they had gained in designing this program to the realm of human problem solving in general. Moreover, the paper on LT was published in a journal on information theory, a mathematical branch of communication engineering (see
Chapter 11
, Section VI), and was unlikely to be read by psychologists, those who might be most interested professionally in the nature of heuristic thinking.

Two years later, in 1958, Allen Newell, Cliff Shaw, and Herbert Simon published an article titled “Elements of a Theory of Human Problem Solving” in
Psychological Review
, a prominent American journal in psychology. Their theory, they stated, “explain[ed] problem-solving behavior in terms of what we shall call
information processes
.”
44
It was an explicitly computational theory to explain the observed or observable behavior of organisms.
45

For Simon and Newell, the terms
information processing
and
symbol processing
were synonymous. Over time, they seemed to prefer the latter term. If, in their earliest joint publications, they chose to use
information processing
, their last joint publication (in 1975) spoke firmly of symbol systems, which are at the core of intelligent behavior.
46

Let us then refer to their theory as a
symbol processing model
(SPM) of cognition. SPM absorbs and generalizes Simon's previous model of the “universal decision maker” into a “symbolic problem solver.” SPM was a theory of symbolic problem solving regardless of whether the problem solver was a human being, a computer, or some other kind of organism. SPM represented a certain
class
of problem-solving mechanisms of which LT was a concrete exemplar. We have seen that Simon's theory of the decision maker was entwined with a satisficing activity in which heuristics were used to overcome bounded rationality. So, also, SPM viewed problem solving as a heuristics-driven enterprise. LT did not go about its task using “simple algorithms” that applied “brute force” to search exhaustively for a solution to a given problem.
47
This was what “conventional” computers did. The algorithmic approach was not feasible in the kind of problem solving in which LT was engaged. In finding a proof for a theorem, the choices of inferences at each stage were extremely large—an embarrassment of riches—from which the program had to select an inference that seemed most promising for the purpose of arriving at a proof. Thus, the method of proof for LT used heuristics. It also used a kind of
learning
in the sense that, once LT proved a theorem, this was stored away in its memory as a potential aid in proving other theorems.
48

SPM, in fact, provided significant insight into how the symbolic processing and mental simulation on which Kenneth Craik had speculated in his
The Nature of Explanation
(1943) could be envisioned
more precisely
(see
Chapter 11
, Section II)—namely, in the form of a heuristic computer program executed by a physical machine. Here was a model of how symbol structures might be represented in a material entity and how they might be manipulated, processed, and transformed into other symbol structures. Furthermore,
SPM suggested how the computer could serve as an
experimental apparatus
, a kind of microscope for the inner eye, with which one could “observe” mental phenomena. By programming the computer one could
simulate
specific higher level cognitive tasks—language understanding, chess playing, scientific concept formation, and so on—a concrete realization of Turing's thought experiment of 1950 (see
Chapter 11
, Section XI). If the simulation produced behavior that was consistent with what was predicted, then the computational (or symbol processing) model on which the program was based could become a theory of how humans might think and carry out cognitive tasks.

VII

Curiously, in their 1956 paper on LT, Newell and Simon did not mention the term
artificial intelligence
. This term, in fact, emerged the year before in a document describing a “summer research project on artificial intelligence” to be held in Dartmouth College in Hanover, New Hampshire, in summer 1956. The author of this proposal was John McCarthy (1927–2011).
49

McCarthy was an assistant professor of mathematics at Dartmouth College. Later, he would move to MIT before joining Stanford University, where he founded their artificial intelligence laboratory. McCarthy is generally regarded as the coiner of the term “artificial intelligence”, which, in its original sense, meant a kind of intelligence that contrasted to “natural intelligence.” As the Dartmouth proposal explained, artificial intelligence was a phenomenon wherein any aspect of mental activity deemed a manifestation of intelligent action could, at least in principle, be so precisely specified that a computer could be programmed to simulate it.
50

As we have seen, the possibility and the idea of machine intelligence had been in the minds of many ever since computing machines began to be built. Torres y Quevedo speculated on it in 1920 (see
Chapter 3
, Section IX); in 1943, Craik dwelled on the idea of human thinking in symbol processing terms (see
Chapter 11
, Section II). But, with the invention of the electronic digital computer, we find more interesting ideas—Claude Shannon's thoughts on computer chess in 1949, von Neumann comparing the brain with the computer, and Alan Turing's speculations in 1950 on what it would take for a computer to “fool” a human into thinking it was engaged with another human (see
Chapter 11
, Sections IV–VII).

It seems fair to say that LT was a seminal invention that transposed artificial intelligence from the realm of speculation into the realm of the real. LT was a working liminal artifact, a computer program that simulated the way human logicians went about proving theorems. It was an
actual
piece of artificial intelligence. The symbol processing model Newell, Shaw, and Simon presented in 1958 was a theory of problem solving.

It was a theory of
human
problem solving, in fact, which explained how Craik's “symbol processing in the head” could happen. As such, the Newell–Shaw–Simon paper of
1958 was seminal in the development of cognitive science, the multidisciplinary study of cognition. If, as some people believe, computational processes are at the core of cognition (in humans and animals)
51
—although not everyone does
52
—then the Newell–Shaw–Simon paper was at the very vanguard of this belief. It heralded the beginning of the “information processing paradigm” in cognitive psychology.
53
At the same time, LT was seminal to the development of the idea of artificial intelligence, both the phenomenon as envisioned by McCarthy and colleagues in 1955 and as a branch—a subdiscipline—of computer science.

Other books

The Gorgon by Kathryn Le Veque
Diamond Willow by Helen Frost
War of the Worlds 2030 by Stephen B. Pearl
The Professor's Student by Helen Cooper
Salem's Cipher by Jess Lourey