Dependencies All the Way Down
 
To understand and be understood is the hope of every newcomer to a linguistic community (as those of us who have moved to a strange new country will always remember). For one category of new arrivals, however, learning language touches, at least initially, upon weightier issues than fitting in at school or improving one’s chances of getting into college. This category consists of those citizens-in-the-making who face the task of learning a language, any language, for the very first time: newborn babies.
As creatures whose well-being depends (critically at first) on the goodwill of others, babies are eager to learn to interact with people and with their environment, and are good at it. At the same time, the memes that constitute language survive and flourish, along with other aspects of culture, insofar as they are good at being disseminated and learned. Our computational understanding of cognition allows us to state precisely what it means for learners of language to be good and for linguistic knowledge to be learnable. As in all learning, the common denominator here is statistical inference—the only way of gleaning knowledge from mere data.[25]
Because knowledge is a means for organizing data, a baby learner who does not yet know much about the world is running the danger of being overwhelmed by what I called in Chapter 4 the maelstrom of sensory information. Or so it would seem: because knowledge transforms both the recollection of experience from which it is distilled and the appreciation of subsequent experience, the very same situation feels different to a baby and to a grown-up, forcing scientists to resort to guessing when they try to imagine what it is like to be a baby. A particularly memorable guess of this kind has been put forward by William James: “The baby, assailed by eye, ear, nose, skin and entrails at once, feels it all as one great blooming, buzzing confusion.”[26]
Even if James guessed right and babies are indeed confused by the information assault, they do not remain so for long, which suggests that they are well prepared to deal with their experiences. As before, to give co-evolution its due, we should credit not just the learner but also the data by noting that babies’ experiences, in turn, are often enough structured so as to facilitate being dealt with. In the acquisition of language, this push-pull division of labor is readily apparent at every level and stage of the process.
Very high on any language learner’s agenda is learning the names of things. It may be, for instance, that the baby’s community refers to an object that looks like this [image of a dog] as собака. Such an object/name pairing would be very easy to learn if the name were always attached to the object, as a kind of bar code (acoustic, gestural, or, as in a grocery store, printed). Alas for the learner, a dog (in Russia or elsewhere) may appear on the scene without being named, and the word that stands for it may occur in a conversation on which the baby is eavesdropping without its object being present. Even when the name and the object happen to co-occur, they may do so within a wide window of space-time that they share with various other objects and sounds.
The right thing to do in this situation is to treat suspected pairings of words with objects as provisional hypotheses and to look out for statistical evidence that would allow principled arbitration among them. This strategy would be infeasible if all possible hypotheses were entertained. Good learners that they are, babies solve this problem by cutting a lot of corners: they only ever consider the very few hypotheses that are compatible with some very strong prior assumptions about what is and is not likely to happen in their world.[27]
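To make this concrete, here is a minimal Python sketch of cross-situational word learning under a constrained hypothesis space. The word собака (transliterated below as sobaka) is borrowed from the running example; the scenes, the other words, and the objects are invented for illustration and are not data from the text.

from collections import defaultdict

# Toy "scenes": the words a learner hears, paired with the objects in view.
# Names and objects do not always co-occur, just as in the text.
scenes = [
    ({"sobaka", "smotri"}, {"dog", "slipper"}),  # word heard; dog and slipper in view
    ({"sobaka"},           {"dog", "ball"}),
    ({"tapok"},            {"slipper", "dog"}),  # dog present but unnamed
    ({"sobaka", "tapok"},  {"dog", "slipper"}),
]

# Provisional hypotheses: every word/object pairing observed so far,
# scored by how often the two co-occur across scenes.
counts = defaultdict(int)
for words, objects in scenes:
    for w in words:
        for o in objects:
            counts[(w, o)] += 1

def best_referent(word):
    """Arbitrate among the surviving hypotheses for one word."""
    candidates = {o: n for (w, o), n in counts.items() if w == word}
    return max(candidates, key=candidates.get)

print(best_referent("sobaka"))  # -> 'dog': the statistically most reliable pairing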
Being good at learning does not mean being rash or foolhardy: such corner-cutting is fully endorsed by the Bayesian theory of statistical inference. (Like blind mole rats, babies are Bayesians without knowing it.) Bayes’ theorem, as you will remember, prescribes how the prior probability of a hypothesis should be modified by new data. The priors for word learning are themselves learned over evolutionary time, so that even an absolute novice learner can effectively rule out hypotheses that are incompatible with a few basic assumptions built into his or her brain.
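As a worked illustration of that prescription, the sketch below runs a single Bayesian update on three hypotheses about what собака might mean. The priors and likelihoods are invented numbers, chosen only to show the arithmetic.

# One Bayesian update step. A "combination" hypothesis starts out improbable,
# reflecting priors shaped over evolutionary time.
priors = {"dog": 0.45, "slipper": 0.45, "dog+slipper": 0.10}

# Assumed likelihood of hearing the word while a dog is in view,
# under each hypothesis about the word's meaning.
likelihood = {"dog": 0.8, "slipper": 0.2, "dog+slipper": 0.5}

# Bayes' theorem: the posterior is proportional to prior times likelihood.
unnormalized = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

print(posterior)  # 'dog' gains probability mass (0.72); its rivals lose it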
One such assumption is that words name objects or events that “hang together” in space and time. This would rule out, among others, the hypothesis that собака refers to a “combination” object, such as a dog with a slipper in its mouth. With co-evolution in mind, we may observe that this assumption is derived from certain properties of the world we live in and has to do with the data’s contribution to its own learnability: statistically speaking, a word for “dog + slipper” would be hard to learn, as well as unable to compete with “dog” and “slipper.” Another assumption is that a hitherto unfamiliar word spoken by a caregiver refers to the most salient object in the present scene. This would rule out the hypothesis that собака, uttered in a room with an excited golden retriever in it, refers to the batik on the wall.[28]
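How such built-in assumptions might prune the hypothesis space before any statistics are gathered can be sketched as follows; the two filters stand in for the whole-object and salience assumptions, and the scene and the salience scores are made up for illustration.

def candidate_referents(scene_objects, salience):
    """Generate referent hypotheses for a new word, pruned by two priors."""
    hypotheses = []
    for obj in scene_objects:
        # Whole-object assumption: only things that "hang together" in space
        # and time are candidates, so combination objects are never proposed.
        if "+" in obj:
            continue
        hypotheses.append(obj)
    # Salience assumption: a new word most likely names the most salient
    # object in the scene, so rank the candidates accordingly.
    return sorted(hypotheses, key=lambda o: salience[o], reverse=True)

scene = ["golden retriever", "batik", "dog+slipper"]
salience = {"golden retriever": 0.9, "batik": 0.1, "dog+slipper": 0.3}
print(candidate_referents(scene, salience))  # ['golden retriever', 'batik']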
Mastering the names of objects gradually makes it easier for babies to learn complex constructions—the multi-word patterns of usage that are the entry ticket to any human society. At this level too, the learners rely on assumptions that are themselves learned over many generations and are incorporated into the developing mind’s computational toolbox. The most general principle at work here is that of constrained reuse: learners expect that labels for objects, events, actions, attributes, and qualities, along with a smattering of “service” or function words such as “and” or “that,” will appear and reappear in various combinations that conform to certain statistical patterns.
The learners’ need to resort to statistics to find those patterns exerts selection pressure on the population of memes that give rise to the patterns (and thereby keep language well ordered and therefore usable). The co-evolutionary drive turns the learners’ expectations into a self-fulfilling prophecy: the patterns that thrive under this regime are those that are statistics-friendly. Meme selection pressure also determines what those patterns actually look like. Because the elements of language—phonemes, syllables, words—are digital (so as to promote error-free replication), the patterns in sequences of elements are necessarily defined in terms of the dependence of some elements on others. Formally, this corresponds precisely to the concept of conditional probability—our old friend from earlier chapters.[29]
The simplest example of a dependence pattern is the idiom. For instance, the collocation kick the bucket is defined by the high probability of bucket, given that the preceding words are kick the. The pattern in this case is slightly more general: kick the ___ bucket, where the slot ___ may be left unfilled, or, with lower probability, may be occupied by the word proverbial (any other word there would destroy the idiomatic meaning of the expression).
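The idiom lends itself to a tiny computational illustration: the sketch below estimates the conditional probability of each word that can follow kick the in an invented toy corpus. In practice such counts would come from large samples of actual usage.

from collections import Counter

# Invented toy corpus; frequencies are chosen to mirror the pattern above.
corpus = [
    "kick the bucket", "kick the bucket", "kick the bucket",
    "kick the proverbial bucket",
    "kick the ball",
]

# Count what follows "kick the", then normalize into conditional probabilities.
continuations = Counter(s.split()[2] for s in corpus if s.startswith("kick the"))
total = sum(continuations.values())
for word, n in continuations.most_common():
    print(f"P({word!r} | 'kick the') = {n / total:.2f}")
# bucket dominates; proverbial is rarer; anything else destroys the idiom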
