The Universal Sense
by Seth Horowitz

That was maybe a thirty-second event. As with the sound walk I described earlier in the book, it took hundreds of words to describe the barest outline of the acoustics involved, events that you normally ignore but which involve the motion of trillions of atoms, the mechanical responses of tens of thousands of hair cells, and the activation of millions if not billions of neurons to register events that you normally don’t even bother passing from sensory memory to consciousness. But there is a reason for this. If you paid equal attention to everything, with no automatic ability to parse out what was relevant to your needs, you would soon be overwhelmed by trivia both external and internal.

The mechanisms of attention have mostly been worked out in studies using a technique called dichotic listening, which means presenting a different sound to each ear. An experiment in dichotic listening goes something like this. Suppose you are presented with two different sounds, A and B, at different frequencies, rates, and times. If A and B are widely separated tones presented randomly between your two ears, you tend to perceive them as random sounds. However, if the two streams are distinguished by some perceptual feature, such as their frequency separation (how far apart they are in pitch), their timbre (the fine structure of their sound), their relative loudness (loud versus soft), or their location in space (left or right ear), your brain starts organizing them into separate acoustic streams. So if you are playing A-sharp on a clarinet every half second in the left ear and D-flat on a flute every three-quarters of a second in your right ear, you will group these stimuli into two separate auditory events—a very dull clarinet student on the left and an equally dull but distinguishable flute player on your right. Even if you start playing both sounds at the same time and rate, you will still separate them into different streams, because your ears are differentiating not just the gross time of arrival of the events but the fine temporal structure of the sounds and their absolute frequency content. This type of study represents the simplest form of auditory scene analysis and one of the most basic measures of attention. As an experiment, it has stood up through time and changing techniques, with evidence ranging from early human and animal psychophysics and EEG through the most current studies using magnetoencephalography and fMRI.
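The grouping principle at work here lends itself to a toy sketch. The Python below clusters a sequence of (time, frequency) tones into streams by frequency proximity; the 2.5-semitone threshold and the greedy assignment are assumptions made purely for illustration, not a model of the actual neural mechanism.

    import math

    def semitones(f1, f2):
        # Distance between two frequencies in semitones.
        return abs(12 * math.log2(f2 / f1))

    def segregate(tones, threshold=2.5):
        # Greedily assign each (time, freq) tone to the stream whose most
        # recent frequency is closest, if within `threshold` semitones;
        # otherwise start a new stream. Threshold chosen for illustration.
        streams = []
        for t, f in sorted(tones):
            candidates = [s for s in streams
                          if semitones(s[-1][1], f) <= threshold]
            if candidates:
                best = min(candidates, key=lambda s: semitones(s[-1][1], f))
                best.append((t, f))
            else:
                streams.append([(t, f)])
        return streams

    # Alternating A-sharp (466 Hz) and D-flat (554 Hz) tones, 3 semitones
    # apart, split into two streams rather than one random sequence.
    tones = [(i * 0.5, 466.2 if i % 2 == 0 else 554.4) for i in range(8)]
    print(len(segregate(tones)))  # -> 2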

But presenting individual tones varied by time, frequency, timbre, and position is a lab version of a caged hunt—it’s not the kind of situation you run into in the real world. It is, however, the basis of a very common and robust effect called the “cocktail party effect”: even in a crowded room with lots of voices and background noise, you can still follow an individual voice (within limits, of course). The basis is again synchronous neural responding to specific features of a complex sound, even in noise that should mask it. It also highlights another aspect of auditory attention. In a room full of people speaking, you have a lot of overlap of sounds: a mix of male and female voices with fundamental frequencies ranging from 100 to 500 Hz, some timbre features in common because they are all generated by human vocal folds, different loudnesses depending on each speaker’s distance from the listener and how loudly they speak, and of course different locations. Amidst all of that, if you notice your significant other talking to someone, the familiarity of his or her specific voice and speech patterns (which drive the timing of the sounds made while talking) will activate auditory neural patterns that have been activated numerous times before.
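As a toy illustration of picking out a familiar voice by its features, here is a minimal sketch in the same spirit; the feature values, the stored “template,” and the distance weighting are all invented for the example.

    # Invented feature values and weighting, purely for illustration.
    voices = [
        {"name": "stranger A", "f0_hz": 120.0, "loudness_db": 60.0},
        {"name": "partner",    "f0_hz": 210.0, "loudness_db": 55.0},
        {"name": "stranger B", "f0_hz": 200.0, "loudness_db": 65.0},
    ]
    familiar = {"f0_hz": 212.0, "loudness_db": 50.0}  # stored template

    def mismatch(voice, template):
        # Weighted distance between a voice and the familiar template.
        return (abs(voice["f0_hz"] - template["f0_hz"])
                + 0.5 * abs(voice["loudness_db"] - template["loudness_db"]))

    best = min(voices, key=lambda v: mismatch(v, familiar))
    print(best["name"])  # -> partner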

Hearing something numerous times, even with the great variability in acoustic specifics that occurs in speech, not only causes auditory and higher-center neurons to respond synchronously but actually rewires the synapses in your auditory system to improve the efficiency of responding to those specific traits. This is a general form of neural learning called Hebbian plasticity. Neurons are not just simple connected structures that sum up input from earlier neurons and fire or suppress firing. Neurons are extraordinarily complicated biochemical factories, constantly synthesizing or breaking down neurotransmitters, growth hormones, enzymes, and receptors for certain brain chemicals, and up- and downregulating them based on the demands placed on them during tasks. But the most remarkable thing about neurons is that they will actually grow new processes, so they can change their wiring pattern. Neurons that regularly fire in synchrony, such as those exposed to the harmonic sounds common in speech or music, will interconnect more strongly so that they can more easily influence each other and work together. So something familiar in the midst of noise will jump out of the background by activating a specific population of cells that “recognize” this stimulus.
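The rewiring rule itself is simple enough to sketch. In its textbook form, a connection strengthens in proportion to correlated pre- and postsynaptic activity (delta-w = eta * x * y); the learning rate and activity patterns below are arbitrary illustration values.

    def hebbian_update(w, x, y, eta=0.01):
        # delta-w = eta * x * y: each connection strengthens in
        # proportion to correlated pre- and postsynaptic activity.
        return [[w[i][j] + eta * x[i] * y[j] for j in range(len(y))]
                for i in range(len(x))]

    # Two presynaptic units that repeatedly fire together with the same
    # postsynaptic unit both end up strongly wired to it.
    w = [[0.0], [0.0]]
    for _ in range(100):
        x = [1.0, 1.0]  # synchronous presynaptic activity
        y = [1.0]       # postsynaptic response
        w = hebbian_update(w, x, y)
    print(w)  # both weights have grown to ~1.0 after 100 co-activations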

“Those that fire together wire together” is a general neural principle, but the difference between sound and vision is highlighted by comparing the cocktail party effect with the visual equivalent—a Where’s Waldo? image puzzle. Trying to find the figure wearing the bright red-and-white striped hat and shirt in a complex visual scene usually takes at least thirty seconds (and sometimes much, much more; at times I could swear he is hiding in a cave in Afghanistan), whereas isolating a voice of interest in the cocktail party milieu usually takes place in well under a second.

But the cocktail party effect has an evil twin, one you are likely to run into. You’re on a train or a bus, trying to read, sleep, or just not look at the guy across from you, but the person behind you keeps chattering into his cell phone. Whether he talks loudly or in a constant soft susurration, listening to someone else’s half of a conversation is consistently annoying. A recent study by Lauren Emberson and colleagues found out why, and it has to do with the dark side of attention. They discovered that while hearing a normal conversation was not significantly distracting, hearing a half conversation—a “halversation,” as they called it—caused a serious decrease in cognitive performance. Their hypothesis was that background monitoring of unpredictable sounds results in more distraction for a listener engaged in other tasks. Because you can’t predict the direction of half a conversation, you get more unexpected stimuli, and thus more distraction.

This is one of the problems of having a sensory system that is always “on.” Your auditory system is constantly monitoring the background for change. A sudden change in the sensory environment can break the brain’s attention to a task and redirect it. Was that a footstep behind me in the dark alley, or just an echo from the wall across the street? Is that sudden howling a coyote looking for a neighborhood cat, or just the blind beagle down the block trying to get its owner to let it in? Sound is our alarm system twenty-four hours a day. It is the only sensory system that remains reliable even while we sleep (which probably served our ancestors very well when they were hoping not to be killed in their sleep by predators). A sudden noise tells us something happened—and the auditory system (as opposed to vision) operates quickly enough to provide sufficient synchronized input to determine whether the source of the sound is familiar or whether we need to link it to additional sensory and attentional processes to reconstruct what that something might be even if we can’t see it.

This is important because your other senses—vision, smell, taste, touch, and balance—are all limited in range and scope. Unless you turn your head (which has all kinds of consequences), binocular vision is limited to 120 degrees, with about another 60 degrees of peripheral vision. A smell has to be particularly concentrated for us to detect it at any distance, and even then humans have very limited ability to localize the source of a scent. Taste, touch, and balance are all limited to the extent of our bodies. But on a good day outside, without any temperature inversions or earbuds blocking your ears, you can hear anything within a kilometer or so (about six-tenths of a mile)—if you are standing on solid ground (or floating on water), that’s a hemisphere of about 2 billion cubic meters, or about the volume of 10,000 zeppelins, in case you measure things that way. But if you’re standing outside studying for a test and your phone rings, your attention will swing to that, which is why the zeppelin lands on you—you’re paying attention to the familiar if distracting ringtone and ignoring the more slowly encroaching shadow of the descending (and very quiet) zeppelin.
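A quick back-of-the-envelope check of that volume, assuming a 1-kilometer hearing radius and a Hindenburg-class zeppelin of roughly 200,000 cubic meters (both round numbers chosen for illustration):

    import math

    radius_m = 1_000.0        # "a kilometer or so" hearing radius
    zeppelin_m3 = 200_000.0   # assumed Hindenburg-class gas volume

    hemisphere_m3 = (2.0 / 3.0) * math.pi * radius_m ** 3
    print(f"{hemisphere_m3:.2e} cubic meters")              # ~2.09e+09
    print(f"{hemisphere_m3 / zeppelin_m3:,.0f} zeppelins")  # ~10,472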

This scenario shows the difference between the two types of attention your ears and other senses have to contend with: goal-directed attention (listening closely to your phone conversation as you enter an area with spotty coverage) and sensory-directed attention (being unable to focus on your conversation because the man talking into his cell phone has said the word “bomb” three times in one minute while you’re waiting for your flight to take off). Goal-directed attention makes us focus our sensory and cognitive abilities on a limited set of inputs and can be driven by any of our senses. Most of the time, humans pay default attention to vision. You look around, reach for things by visual guidance (even if it’s to turn up the volume), and get instantly irritated if the lights suddenly go out because some idiot installed infrared motion detectors to save electricity. But any sensory modality can serve for this type of attention. You can follow your nose to find the source of the awful smell coming from the bathroom. You can use your taste buds to keep altering a recipe until it tastes right. You can focus on putting one foot directly in front of the other and ignore the six-foot drop on either side of you if you are showing off by walking on top of a wall. And of course you can listen intently to try to figure out the guitar lick that has been killing your score in Guitar Hero. In all cases of this kind of attention, you highlight the input from the sensory modality (or modalities) providing the most information about the task you have consciously decided is the most important.

But stimulus-based attention is about grabbing and redirecting your attention from elsewhere, including from goal-directed behaviors. It too can be driven by any sensory system—a sudden flicker in your peripheral vision, the smell of smoke, a sudden touch that makes you jump out of your skin. The redirection of attention based on a stimulus requires that something be novel and sudden—in other words, startling. A startle is the most basic form of attentional redirection, and it requires even less of your brain than the simplest auditory scene analysis. Being startled is something that happens to all of us: you’re concentrating on something and then there’s a sudden noise, or someone taps you on the shoulder, or (if you are in California) the ground trembles. And in less than 10 milliseconds (1/100 of a second), you do what I did when I heard the coyote-induced splash: you jump, your heart rate and blood pressure soar, you hunch your shoulders, and your attention swings as you try to find the source of the disturbance. You can be startled by sound, touch, or balance but not by vision, taste, or smell. This is because these first three are mechanosensory systems, relying on the rapid mechanical opening of ion channels to fire a very fast, evolutionarily very old neuronal circuit that activates spinal motor neurons and arousal circuits in your brain.

Every vertebrate has the startle circuit (with variations in expression), as it is a very adaptive way of putting an organism on guard for something novel. According to seminal work by Michael Davis and colleagues in rats (subsequently confirmed in primates and humans), the mammalian auditory startle signal makes its way through a five-neuron circuit, from the cochlea to the ventral cochlear nucleus to the nucleus of the lateral lemniscus to the reticular pontine nucleus and then down to spinal interneurons and finally the motor neurons. Five synapses in 1/100 of a second to go from a sudden sound to a sudden jump, putting you in a defensive posture with tightened, flight-ready muscles, sometimes emitting a loud vocalization (that’s especially true of my wife). The auditory startle is common in vertebrates because it is a very successful evolutionary adaptation to an unseen event. It lets us get our bearings and get the hell out of there, or at least widen our attention to figure out what the noise was.
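Counting out the relay makes the speed vivid. Here is the chain as simple data; the stage names follow the pathway just described, while the per-stage delays are illustrative guesses chosen to sum to roughly the 10-millisecond total, not measured values.

    # Per-stage delays are illustrative guesses (ms), not measured values.
    startle_pathway = [
        ("cochlea -> ventral cochlear nucleus", 2.0),
        ("ventral cochlear nucleus -> nucleus of the lateral lemniscus", 2.0),
        ("lateral lemniscus -> reticular pontine nucleus", 2.0),
        ("reticular pontine nucleus -> spinal interneurons", 2.0),
        ("spinal interneurons -> motor neurons", 2.0),
    ]

    total_ms = sum(delay for _, delay in startle_pathway)
    print(f"{len(startle_pathway)} synapses, ~{total_ms:.0f} ms to a jump")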

But being startled doesn’t necessarily make you afraid. It does increase your sense of arousal—the physiological and psychological state that heightens everything from senses to emotional response. If you turn around and see that it’s your friend who snuck up behind you and yelled “boo” while you were focused on the scary movie, then the sensation of fright goes away and what you feel is a sense of contentment, since she totally deserved getting your drink flung in her face from the reflexive raising of your arms as you turned your head. But what if you turn around and can’t see anything around that might have said “boo”—no idiot friend with a peculiar sense of humor, no speakers in the ceiling, nothing? Then arousal continues, heightening everything, and emotion starts kicking in.

Emotions are a tricky subject to analyze in neuroscientific terms. Emotions are how you feel—complex internal behaviors that affect how you respond to what happens next in your environment until that emotion changes or fades away. Scientists (and others, including philosophers, musicians, filmmakers, educators, politicians, parents, and advertisers) have been studying and applying emotional information for centuries. There have been hundreds of studies and books and even entire schools of thought about what emotions are, how they work, why we have them, and how we can use them, usually by manipulating them in others. Each of these treatments conflicts with the others—even lists of basic emotions show little agreement, ranging from four pairs of “basic” emotions and their opposites, as proposed by Robert Plutchik in the 1980s (joy-sadness, trust-disgust, fear-anger, surprise-anticipation), to forty-eight separate emotions in ten categories, as proposed by the HUMAINE group, creators of the Emotion Annotation and Representation Language. Given that there are disputes even about the basics of what emotions are and what causes them (whether physiological states, as the James-Lange theory suggests, or cognitive ones, as proposed by Richard Lazarus), it is not surprising that attempts to identify the neurobiological underpinnings of emotions are contentious. But one thing consistent across studies of emotion, using techniques ranging from nineteenth-century psychology through twenty-first-century neural imaging, is that one of the most important and fastest-acting triggers for emotion is sound, distributed throughout the cortex by both tonotopic and non-tonotopic pathways from the medial geniculate. So how does sound trigger and contribute to specific emotional states?
