Mind Hacks™: Tips & Tools for Using Your Brain
Tom Stafford and Matt Webb
Speech Is Broadband Input to Your Head
Once your brain has decided to classify a sound as speech, it brings online a
raft of tricks to extract from it the maximum amount of information.

Speech isn’t just another set of noises. The brain treats it very differently from
ordinary sounds. Speech is predominantly processed on the left side of the brain, while
normal sounds are mostly processed on the right.

Note

This division is less pronounced in women, which may be part of why they tend to recover better
from strokes affecting their left-sided language areas.

Knowing you’re about to hear language prepares your brain to make lots of assumptions
specially tailored to extract useful information from the sound. It’s this special way of
processing language-classified sounds that allows our brains to make sense of speech that is
coming at us at a rate of up to 50 phonemes a second — a rate that can actually be produced
only using an artificially sped-up recording.

In Action

To hear just how much the expectation of speech influences the sounds you hear, listen
to the degraded sound demos created by Bob Shannon et al. at the House Ear Institute (http://www.hei.org/research/aip/audiodemos.htm).

In particular, listen to the MP3 demo that starts with a voice that has been degraded
beyond recognition and then repeated six times, each time increasing the quality (http://www.hei.org/research/aip/increase_channels.mp3).

You won’t be able to tell what the voice is saying until the third or fourth
repetition. Listen to the MP3 again. This time your brain knows what to hear, so the words
are clearer much earlier. However hard you try, you can’t go back to hearing
static.
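Demos like these are built with noise vocoding, the technique from Shannon and colleagues’ work: the recording is split into a handful of frequency bands, and each band’s slowly changing loudness envelope is used to modulate noise, so more bands means more intelligible speech. If you want to make a rough version of the effect yourself, here is a sketch in Python using numpy and scipy; the filename, band edges, and channel counts are illustrative guesses, not the settings behind the House Ear Institute recordings.

    # A rough sketch of noise-vocoded ("channel-degraded") speech of the kind used
    # in demos like these. The filename, band edges, channel counts, and filter
    # settings are illustrative guesses, not the values behind the actual recordings.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(speech, rate, n_channels=4, lo=100.0, hi=6000.0):
        """Keep only each frequency band's slow loudness envelope and use it
        to modulate band-limited noise, discarding the fine structure."""
        speech = speech.astype(float)
        if speech.ndim > 1:                            # stereo -> mono
            speech = speech[:, 0]
        hi = min(hi, 0.45 * rate)                      # stay below the Nyquist frequency
        noise = np.random.randn(len(speech))
        edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges (Hz)
        out = np.zeros_like(speech)
        for low, high in zip(edges[:-1], edges[1:]):
            sos = butter(4, [low, high], btype="band", fs=rate, output="sos")
            band = sosfiltfilt(sos, speech)
            envelope = np.abs(hilbert(band))           # loudness contour of this band
            carrier = sosfiltfilt(sos, noise)          # noise limited to the same band
            out += envelope * carrier
        return out / (np.max(np.abs(out)) + 1e-9)

    if __name__ == "__main__":
        rate, speech = wavfile.read("sentence.wav")    # any mono speech recording
        for n in (1, 2, 4, 8, 16):                     # more channels = more intelligible
            degraded = noise_vocode(speech, rate, n_channels=n)
            wavfile.write(f"vocoded_{n:02d}.wav", rate, (degraded * 32767).astype(np.int16))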

How It Works

Sentences are broken into words, which carry meaning and are organized by grammar, the system by
which we can build up an infinite number of complex sentences and subtle meanings from
only a finite pool of words.

Words can be broken down too, into morphemes, the smallest units of meaning. “-ing” is a
morpheme and makes the word “run” become “running.” It imparts meaning. There are further
rules at this level, about how morphemes combine into larger words.

Morphemes, too, can be broken down, into phonemes. Phonemes are the basic sounds a language
uses, so the word “run” has three: /r u n/. They don’t map cleanly onto the letters of the
alphabet; think of the phoneme at the beginning of “shine.” Phonemes are different from
syllables. So the word “running” is made up of two morphemes and has five phonemes, but just
two syllables (and seven letters, of course).
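To make the layering concrete, here is the same breakdown of “running” written out as a few lines of Python (the phoneme labels are informal stand-ins rather than proper IPA symbols):

    # The word "running" decomposed into the layers described above. The phoneme
    # labels are informal stand-ins, not proper IPA symbols.
    word = "running"

    morphemes = ["run", "-ing"]              # smallest units of meaning
    phonemes  = ["r", "u", "n", "i", "ng"]   # basic sounds; "ng" counts as one phoneme
    syllables = ["run", "ning"]

    assert len(morphemes) == 2
    assert len(phonemes) == 5
    assert len(syllables) == 2
    assert len(word) == 7                    # letters don't map cleanly onto phonemes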

Languages have different sets of phonemes; English has about 40–45. There are more
than 100 phonemes that the human mouth is capable of making, but as babies, when we start
learning language, we tune into the ones that we encounter and learn to ignore the
rest.

People speak at about 10–15 phonemes per second, or 20–30 if they’re speaking fast, and
that rate is easily understood by native speakers of the same language (with artificially
sped-up recordings, we can understand up to 50 phonemes per second). Speech this fast can’t
present each sound sequentially and independently; instead, the sounds end up on top of one
another. As you’re speaking one phoneme, your tongue and lips are already halfway to the
position required for the next one, anticipating it, so words sound different depending on
which words come before and after. That’s one of the reasons making good speech recognition
software is so hard.
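To get a feel for why “broadband” is a fair label, you can run some back-of-the-envelope arithmetic on these figures. Assuming roughly 40 phonemes, all equally likely (real speech is much more structured than that, so treat the numbers as crude), each phoneme carries about 5.3 bits, and the rates above translate as follows:

    import math

    # Back-of-envelope only: assume roughly 40 English phonemes, all equally
    # likely. Real speech is far more predictable, so these are crude figures.
    bits_per_phoneme = math.log2(40)         # about 5.3 bits

    for rate in (10, 15, 30, 50):            # phonemes per second, from the figures above
        print(f"{rate:2d} phonemes/s  ->  roughly {rate * bits_per_phoneme:.0f} bits/s")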

The other reason it’s so hard to write software that turns sound into words is that the layers of
phonemes, morphemes, and words are messy and influence one another. Listeners know to
expect certain sounds, certain sound patterns (morphemes), and even to expect what word is
coming next. The stream of auditory input is matched against all of that, and we’re able
to understand speech, even when phonemes (such as /ba/ and /pa/, which can also be identified
by looking at lip movements [Hear with Your Eyes: The McGurk Effect]) are very similar and
easily confused. The lack of clean abstraction between the layers, and the need to understand
the grammar and meaning of the sentence just to figure out what the phonemes are in the first
place, is what makes the job so hard for software.
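As a toy illustration of that matching process (a sketch of the idea only, not the brain’s algorithm or a real recognizer), you can think of it in Bayesian terms: weak, ambiguous acoustic evidence for /b/ versus /p/ is combined with an expectation about which word fits the sentence, and the expectation settles it.

    # Toy illustration only (not the brain's algorithm, and not a real recognizer):
    # ambiguous acoustic evidence for /b/ vs. /p/ is resolved by a word-level
    # expectation, combined Bayes-style.
    acoustic_likelihood = {"bark": 0.48, "park": 0.52}   # the sound alone is ambiguous

    # Hypothetical sentence context: "the dog began to ____"
    context_prior = {"bark": 0.95, "park": 0.05}

    posterior = {w: acoustic_likelihood[w] * context_prior[w] for w in acoustic_likelihood}
    total = sum(posterior.values())
    posterior = {w: round(p / total, 3) for w, p in posterior.items()}

    print(posterior)   # context pulls the ambiguous sound firmly toward "bark"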

It’s yet another example of how expectations influence perception, in a very fundamental way.
In the case of auditory information, knowing that a sound is actually speech causes the brain
to route it to a completely different region from the one in which general sound processing
takes place. Once the sound reaches the speech-processing region, you’re able to hear words in
it that you simply couldn’t hear when you thought you were listening to mere noise, even
though the sound itself is exactly the same.

To try this, play for a friend synthesized voices made out of overlapping sine-wave
sounds (http://www.biols.susx.ac.uk/home/Chris_Darwin/SWS). This site has a number of recorded sentences and, for each one, a generated,
artificial version of that sound pattern. It’s recognizable as a voice if you know what it
is, but not otherwise.

When you play the sine-wave speech MP3 (called SWS on the site) to your
friend, don’t tell her it’s a voice. She’ll just hear a beeping sound. Then let her hear
the original voice of the same sentence, and play the SWS again. With her new knowledge,
the sound is routed to speech recognition and will sound quite different. Knowing that the
sound is actually made out of words, and that those words are English (so they’re built from
guessable phonemes and morphemes), allows the whole recognition process to take place, where
before it couldn’t.
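Sine-wave speech is made by stripping a recording down to three or four pure tones that follow the frequency tracks of its formants, the main resonances of the vocal tract. Here is a minimal sketch of just the synthesis half, with invented frequency contours standing in for formant tracks measured from a real sentence:

    # Minimal sine-wave-speech-style synthesis: a few pure tones following smooth
    # frequency contours. The contours below are invented for illustration; real
    # sine-wave speech uses formant tracks measured from an actual recording, so
    # this sketch will sound like abstract whistles rather than a sentence.
    import numpy as np
    from scipy.io import wavfile

    rate = 16000
    t = np.arange(0, 1.5, 1.0 / rate)                  # 1.5 seconds of signal

    # Three invented "formant" contours (Hz), drifting slowly over time.
    tracks = [
        500 + 200 * np.sin(2 * np.pi * 1.3 * t),
        1500 + 400 * np.sin(2 * np.pi * 0.7 * t + 1.0),
        2500 + 300 * np.sin(2 * np.pi * 1.9 * t + 2.0),
    ]

    signal = np.zeros_like(t)
    for freq in tracks:
        phase = 2 * np.pi * np.cumsum(freq) / rate     # integrate frequency to get phase
        signal += np.sin(phase)

    signal /= np.max(np.abs(signal))
    wavfile.write("sws_sketch.wav", rate, (signal * 32767).astype(np.int16))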

See Also

Give Big-Sounding Words to Big Concepts
The sounds of words carry meaning too, as big words for big movements
demonstrate.

Steven Pinker, in his popular book on the nature of language, The Language Instinct, [1]
encounters the frob-twiddle-tweak continuum as a way of talking about adjusting settings on
computers or stereo equipment. The Jargon File, longtime glossary for hacker language, has the
following under frobnicate (http://www.catb.org/~esr/jargon/html/F/frobnicate.html):

Usage: frob, twiddle, and tweak sometimes connote points along a continuum. ‘Frob’
connotes aimless manipulation; twiddle connotes gross manipulation, often a coarse search
for a proper setting; tweak connotes fine-tuning. If someone is turning a knob on an
oscilloscope, then if he’s carefully adjusting it, he is probably tweaking it; if he is
just turning it but looking at the screen, he is probably twiddling it; but if he’s just
doing it because turning a knob is fun, he’s frobbing it. [2]

Why frob first? Frobbing is a coarse action, so it has to go with a big lump of a word.
Twiddle is smaller, more delicate. And tweak, the finest adjustment of all, feels like a
tiny word. It’s as if the actual sound of the word, as it’s spoken, carries meaning
too.

In Action

The two shapes in Figure 4-3 are a maluma and a takete. Take a look. Which is which?

Note

Don’t spoil the experiment for yourself by reading the next paragraph! When you try
this out on others, you may want to cover up all but the figure itself.

Figure 4-3. One of these is a “maluma,” the other a “takete” — which is which?

If you’re like most people who have looked at shapes like these since the late 1920s,
when Wolfgang Köhler devised the experiment, you said that the shape on the left is a
“takete,” and the one on the right is a “maluma.” Just like “frob” and “tweak,” in which
the words relate to the movements, “takete” has a spiky character and “maluma” feels
round.

How It Works

Words are multilayered in meaning, not just indices to some kind of meaning dictionary
in our brains. Given the speed of speech, we need as many clues to meaning as we can get,
to make understanding faster. Words that are just arbitrary noises would be wasteful.
Clues to the meaning of speech can be packed into the intonation of a word, what other
words are nearby, and the sound itself.

Brains are association machines, and communication makes full use of that fact to
impart meaning.

In Figure 4-3, the more rounded shape is associated with big, full objects, objects that tend
to have big resonant cavities, like drums, that make booming sounds if you hit them. Your
mouth is big and hollow, resonant, to say the word “maluma.” It rolls around your mouth.

On the other hand, a spiky shape is more like a snare drum or a crystal. It clatters and
clicks. The corresponding sound is full of what are called plosives, sounds like t- and k-
that involve popping air out.

That’s the association engine of the brain in action. The same goes for “frob” and
“tweak.” The movement your mouth and tongue go through to say “frob” is broad and coarse
like the frobbing action it communicates. You put your tongue along the base of your mouth
and make a large cavity to make a big sound. To say “tweak” doesn’t just remind you of
finely controlled movement, it really entails more finely controlled movement of the
tongue and lips. Making the higher-pitched noise means making a smaller cavity in your
mouth by pushing your tongue up, and the sound at the end is a delicate movement.

Test this by saying “frob,” “twiddle,” and “tweak” first thing in the morning, when
you’re barely awake. Your muscle control isn’t as good as it usually is when you’re still
half-asleep, so while you can say “frob” easily, saying “tweak” is pretty hard. It comes
out more like “twur.” If you’re too impatient to wait until the morning, just imagine it
is first thing in the morning: as you stretch, say the words to yourself with a yawn in your
voice. The difference is clear; frobbing works while you’re yawning, tweaking
doesn’t.

Aside from denser meaning, these correlations between motor control (either moving
your hands to control the stereo or saying the word) and the word itself may give some
clues to what language was like before it was really language. Protolanguage, the system
of communication before any kind of syntax or grammar, may have relied on these metaphors
to impart meaning. [3] For humans now, language includes a sophisticated learning system in
which, as children, we figure out what words mean what, but there are still throwbacks to the
earlier time: onomatopoeic words are ones that sound like what they
mean, like “boom” or “moo.” “Frob” and “tweak” may be similar to that, only drawing in
bigness or roundness from the visual (for shapes) or motor (for mucking around with the
stereo) parts of the brain.

In Real Life

Given the relationship between the sound of a word (due to its component phonemes) and its
feel (some kind of shared subjective experience), sound symbolism is one of the techniques
used in branding. Naming consultants take into account the maluma and takete aspect of word
meaning, not just dictionary meaning, and come up with names for products and companies on
demand (for a price, of course). One of the factors that influenced the naming of the
BlackBerry wireless email device was the b- sound at the beginning. According to the namers,
it connotes reliability. [4]

End Notes
  1. Pinker, S. (1994). The Language Instinct: The New Science of Language and Mind. London: Penguin Books Ltd.
  2. The online hacker Jargon File, Version 4.1.0, July 2004 (http://www.catb.org/~esr/jargon/index.html).
  3. This phenomenon is called phonetic symbolism, or phonesthesia. Some people have the perception of color when reading words or numbers, experiencing a phenomenon called synaesthesia. Ramachandran and Hubbard suggest that synaesthesia is how language started in the first place. See: Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia: A window into perception, thought and language. Journal of Consciousness Studies, 8(12), 3–34. This can also be found online at http://psy.ucsd.edu/chip/pdf/Synaesthesia%20-%20JCS.pdf.
  4. Begley, Sharon. “Blackberry and Sound Symbolism” (http://www.stanford.edu/class/linguist34/Unit_08/blackberry.htm), reprinted from the Wall Street Journal, August 26, 2002.