The World Until Yesterday: What Can We Learn from Traditional Societies?
Author: Jared Diamond
Why are languages vanishing at such a catastrophic rate? Does it really matter? Is our current plethora of languages good or bad for the world as a whole, and for all those traditional societies still speaking languages now at risk of vanishing? Many of you readers may presently disagree with what I just said, about language loss being a tragedy. Perhaps you instead think that diverse languages promote civil war and impede education, that the world would be better off with far fewer languages, and that high language diversity is one of those features of the world of yesterday that we should be glad to be rid of—like chronic tribal warfare, infanticide, abandonment of the elderly, and frequent starvation.
For each of us as individuals, does it do us good or harm to learn multiple languages? It certainly takes much time and effort to learn a language and become fluent in it; would we be better off devoting all that time and effort to learning more obviously useful skills? I think that the answers emerging to these questions about the value of traditional multilingualism, both to societies and to individuals, will intrigue you readers, as they intrigued me. Will this chapter convince you to bring up your next child to be bilingual, or will it instead convince you that the whole world should switch to speaking English as quickly as possible?
Before we can tackle those big questions, let’s start with some preamble about how many languages still exist today, how they developed, and where in the world they are spoken. The known number of distinct languages still spoken or recently spoken in the modern world is around 7,000. That huge total may astonish many readers, because most of us could name only a few dozen languages, and the vast majority of languages are unfamiliar to us. Most languages are unwritten, spoken by few people, and spoken far from the industrial world. For example, all of Europe west of Russia has fewer than 100 native languages, but the African continent and the Indian subcontinent have over 1,000 native languages each, the African countries of Nigeria and Cameroon 527 and 286 languages respectively, and the small Pacific island nation of Vanuatu (area less than 5,000 square miles) 110 languages. The world’s highest language diversity is on the island of New Guinea, with about 1,000 languages and an unknown but apparently large number of distinct language families crammed into an area only slightly larger than Texas.
Of those 7,000 languages, 9 “giants,” each the primary language of 100 million or more people, account for over one-third of the world’s population. In undoubted first place is Mandarin, the primary language of at least 700 million Chinese, followed by Spanish, English, Arabic, Hindi, Bengali, Portuguese, Russian, and Japanese in approximately that sequence. If we relax our definition of “big languages” to mean the top 70 languages—i.e., the top 1% of all languages—then we have encompassed the primary languages of almost 80% of the world’s people.
But most of the world’s languages are “little” languages with few speakers. If we divide the world’s nearly 7 billion people by 7,000 languages, we obtain 1 million people as the average number of speakers of a language. Because that average is distorted by the 100-million-plus speakers of just 9 giant languages, a better measure of a “typical” language is to talk about the “median” number of speakers—i.e., a language such that half of the world’s languages have more speakers, and the other half have fewer speakers. That median number is only a few thousand speakers. Hence half of the world’s languages have under a few thousand speakers, and lots of them have between only 60 and 200 speakers.
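For readers who like to check such arithmetic, here is a minimal sketch of the difference between the average and the median. The world totals are the round figures quoted above; the ten-language "toy world" is invented purely for illustration and is not real census data.

```python
from statistics import mean, median

# Round figures quoted in the text: about 7 billion people, 7,000 languages.
WORLD_POPULATION = 7_000_000_000
LANGUAGE_COUNT = 7_000

# Average number of speakers per language.
average_speakers = WORLD_POPULATION / LANGUAGE_COUNT

# Hypothetical toy world (invented numbers, NOT real data):
# one giant language plus nine little ones.
toy_speakers = [100_000_000, 50_000, 8_000, 5_000, 3_000,
                2_000, 900, 200, 100, 60]

toy_mean = mean(toy_speakers)      # dragged upward by the single giant
toy_median = median(toy_speakers)  # half the languages fall below this value

print(f"World average: {average_speakers:,.0f} speakers per language")
print(f"Toy mean:      {toy_mean:,.0f}")
print(f"Toy median:    {toy_median:,.0f}")
```

In the toy world a single giant language lifts the mean into the millions, while the median stays at a few thousand speakers, which is the same pattern described above for the real world.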
But such discussions of numbers of languages, and numbers of language speakers, force us to confront the question that I anticipated in describing my New Guinea campfire language poll at the beginning of this chapter. What’s the difference between a distinct language and a mere dialect of another language? Speech differences between neighboring populations intergrade completely; neighbors may understand 100%, or 92%, or 75%, or 42%, or nothing at all of what each other says. The cut-off between language and dialect is often arbitrarily taken at 70% mutual intelligibility: if neighboring populations with different ways of speaking can understand over 70% of each other’s speech, then (by that definition) they’re considered just to speak different dialects of the same language, while they are considered as speaking different languages if they understand less than 70%.
But even that simple, arbitrary, strictly linguistic definition of dialects and languages may encounter ambiguities when we try to apply it in practice. One practical difficulty is posed by dialect chains: in a string of neighboring villages ABCDEFGH, each village may understand both villages on either side, but villages A and H at opposite ends of the chain may not be able to understand each other at all. Another difficulty is that some pairs of speech communities are asymmetrical in their intelligibility: A can understand most of what B says, but B has difficulty understanding A. For instance, my Portuguese-speaking friends tell me that they can understand Spanish-speakers well, but my Spanish-speaking friends have more difficulty understanding Portuguese.
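The dialect-chain difficulty can be made concrete with a small sketch. The 70% cut-off is the one described above; everything else is an assumption made up for illustration, in particular the rule that intelligibility drops by 16 percentage points for each village of separation, and the choice to treat exactly 70% as "different languages."

```python
VILLAGES = "ABCDEFGH"
CUTOFF_PERCENT = 70          # the arbitrary cut-off described in the text
DROP_PER_STEP = 16           # hypothetical: invented for this illustration

def intelligibility(a: str, b: str) -> int:
    """Hypothetical percent of speech understood between two villages."""
    distance = abs(VILLAGES.index(a) - VILLAGES.index(b))
    return max(0, 100 - DROP_PER_STEP * distance)

def same_language(a: str, b: str) -> bool:
    """Apply the cut-off: over 70% counts as dialects of one language."""
    return intelligibility(a, b) > CUTOFF_PERCENT

# Every pair of neighbors passes the test...
neighbors_ok = all(same_language(a, b) for a, b in zip(VILLAGES, VILLAGES[1:]))

# ...but the two ends of the chain do not.
ends_ok = same_language("A", "H")

print("All neighbors share a language:", neighbors_ok)
print("A and H share a language:", ends_ok)
print("A-H intelligibility:", intelligibility("A", "H"), "%")
```

Under these assumptions every village speaks "the same language" as its neighbors, yet A and H speak "different languages," so the cut-off alone cannot tell us how many languages the chain contains.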
Those are two types of problems in drawing a line between dialects and languages on strictly linguistic grounds. A bigger problem is that languages are defined as separate not just by linguistic differences, but also by political and self-defined ethnic differences. This fact is expressed in a joke that one often hears among linguists: “A language is a dialect backed up by its own army and navy.” For instance, Spanish and Italian might not pass the 70% test for being ranked as different languages rather than mere dialects: my Spanish and Italian friends tell me that they can understand most of what each other says, especially after a little practice. But, regardless of what a linguist applying this 70% test might say, every Spaniard and Italian, and everybody else, will unhesitatingly proclaim Spanish and Italian to be different languages—because they have had their own armies and navies, plus largely separate governments and school systems, for over a thousand years.
Conversely, many European languages have strongly differentiated regional forms that the governments of their country emphatically consider mere dialects, even though speakers from the different regions can’t understand each other at all. My north German friends can’t make heads or tails of the talk of rural Bavarians, and my north Italian friends are equally at a loss in Sicily. But their national governments are adamant that those different regions should not have separate armies and navies, and so their speech forms are labeled as dialects and don’t you dare mention a criterion of mutual intelligibility.
Those regional differences within European countries were even greater 60 years ago, before television and internal migration began breaking down long-established “dialect” differences. For example, on my first visit to Britain in the year 1950, my parents took my sister Susan and me to visit family friends called the Grantham-Hills in their home in the small town of Beccles in East Anglia. While my parents and their friends were talking, my sister and I became bored with the adult conversation and went outside to walk around the charming old town center. After turning at several right angles that we neglected to count, we realized that we were lost, and we asked a man on the street for directions back to our friends’ house. It became obvious that the man didn’t understand our American accents, even when we spoke slowly and (we thought) distinctly. But he did recognize that we were children and lost, and he perked up when we repeated the words “Grantham-Hill, Grantham-Hill.” He responded with many sentences of directions, of which Susan and I couldn’t decipher a single word; we wouldn’t have guessed that he considered himself to be speaking English. Fortunately for us, he pointed in one direction, and we set off that way until we recognized a building near the Grantham-Hills’ house. Those former local “dialects” of Beccles and other English districts have been undergoing homogenization and shifts towards BBC English, as access to television has become universal in Britain in recent decades.
By a strictly linguistic definition of 70% intelligibility—the definition that one has to use in New Guinea, where no tribe has its own army or navy—quite a few Italian “dialects” would rate as languages. That redefinition of some Italian dialects as languages would close the gap in linguistic diversity between Italy and New Guinea slightly, but not by much. If the average number of speakers of an Italian “dialect” had equaled the 4,000 speakers of an average New Guinea language, Italy would have 10,000 languages. Aficionados of the separateness of Italian dialects might credit Italy with dozens of languages, but no one would claim there to be 10,000 different languages in Italy. It really is true that New Guinea is linguistically far more diverse than is Italy.
How did the world end up with 7,000 languages, instead of our all sharing the same language? Already for tens of thousands of years before language spread by the Internet and Facebook, there has been ample opportunity for language differences to disappear, because most traditional peoples have had contact with neighboring peoples, with whom they intermarry and trade, and from whom they borrow words and ideas and behaviors. Something must have caused languages, even in the past and under traditional conditions, to diverge and to remain separate, in the face of all that contact.
Here’s how it happens. Any of us over the age of 40 has observed that languages change even over the course of just a few decades, with some words dropping out of use, new words being coined, and pronunciation shifting. For instance, whenever I revisit Germany, where I lived in 1961, young Germans notice that they have to explain to me some new German words (e.g., the new word Händi for cell phones, which didn’t exist in 1961), and that I still use some old-fashioned German words that have been going out of use since 1961 (e.g., jener/jene for “that/those”). But young Germans and I can still mostly understand each other well. Similarly, you American readers under the age of 40 may not recognize some formerly popular English words like “ballyhoo,” but in compensation you daily use the verb “to Google” and the participle “Googling,” which didn’t exist in my childhood.
After a few centuries of such independent changes in two geographically separate speech communities derived from the same original speech community, the communities develop dialects that may pose difficulties for each other to understand: e.g., the modest differences between American and British English, the bigger differences between the French of Quebec and of metropolitan France, and the still bigger differences between Afrikaans and Dutch. After 2,000 years of divergence, the speech communities have diverged so much as to be no longer mutually intelligible, although to linguists they are still obviously related—such as the French and Spanish and Romanian languages derived from Latin, or the English and German and other Germanic languages derived from proto-Germanic. Finally, after about 10,000 years, the differences are so great that most linguists would assign the languages to unrelated language families without any detectable relationships.
Thus, languages evolve differences because different groups of people independently develop different words and different pronunciations over the course of time. But the question remains why those diverged languages don’t merge again when formerly separated people spread out and re-contact each other at speech boundaries. For instance, at the modern boundary between Germany and Poland, there are Polish villages near German villages, but the villagers still speak a local variety of either German or of Polish, rather than a German-Polish mish-mash. Why is that so?
Probably the main disadvantage of speaking a mish-mash involves a basic function of human language: as soon as you start to speak to someone else, your language serves as an instantly recognizable badge of your group identity. It’s much easier for wartime spies to don the enemy’s uniform than to imitate convincingly the enemy’s language and pronunciation. People who speak your language are your people: they’ll recognize you as a compatriot, and they’ll support you or at least not be immediately suspicious of you, whereas someone speaking a different language is apt to be regarded as a potentially dangerous stranger. That instant distinction between friends and strangers still operates today: just see how you (my American readers) react the next time that you’re in Uzbekistan, and you finally to your relief hear someone behind you speaking English with an American accent. The distinction between friends and strangers was even more important in the past (Chapter 1), often a matter of life and death. It’s important to speak the language of at least some community, so that there will be some group that considers you as “our own.” If you instead speak a mish-mash near a speech boundary, both groups may understand much of what you say, but neither group will consider you “one of our own,” and you can’t count on either group to welcome and protect you. That may be why the world’s speech communities have tended to remain thousands of separate languages, instead of the whole world speaking one language or forming one dialect chain.
Languages are distributed unevenly around the world: about 10% of the world’s area contains half of its languages. For instance, at the low-end extreme of language diversity, the world’s three largest countries (Russia, Canada, and China, each with an area of millions of square miles) have only about 100, 80, and 300 native languages respectively. But at the high-end extreme of language diversity, New Guinea and Vanuatu, with areas of only 300,000 and 4,700 square miles respectively, have about 1,000 and 110 native languages. That means that one language is spoken over an average area of about 66,000, 49,000, and 12,000 square miles in Russia, Canada, and China respectively, but only over 300 and 42 square miles respectively in New Guinea and Vanuatu. Why is there such enormous geographic variation in language diversity?
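Those area-per-language figures are simple divisions, sketched below. The language counts and the areas of New Guinea and Vanuatu are the ones quoted above; the areas used for Russia, Canada, and China are my own rounded assumptions (the text says only "millions of square miles"), chosen as approximate published land areas.

```python
# (area in square miles, number of native languages)
# Areas for Russia, Canada, and China are rounded ASSUMPTIONS, not from the text.
regions = {
    "Russia":     (6_600_000, 100),
    "Canada":     (3_900_000, 80),
    "China":      (3_700_000, 300),
    "New Guinea": (300_000, 1_000),   # area as quoted in the text
    "Vanuatu":    (4_700, 110),       # area as quoted in the text
}

# Average area over which one language is spoken.
area_per_language = {
    name: area / languages for name, (area, languages) in regions.items()
}

for name, square_miles in area_per_language.items():
    print(f"{name:<11} {square_miles:>10,.0f} square miles per language")

# How many times more densely packed are Vanuatu's languages than Russia's?
ratio = area_per_language["Russia"] / area_per_language["Vanuatu"]
print(f"Russia / Vanuatu ratio: about {ratio:,.0f} to 1")
```

With these inputs the results agree, to within rounding, with the figures given above, and the last line shows that a language in Russia covers on the order of a thousand times more ground than one in Vanuatu.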