Arrival of the Fittest: Solving Evolution's Greatest Puzzle (14 page)

Authors: Andreas Wagner

A hypercube is also well suited to accommodate the thousands of neighbors near each of the library’s texts. In a simple universe of three reactions, each of the library’s texts—the corner of a cube—has three adjacent corners as its neighbors. Take one of these texts, such as the string 100 in figure 8c, and you reach its neighbors via the edges leading from 100 to the adjacent corners. We get to them either by adding the third reaction to 100, which yields 101, or by adding the second reaction (110), or by eliminating the first reaction (000). All three neighbors—101, 110, and 000—differ from 100 in exactly one character. And what holds for one corner of the cube holds for any other corner: It has three neighbors. Likewise, in a 5,000-dimensional cube, each and every metabolism has as many neighbors as there are dimensions, five thousand in all. You can walk from each metabolic text in five thousand different directions, to find one of its five thousand neighbors in a single step. Each of these neighbors differs from the text in exactly one reaction. Either the neighbor has an additional reaction—in this case one entry of the string changes from 0 to 1—or it has one fewer reaction—one entry changes from 1 to 0.
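The neighbor relation described above can be sketched in a few lines of Python. This is only an illustration: the function name `neighbors` is hypothetical, and writing a metabolism as a string of 0s and 1s simply follows the figure's notation rather than any published code.

```python
def neighbors(text: str) -> list[str]:
    """Return every string that differs from `text` in exactly one character."""
    flip = {"0": "1", "1": "0"}
    # Flip each position in turn: 0 -> 1 adds a reaction, 1 -> 0 removes one.
    return [text[:i] + flip[ch] + text[i + 1:] for i, ch in enumerate(text)]


print(neighbors("100"))               # ['000', '110', '101']
print(len(neighbors("0" * 5000)))     # 5000
```

The same function works in any number of dimensions, which is why a text of five thousand entries has exactly five thousand neighbors.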

Evolving organisms are like visitors to the metabolic library. Gene deletions and gene transfer allow them to walk through the library, to step from one metabolic text to another, often an immediate neighbor. All of a text’s neighbors form a neighborhood in this library, and such neighborhoods are as important for evolution as a city neighborhood is for people’s lives. City neighborhoods are useful because of proximity—everything is reachable within a few easy steps—and neighborhoods in the metabolic library are important for the same reason. Evolution can reach them in a few small steps, minor edits in a genotype. But residents of a city’s neighborhoods can walk in only four cardinal directions—north, south, east, or west—whereas evolution can head in five thousand directions. (Don’t even bother trying to visualize that.) And therefore the neighborhood of a metabolic text may be vastly more interesting, surprising, and diverse. This diversity will be crucial to understanding innovability, as we shall see shortly.

Over time, as alterations in an organism’s genotypic text accumulate, it walks farther and farther, to more and more distant shelves in the library. To gauge how far, we must be able to measure distance. Without that ability, we would be lost, and the library would become a useless maze of stacks—we could not find our way from one shelf to another.³⁸ Fortunately, the distance D that I had used to study the diversity of known metabolic texts does the job. It tells us how far apart in the library two metabolic texts are, and it already told us that some viable texts are very distant indeed. The next insight it provides is the real bombshell, though: We can travel enormous distances through the library and encounter very different stories with the same moral, everywhere.

One day we may know millions of metabolic texts, but even that number would be a tiny fraction of the hyperastronomical metabolic library, less than a few specks of dust in the universe, because the library contains many more metabolisms than the number of organisms that have existed on earth since life began. Even after 3.8 billion years of evolution, life has explored only a tiny fraction of the library.

For all of those billions of years, nature did not need to know what was around the next corner of the library for evolution to proceed. But if we humans want to understand the library, rather than simply live in it, we need to have some way to grasp where new and meaningful texts are. And we need a catalog that classifies texts, like the Dewey Decimal System, or the Library of Congress Classification, grouping books according to subject categories—Art History, Economics, Linguistics—with smaller subcategories such as Romance, Germanic, Slavic languages nested within them. Metabolic phenotypes, the possible meanings of a metabolic text, are the natural subject categories of this library. Their number is larger than those in a library of books, but that’s simply because the library itself is so vast.

A catalog is like a map for this library—it is a genotype-phenotype map that tells us where to find the genotypes with any one phenotype. Without this map, we do not know whether texts with the same subject are scattered or grouped—as they would be in a human library—whether the same shelf houses texts on different subjects, and so on. And because no librarian is in sight, we need to create this map ourselves, roam the library and explore it, like the ancient voyagers who mapped the earth and its continents on their journeys. The library’s huge size will prevent us from mapping every single text, but we can draw the contours of the continents, mountain ranges, rivers, lakes, and deserts, and hope that we can grasp the shape of the whole from their hazy outlines.

But where to start, and how to travel?

Here is a puzzle that will point the way. Take a metabolism with any one phenotype, such as viability on glucose, and ask, What if only one text in our library of more than 10¹⁵⁰⁰ metabolisms expressed its meaning? As many as five nonillion (5 × 10³⁰) bacteria exist on earth today. This number is vast, a 1 with more than 30 zeroes. But even if each of these bacteria had tried a new enzyme combination every second since life began almost four billion years ago, they would have tried only about 10⁴⁸ such combinations.³⁹ Their chances of having found the one and only working combination would be vanishingly small, smaller than one in 10¹⁴⁵⁰. This number—so small as to be effectively meaningless—means that it would be utterly impossible to find this text through a blind search.
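The arithmetic behind these figures can be checked with a short back-of-the-envelope script. It is a rough sketch that uses only the numbers quoted in the text; the variable names are made up for illustration, and the year length of 365.25 days is an assumption.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed year length, about 3.16e7 seconds
bacteria = 5e30                         # five nonillion cells, as quoted in the text
years = 4e9                             # "almost four billion years"
log10_library = 1500                    # library of more than 10^1500 texts

# One trial per bacterium per second over the whole history of life.
trials = bacteria * years * SECONDS_PER_YEAR
log10_trials = math.log10(trials)

# Chance that any of those trials hit one specific text, in powers of ten.
log10_odds = log10_trials - log10_library

print(f"trials ~ 10^{log10_trials:.1f}")
print(f"chance of a hit ~ 10^{log10_odds:.1f}")
```

The exponent for the number of trials comes out just under 48, and the exponent for the odds falls below -1450, in line with the rounded figures in the passage.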

On the one hand, the odds against finding just one useful metabolism are vast. On the other hand, life’s diversity shows that evolution had no problem finding it. This means that our premise must be wrong: There has to be more than one metabolism—perhaps even many—that solves the problem of surviving on glucose.

To find them, let’s do what evolution does: journey through the library and edit genomes—through a series of gene transfer or deletion events that add or eliminate at least one gene, enzyme, or reaction. The starting point for such a journey isn’t terribly important. It could be any text in the library, any text that encodes a metabolism viable on glucose or on any other fuel.

So let’s start with a metabolism viable on glucose, and either delete a randomly chosen reaction or add a randomly chosen reaction from the known reaction universe. Nature would make a simple and brutal evaluation of the new text: life or death. But we scientist travelers are privileged, because we can retrace our steps. We compute the meaning of the altered text, and, if it turns out not to be viable on glucose, return to the starting text, and add or delete another random reaction—remember, there are five thousand ways of doing that. But if the neighbor is viable on glucose, the journey continues. We add or delete a second reaction, compute the phenotype, and repeat, more or less ad infinitum.

In other words, step from a starting text to its neighbor, to the neighbor’s neighbor, to the neighbor’s neighbor’s neighbor, and see how far you could walk without ever changing its chemical meaning, viability on glucose. Because each step alters a text at random, this walk is a random walk through the metabolic library, similar to how a drunkard might stagger home from a night out at the bar, with one difference: Each step in our random walk must encounter a text with the same meaning, the same phenotype.
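A phenotype-preserving random walk of this kind can be sketched as follows. This is a toy under loudly stated assumptions, not the computation used in the research: the rule in `is_viable` is an invented stand-in for the real phenotype calculation, the universe has 60 reactions instead of five thousand, and `distance` simply counts differing entries, which may not match the author's distance D.

```python
import random

N_REACTIONS = 60  # toy universe; the text's universe has about 5,000 reactions

# Hypothetical viability rule: a text is "viable" if it keeps at least one
# reaction from each group of six. The real phenotype computation differs.
GROUPS = [range(i, i + 6) for i in range(0, N_REACTIONS, 6)]


def is_viable(text: list[int]) -> bool:
    return all(any(text[i] for i in group) for group in GROUPS)


def random_walk(start: list[int], steps: int, rng: random.Random) -> list[int]:
    """Flip one random reaction per step; undo the flip if viability is lost."""
    current = list(start)
    for _ in range(steps):
        i = rng.randrange(len(current))
        current[i] ^= 1          # add or delete one reaction
        if not is_viable(current):
            current[i] ^= 1      # retrace the step and try again next round
    return current


def distance(a: list[int], b: list[int]) -> int:
    """Number of reactions in which two texts differ (assumed measure)."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)
start = [1] * N_REACTIONS
end = random_walk(start, 2000, rng)
print(is_viable(end), distance(start, end))
```

Because every rejected step is undone, the walker never leaves the set of viable texts, so the final distance from the start shows how far one can roam without changing the phenotype under this toy rule.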

If there were only one metabolism viable on glucose, this random walk would lead literally nowhere, because the starting text would have no viable neighbors. We would be rooted to the spot. The same would be true if there were a few such texts scattered widely through the library—we could not reach them without destroying viability on the way. And even if they were close together the random walk might not lead far. A few neighbors of the starting text might be viable, but their neighbors might not be.

Only if many such texts existed could we roam the library. But in that case we would face a different problem altogether: computing power. To compute one text’s meaning is a breeze, but what if this random walk had thousands of steps, and each could lead in thousands of different directions? This is the sort of problem that could take an off-the-shelf desktop computer years or decades to solve. An entire network of computers—a computing cluster—is required to speed up that computation. And that costs money.

While I was slowly advancing from a Ph.D. student to a postdoctoral researcher, and eventually to a tenured professor at a U.S. research university, funding for the kind of basic research that addresses the problem of evolutionary innovation began drying up. This drought combined with the ailing health of my European family, so when a job offer arrived from Switzerland, I was ready to take a leap across the Atlantic, back to my European roots.

I knew that Switzerland was a world leader in science, enormously productive, and technologically sophisticated.⁴⁰ Its world-class system of public education, generous support for academic research, and attractive living conditions are behind this success. I would be sad to leave many dear academic colleagues behind, but the opportunity to join the Swiss scientific community was a privilege both humbling and enticing. Most important, the offer was good enough to finance not only a computing cluster but also a state-of-the-art experimental laboratory. Even better, it would allow me to recruit multiple like-minded researchers from all over the world. It was an offer I could not refuse.

On a crisp fall day in 2006 at the University of Zürich, I was sitting in my newly furnished office, inside an austerely elegant building whose simple geometric contours are drawn in a gleaming blend of glass and metal, when a young Portuguese man walked in. Handsome, soft-spoken, with curious deep brown eyes and a quick smile, he introduced himself as João Rodrigues.

João had studied physics, but he had heard that there were many exciting problems waiting to be solved in biology. He was looking for a new challenge, a difficult problem to crack that would get him a Ph.D. He did not know much biology at the time, but he had assets that many biologists lacked: He was good at mathematics, knew how to program computers, and had already performed large and complicated computations. When I first saw his résumé, I could hardly contain my excitement. João had exactly the skills needed to navigate the vast metabolic library. During his job interview, I shared my passion about learning how nature creates. Fortunately for me, we connected. His eyes lit up. He signed on.

João’s background is typical for researchers in my lab. They hail from a dozen different countries in the Americas, Europe, Asia, and Australia, and from many disciplines, including biology, chemistry, physics, and mathematics. This is not a coincidence, because the problems we tackle require new skill combinations, so much so that I like to compare our work to that of evolution: Studying innovations, like creating them, benefits enormously from novel combinations—not of enzymatic but of intellectual skills.

I soon became impressed with João’s computing wizardry, even though I remained worried that the cluster of more than one hundred computers we had built would still be too slow, that we would never leave the first shelf of the library. But João tricked the machines into working faster, accelerated their computations many times, and eventually launched us far into the library’s vast stacks.

João’s exploration started with a single well-studied metabolism, that of the bacterium E. coli and its viability on glucose—the ability to synthesize all its sixty-odd essential biomass molecules from this single sugar.⁴¹ To find out whether only one metabolism with this ability existed, João first created more than a thousand of E. coli’s neighbors, each of them a metabolism that differs in a single chemical reaction from E. coli. If E. coli’s metabolism is an instruction manual to make all essential biomass molecules, then these neighbors are minor variations on the manual. The first question: Do any of them contain sufficient information to produce all sixty biomass building blocks from glucose?
