Arrival of the Fittest: Solving Evolution's Greatest Puzzle (24 page)


Evolution explores this circuit library through the familiar crowd of randomly browsing readers, populations of organisms in which circuits get modified through occasional DNA copying errors that arise as genes get passed from parents to children with some corrupted letters. Any one such mutation can have two kinds of effects. It can deform a regulator’s shape and prevent it from recognizing DNA. Or it can alter one of the DNA “words” a regulator recognizes, either clipping a single wire of the circuit—actually disrupting the regulator’s effect on a gene—or creating a new wire, a new molecular word recognizable by some regulator.

The first kind of change often results in disaster, because each regulator affects so many other genes. Destroying a regulator’s ability to recognize DNA is akin to scrambling a complex recipe’s ingredients and thus destroying the entire dish. It can lead to organisms with terrible malformations or to embryos that die before they are born. The second sort of copying error, however, is more like a typo in a recipe. By changing the activity of merely one gene and the amount of protein it expresses—one among thousands of protein ingredients—it is less likely to cause serious damage. One might think that changes of this second kind are more tolerable and could thus steadily accumulate on evolutionary time scales. If so, they could slowly transform a circuit’s wiring diagram.

When one compares circuits that have evolved separately over millions of years, like those in some of the more than a thousand different fruit fly species, one finds indeed that most tolerable changes have occurred in the wires and not in the circuit genes themselves. Evolution alters most circuits one wire at a time, because messing with the circuit genes invites disaster. What is more, these small wiring changes indeed accumulate to transform circuits, and this process is far from slow.[47] The reason is that a regulator’s DNA keyword can be as short as five letters and occur thousands of letters away from a gene. By chance alone, random mutations can easily create new keywords and thus new wires in a circuit.[48]

 

If only one and no other circuit in the hyperastronomical library of 10^700 texts expressed the code to create a specific innovation, evolution might as well pack up and go home, because this code would be a needle in a haystack many times the size of the universe.[49] The question of why it had not thrown in the towel had called to me since the early 1990s, but I had ignored it—too many other projects. My procrastination ended in 2004 when I spent a research sabbatical at the Institute for Higher Studies near Paris in France.[50]

Set in a bucolic park with plenty of old trees, sculpted shrubbery, overflowing flowerbeds, and footpaths to explore while pondering life’s questions, the institute is a monastic refuge from the endless fund-raising, networking, and community service of an academic’s life. Its few resident researchers are highly decorated scientists, among them several recipients of the Fields Medal, widely known as the Nobel Prize for mathematicians. The institute focuses on mathematics and physics, but its leaders were aware that a seed long dormant in molecular biology, the insight that the whole is more than the sum of its parts, had germinated and flowered into an enormous subdiscipline known as systems biology. This emerging research field joins experimental data with mathematics and computation to find out how molecular parts like a fly’s regulators cooperate to shape whole biological systems, that is, organisms.[51] Mathematicians and physicists have many tools to crack problems like this, and so the institute invited biologists like me for extended visits to see what we could do together.

Luckily for me, I accepted the invitation. Because it was in Paris that I met Olivier Martin.

Olivier is an internationally respected professor of statistical physics at the University of Orsay near Paris. Statistical physicists like him deal with huge collections of things like the molecules in a pressurized container of propane gas, and how they create properties like the pressure of that gas. To predict this pressure is important—we don’t want our gas tanks to explode—but also impossibly complex, because trillions of molecules bounce into the container walls every instant. Statistical physicists love to think about wholes with trillions of parts—too many to track individually—and they develop clever ways to describe those wholes, employing sophisticated statistical methods that share little beyond the name with the statistics that pollsters use to predict the outcome of elections in the United States.[52]

Olivier had a problem, though. Statistical physics is like a buffet where a hungry mob has devoured the choice dishes and left only small morsels behind: Most of its big questions are answered, and the remaining ones are either too hard or too trifling—not altogether surprising since scientists like James Clerk Maxwell and Ludwig Boltzmann had solved thermodynamical problems with statistical methods since the nineteenth century. Like most scientists in his situation, Olivier wanted to make a bigger contribution than physics would allow him. His problem was to find a question in systems biology that was new and challenging enough for him to chip away at.

I had a library of 10^700 regulation circuits to map. Boy, could I help Olivier out.

As Olivier Martin and I began to collaborate, I first came to appreciate him as a scientist whose intuition and technical skills prevented us from getting lost in the library. But he turned out to be much more than a surefooted travel companion. He was a kind and generous teacher who would patiently explain how the tools of his trade could help us find our way.[53]

We started with small steps whose purpose was to answer a question you will recognize. Is there only one text in the circuit library that expresses any one meaning? To find out, we started out with a single circuit in the library and computed its expression code. Then we changed one wire, asked whether this mutation altered this expression phenotype, went back to our starting circuit, changed another wire, and so on, until we had created all neighbors of our circuit and knew their phenotypes. And to make sure that the neighborhood of this one circuit was not unusual, we explored the neighborhoods of many different starting circuits, circuits with different numbers of genes, different numbers of wires, a different arrangement of wires, and different phenotypes.
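The neighborhood survey described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used in the original study: a circuit is taken to be a small matrix w whose entry w[i][j] is +1, -1, or 0 (gene j activates, represses, or does not regulate gene i), genes are either on (+1) or off (-1), and the phenotype is the stable expression pattern reached from a fixed starting state. The threshold update rule, the function names, and the three-gene example circuit are all hypothetical.

```python
def step(w, s):
    """Advance every gene one time step under the assumed threshold rule."""
    nxt = []
    for i, row in enumerate(w):
        total = sum(wij * sj for wij, sj in zip(row, s))
        # Net activation switches the gene on, net repression switches it off,
        # and a tie leaves it unchanged (an assumption of this sketch).
        nxt.append(1 if total > 0 else -1 if total < 0 else s[i])
    return tuple(nxt)


def phenotype(w, s0, max_steps=50):
    """Return the stable expression pattern reached from s0, or None."""
    s = tuple(s0)
    for _ in range(max_steps):
        nxt = step(w, s)
        if nxt == s:
            return s
        s = nxt
    return None


def one_wire_neighbors(w):
    """Yield every circuit that differs from w by adding or removing one wire."""
    for i, row in enumerate(w):
        for j, wij in enumerate(row):
            for v in ((1, -1) if wij == 0 else (0,)):
                nb = [list(r) for r in w]  # copy, then change a single wire
                nb[i][j] = v
                yield nb


def count_neutral_neighbors(w, s0):
    """Return (neighbors with the same phenotype, number of all neighbors)."""
    target = phenotype(w, s0)
    if target is None:
        raise ValueError("starting circuit has no stable expression pattern")
    phenos = [phenotype(nb, s0) for nb in one_wire_neighbors(w)]
    return sum(p == target for p in phenos), len(phenos)


# Hypothetical three-gene circuit: w[i][j] is the effect of gene j on gene i.
circuit = [[1, 0, 0],
           [1, 0, 0],
           [0, -1, 0]]
start_state = (1, -1, 1)
same, total = count_neutral_neighbors(circuit, start_state)
print(f"{same} of {total} one-wire neighbors keep the phenotype")
```

Repeating the count for many different starting circuits, as the text describes, would simply mean calling count_neutral_neighbors in a loop over circuits of different sizes and wirings.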

They all gave the same answer. Circuits typically have dozens to hundreds of neighbors with the same phenotype.[54] In other words, the phenotypes of these circuits remain unchanged even after encountering mutations that alter individual wires. They are not quite as delicate as those acrobat-formed human sculptures, where a body’s shifting by a few millimeters can spell disaster. Regulatory genotype circuits can tolerate such changes, because not all individual wires are critical to their function.

This first step away from a circuit already told us something very important: No one expression code—be it the one segmenting a fruit fly, dissecting a leaf, or shaping a vertebral column—has only one, special, unique circuit producing it. Each expression code can be produced by many circuits that differ in how their genes are wired. Finding out how many was trickier, because the number is so large that we could not even compute it, at least for circuits of forty or more genes. All we knew was that the number had to be enormous, since we had been able to calculate it for smaller circuits: Those with ten genes already had more than 10^40 circuits, and those with twenty genes had more than 10^160 circuits able to produce a given expression code. Producing any one expression code is another problem with more solutions than one can count.[55]

To find out how far apart different solutions to the same gene expression problems are in the library, we took the same random walks we had used when investigating metabolisms and proteins. Starting with a circuit, we computed its expression code, altered a wire—adding or eliminating regulation of one gene—and thus stepped to a random neighbor with the same expression code, and from that to the neighbor’s neighbor, and so on, until we could not go further without changing the expression code.
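A random walk of this kind can be sketched as follows. The sketch rests on the same hypothetical toy model as before (wires of +1, -1, or 0, genes on or off, phenotype as the stable pattern reached from a fixed starting state), none of which is specified in the text. It also adds one simplification of its own: the walker only takes steps that change a wire still identical to the starting circuit, so every step moves it farther away and the walk is guaranteed to end.

```python
import random


def step(w, s):
    """One update under the assumed threshold rule; a tie keeps the gene's state."""
    nxt = []
    for i, row in enumerate(w):
        total = sum(wij * sj for wij, sj in zip(row, s))
        nxt.append(1 if total > 0 else -1 if total < 0 else s[i])
    return tuple(nxt)


def phenotype(w, s0, max_steps=50):
    """Return the stable expression pattern reached from s0, or None."""
    s = tuple(s0)
    for _ in range(max_steps):
        nxt = step(w, s)
        if nxt == s:
            return s
        s = nxt
    return None


def differing_wires(a, b):
    """Count the matrix entries in which two circuits differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def neutral_walk(start, s0, rng):
    """Walk away from start one wire at a time without changing the phenotype."""
    target = phenotype(start, s0)
    if target is None:
        raise ValueError("starting circuit has no stable expression pattern")
    current = [list(row) for row in start]
    n = len(start)
    while True:
        moves = []
        for i in range(n):
            for j in range(n):
                if current[i][j] != start[i][j]:
                    continue  # already rewired; changing it again would not add distance
                old = current[i][j]
                for v in ((1, -1) if old == 0 else (0,)):
                    current[i][j] = v
                    if phenotype(current, s0) == target:
                        moves.append((i, j, v))
                current[i][j] = old  # undo the trial change
        if not moves:
            return current  # no phenotype-preserving step leads farther away
        i, j, v = rng.choice(moves)
        current[i][j] = v


# Hypothetical three-gene starting circuit and starting state.
circuit = [[1, 0, 0],
           [1, 0, 0],
           [0, -1, 0]]
start_state = (1, -1, 1)
end = neutral_walk(circuit, start_state, random.Random(0))
print(differing_wires(circuit, end), "of 9 possible wires differ at the end of the walk")
```

Because each accepted step is checked against the starting phenotype, the circuit at the end of the walk produces the same expression pattern as the circuit at the beginning, however different the two wiring diagrams look.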

Once again, we could walk almost all the way through the library. Circuits that differed in more than 90 percent of their wires could still produce the same expression code. Looking at their wiring diagram, you would never guess that one arose from the other in many tiny steps. Yet each one was a different solution to the same problem: how to produce a specific pattern of gene expression that can shape a cell’s identity.

To make sure that the starting circuit—and its expression code—was not unusual, we started to explore the library from many different shelves, circuits with different numbers of genes, different numbers of wires, different arrangement of wires, and different expression patterns. It did not make a big difference. Some circuits with the same expression code differed in every single wire, whereas others differed in “only” 75 percent of their wiring. But even these would not be recognizably related when examined side by side.

Our explorations also taught us that all circuits with the same expression code are typically connected in the library. We can start from any one of them, change one wire at a time, and transform the circuit step by step into any other circuit with the same meaning, such that each step leaves the meaning unchanged.[56] Once again, we could find a path from nearly every point in the library to nearly every other one, without ever getting stuck in a morass of regulatory nonsense.

All this means that circuits with the same phenotype form a vast network in the library of circuits, a genotype network like those we found in the metabolic library and the protein library. The library is filled with these networks, each of them containing more circuits than you can count, each of them reaching far through the library. All circuits in the same network are solutions to the same problem: how to produce a specific expression code that helps shape a cell, a tissue, or an organ. Small wonder that innovations like dissected leaves could evolve dozens of times independently, if vast numbers of circuits have the expression code that can get them there.

To map the millions of circuits needed to understand the library would have been impossible with any available technology aside from computation—hundreds of researchers had to experiment on millions of fruit flies over several decades to understand the single circuit segmenting a fly. However, some intrepid scientists are beginning to map circuits in simpler organisms, such as bacteria and yeasts. One of them is Mark Isalan, a researcher in Barcelona, who rewired a transcriptional regulation circuit in E. coli by adding new wires—regulation between pairs of genes—and created hundreds of circuits in its neighborhood. And he found, as we had, that regulation circuits are sturdy enough to be rewired.[57] Ninety-five percent of his rewired circuits function normally.

Other researchers compare regulator circuits among different species of the yeasts we use to brew beer, to see how far they have to travel through the circuit library. One such circuit activates genes that allow yeasts to digest the sugar galactose. You might think that there must be one best way to wire this circuit, and that the yeast species that has discovered this way would have passed it on, unchanged, to others. Not so. In two yeast species that split many million years ago, this circuit not only has become completely rewired but even uses different regulators.[58] Neither of these circuits is inferior, otherwise it would not have survived. Nature has solved the same regulation problem in two different but equally adequate ways. Not only that, but a path of small mutational steps connects these solutions, because the species shared a common ancestor.

Genes for the ribosome, the complex multiprotein machine that translates all RNA into proteins, tell the same story. A cell must manufacture its dozens of proteins in precisely balanced amounts, or else it will disappear like those wasteful E. coli cells overproducing beta-gal. Achieving this balance might seem a delicate affair with only one best solution, but again, two different species of yeast have come up with equally successful solutions that regulate these genes in completely different ways.[59]

Examples like these show that organisms can indeed travel far through the circuit library. But when searching for rare nuggets of new and useful expression codes on their journey, they face a problem similar to that of innovating metabolisms and proteins: There are many trillion possible expression codes, but the immediate neighborhood of any one circuit contains at most thousands of other circuits—those differing in one wire—too few to find all possible expression codes nearby. To discover myriad new expression codes, evolving circuits need to venture out of their neighborhood. Such expeditions yield many discoveries only if different neighborhoods contain different expression codes. To find out whether they do, we asked our computers to draw two arbitrary circuits from the same genotype network—call them A and B, they produce the same expression code but have different wiring—identify all circuits near them, and compile a list of expression codes of all these circuits. We found that most expression phenotypes in the neighborhood of A are different from expression phenotypes in the neighborhood of B—regardless of A and B’s phenotype, number of genes, or wiring. Different neighborhoods contain different phenotypes.
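The comparison of the two neighborhoods can be sketched in the same hypothetical toy model (wires of +1, -1, or 0, genes on or off, phenotype as the stable pattern reached from a fixed starting state). The update rule, the function names, and the two hand-picked three-gene circuits standing in for A and B are assumptions made for illustration. The sketch lists the new phenotypes found one wire away from each circuit and reports what fraction of those near A do not occur near B.

```python
def step(w, s):
    """One update under the assumed threshold rule; a tie keeps the gene's state."""
    nxt = []
    for i, row in enumerate(w):
        total = sum(wij * sj for wij, sj in zip(row, s))
        nxt.append(1 if total > 0 else -1 if total < 0 else s[i])
    return tuple(nxt)


def phenotype(w, s0, max_steps=50):
    """Return the stable expression pattern reached from s0, or None."""
    s = tuple(s0)
    for _ in range(max_steps):
        nxt = step(w, s)
        if nxt == s:
            return s
        s = nxt
    return None


def one_wire_neighbors(w):
    """Yield every circuit that differs from w by adding or removing one wire."""
    for i, row in enumerate(w):
        for j, wij in enumerate(row):
            for v in ((1, -1) if wij == 0 else (0,)):
                nb = [list(r) for r in w]
                nb[i][j] = v
                yield nb


def neighborhood_phenotypes(w, s0):
    """Stable phenotypes found one wire away, excluding the circuit's own."""
    found = {phenotype(nb, s0) for nb in one_wire_neighbors(w)}
    found.discard(phenotype(w, s0))  # keep only phenotypes that are new
    found.discard(None)              # ignore neighbors with no stable pattern
    return found


def fraction_unique_to_a(a, b, s0):
    """Fraction of new phenotypes near a that do not occur near b."""
    near_a = neighborhood_phenotypes(a, s0)
    near_b = neighborhood_phenotypes(b, s0)
    if not near_a:
        return 0.0
    return len(near_a - near_b) / len(near_a)


# Two hypothetical circuits with different wiring but the same phenotype.
start_state = (1, -1, 1)
circuit_a = [[1, 0, 0],
             [1, 0, 0],
             [0, -1, 0]]
circuit_b = [[1, 0, 0],
             [0, 0, -1],
             [-1, 0, 0]]
print(fraction_unique_to_a(circuit_a, circuit_b, start_state))
```

Circuits this small have few possible phenotypes, so their neighborhoods are bound to overlap more than those of the large circuits discussed in the text. The sketch shows only the bookkeeping, not the result.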
