We discussed whether we had been too hasty in publishing the proof-of-principle data in Nature. Had we been driven to go ahead by the competition with Eddy? Should we have waited? Some in the group thought so, and others did not. Even in retrospect I felt that the only direct evidence for contamination we possessed, the analysis of the mtDNA, had shown that contamination was low. And that was still the case. The mtDNA analysis had its limitations, but in my opinion, direct evidence should always have precedence over indirect inferences. In the note that Nature never published, we therefore said that “no tests for contamination based on nuclear sequences are known, but in order to have reliable nuclear sequences from ancient DNA, these will have to be developed.” This remained an ongoing theme in our Friday meetings for the next several months.
______________________________
Once we knew that we could make the DNA libraries we needed, and with the hope that 454 would soon have fast enough machines to sequence them all, we started turning our attention to the next challenge: mapping. This was the process of matching the short Neanderthal DNA fragments to the human genome reference sequence. This process might sound easy, but in fact it would prove a monumental task, much like doing a giant jigsaw puzzle with many missing pieces, many damaged pieces, and lots and lots of extra pieces that would fit nowhere in the puzzle.
At heart, mapping required that we balance our responses to two different problems. On the one hand, if we required near-exact matches between the Neanderthal DNA fragments and the human genome, we might miss fragments that carried more than one or two real differences (or errors). This would make the Neanderthal genome look more similar to present-day humans than it really was. On the other hand, if our match criteria were too permissive, we might end up misattributing bacterial DNA fragments that happened to resemble parts of the human genome as Neanderthal DNA. This would make the Neanderthal genome look more different from present-day humans than it really was. Getting this balance right was the single most crucial step in the analysis because it would influence all later work that revolved around scoring differences from present-day genomes.
We also had to keep practical considerations in mind. The computer algorithms used for mapping could not take too many parameters into account, as it would then become impossible to efficiently compare the more than 1 billion DNA fragments, each composed of 30 to 70 nucleotides, that we planned to sequence from the Neanderthal bones to the 3 billion nucleotides in the human genome.
The people who took on the monumental task of designing an algorithm to map the DNA fragments were Ed Green, Janet Kelso, and Udo Stenzel. Janet had joined us in 2004 from the University of the Western Cape, in her native South Africa, to head a bioinformatics group. An unassuming but effective leader, she was able to form a cohesive team out of the quirky personalities that made up the bioinformatics group. One of these personalities was Udo, who had a misanthropic streak; convinced that most people, especially those higher up in academic hierarchies, were pompous fools, he had dropped out of university before finishing his degree in informatics. Nevertheless, he was probably more capable as a programmer and logical thinker than most of his teachers. I was happy that he found the Neanderthal project worthy of his attention, even if his conviction that he always knew everything best could drive me mad at times. In fact, Udo would probably not have gotten along with me at all if it were not for Janet’s mediating influence.
Ed, with his original project on RNA splicing having died a quiet, unmourned death, had become the de facto coordinator of the efforts to map the Neanderthal DNA fragments. He and Udo developed a mapping algorithm that took the patterns of errors in the Neanderthal DNA sequences into account. These patterns had in the meantime been worked out by Adrian together with Philip Johnson, a brilliant student in Monty Slatkin’s group at Berkeley. They had found that errors were located primarily toward the ends of the DNA strands. This was because when a DNA molecule is broken, the two strands are often different lengths, leaving the longer strand dangling loose and vulnerable to chemical attack. Adrian’s detailed analysis had also revealed that, contrary to our conclusions just a year earlier, the errors were all due to deamination of cytosine residues, not adenine residues. In fact, when a C occurred at the very end of a DNA strand, it had a 20 to 30 percent risk of appearing as a T in our sequences.
Ed’s mapping algorithm cleverly implemented Adrian’s and Philip’s model of how errors occurred as position-dependent error probabilities. For example, if a Neanderthal molecule had a T at the end position and the human genome a C, this was counted as almost a perfect match, as deamination-induced C-to-T errors at the end positions of Neanderthal fragments were so common. In contrast, a C in the Neanderthal molecule and a T in the human genome at the end position was counted as a full mismatch. We were confident that Ed’s algorithm would be a great advance in reducing false mapping of fragments and increasing correct ones.
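For readers who like to see the idea in code, here is a minimal Python sketch of such position-dependent, asymmetric scoring. It is not Ed and Udo’s actual implementation, and the deamination rates are invented for illustration; the real model estimated them from the data.

```python
# A toy version (not the actual mapping code) of scoring an alignment with
# position-dependent, asymmetric mismatch penalties.  The rates are invented
# for illustration; the real model estimated them from the data.
import math

def deamination_rate(pos, length, end_rate=0.3, interior_rate=0.01):
    """Chance that an original C reads as T at this position: highest at the
    very ends of the fragment, low in the interior (illustrative values)."""
    dist_from_end = min(pos, length - 1 - pos)
    return end_rate if dist_from_end == 0 else interior_rate

def alignment_penalty(fragment, reference_window):
    """Sum of per-position penalties: zero for a match, a mild penalty for a
    T in the fragment over a C in the reference (a plausible deamination
    artifact), and a heavy penalty for any other mismatch."""
    penalty = 0.0
    n = len(fragment)
    for i, (f, r) in enumerate(zip(fragment, reference_window)):
        if f == r:
            continue                          # perfect match
        elif r == "C" and f == "T":           # likely C->T deamination
            penalty += -math.log(deamination_rate(i, n))
        else:                                 # a genuine mismatch
            penalty += -math.log(0.001)
    return penalty

# A terminal T over a reference C costs little; the reverse costs a lot:
print(alignment_penalty("TACGT", "CACGT"))    # ~1.2  (treated as a near-match)
print(alignment_penalty("CACGT", "TACGT"))    # ~6.9  (treated as a full mismatch)
```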
Another problem was choosing a comparison genome to use for mapping the Neanderthal fragments. One of the goals of our research was to examine whether the Neanderthal genome sequence revealed a closer relationship with humans in Europe than in other parts of the world. For example, if we mapped the Neanderthal DNA fragments to a European genome (about half of the standard reference genome was from a person of European descent), then fragments that matched the European genome might be retained more often than fragments that were more like African genomes. This would make the Neanderthal incorrectly look more similar to Europeans than to Africans. We needed a neutral comparison, and we found one in the chimpanzee genome. The common ancestor that Neanderthals and modern humans shared with chimpanzees existed perhaps 4 million to 7 million years ago, meaning that the chimpanzee genome should be equally unlike both the Neanderthal and the modern human genome. We also mapped the Neanderthal DNA fragments to an imaginary genome that others had constructed by estimating what the genome of the common ancestor of humans and chimpanzees would have looked like. After being mapped to these more distant genomes, the Neanderthal fragments could then be compared to the corresponding DNA sequences in present-day genomes from different parts of the world and differences could be scored in a way that did not bias the results from the outset.
All of this required considerable computational power, and we were fortunate to have the unwavering support of the Max Planck Society as we attempted it. The society dedicated a cluster of 256 powerful computers at its computer facility in southern Germany exclusively to our project. Even with these computers at our disposal, mapping a single run from the sequencing machines took days. To map all our data would take months. The crucial task that Udo took on was to distribute the work among these computers more efficiently. Since Udo was deeply convinced that no one could do it as well as he could, he wanted to do all of the work himself. I had to cultivate patience while awaiting his progress.
When Ed looked at the mapping of the first batches of new DNA sequences that came back to us from Branford, he discovered a worrying pattern that set off alarm bells in the group and made my heart sink: the shorter fragments showed more differences from the human genome than the longer fragments! It was reminiscent of one of the patterns that Graham Coop, Eddy Rubin, and Jeff Wall had seen in our Nature data. They had interpreted that pattern as contamination, assuming that the longer fragments showed fewer differences from present-day humans because many of them were in fact recent human DNA contaminating our libraries. We had hoped that preparing the libraries in our clean room and using our special TGAC tags would spare us the plague of contamination. Ed began frantic work to see if we after all had modern human contamination in our sequencing libraries.
Happily, he found that we didn’t. Ed quickly saw that if he made the cut-off criteria for a match more stringent, then the short and long fragments became just as different from the reference genome. Ed could show that whenever we (and Wall and the others) had used the cut-off values routinely used by genome scientists, short bacterial DNA fragments were mistakenly matched to the human reference genome. This made the short fragments look more different from the reference genome than the long fragments did; when he increased the quality cut-off, the problem went away, and I felt secretly justified in my distrust of contamination estimates based on comparisons between short and long fragments.
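A back-of-the-envelope calculation makes the effect concrete. The numbers below are illustrative assumptions rather than values from the actual analysis; they simply show that, under a lax identity cut-off, a short random fragment has a far better chance of hitting somewhere in a three-billion-nucleotide genome than a long one, and that a stricter cut-off removes the length-dependent artifact.

```python
# Back-of-the-envelope numbers (illustrative assumptions, not values from the
# Neanderthal analysis): how often does a random, unrelated fragment hit the
# human reference by chance under a lax versus a strict identity cut-off?
from math import comb

GENOME_WINDOWS = 6e9   # roughly 3 billion positions, counting both strands

def p_chance_match(length, min_identity):
    """Probability that a random fragment matches one fixed reference window
    at or above the identity threshold, with each base matching by chance
    1/4 of the time."""
    max_mismatches = int(length * (1 - min_identity))
    return sum(
        comb(length, k) * (0.75 ** k) * (0.25 ** (length - k))
        for k in range(max_mismatches + 1)
    )

for identity in (0.90, 0.97):
    for length in (30, 60):
        expected_hits = GENOME_WINDOWS * p_chance_match(length, identity)
        print(f"identity >= {identity:.0%}, length {length} nt: "
              f"~{expected_hits:.1e} chance hits per random fragment")

# Under the lax 90% cut-off, a 30-nt bacterial fragment occasionally lands
# somewhere in the genome while a 60-nt one essentially never does, so short
# fragments look spuriously divergent; the stricter cut-off removes the bias.
```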
But soon after this, our alarm bells went off again. This time the issue was even more convoluted and took me quite a while to understand—so please bear with me. One consequence of normal human genetic variation is that a comparison of any two versions of the same human chromosome reveals roughly one sequence difference in every thousand nucleotides, those differences being the result of mutations in previous generations. So whenever two different nucleotides (or alleles as geneticists will say) occur at a certain position in a comparison of two chromosomes, we can ask which of the two is the older one (or the “ancestral allele”) and which is the more recent one (or the “derived allele”). Fortunately, it is possible to figure this out easily by checking which nucleotide appears in the genomes of chimpanzees and other apes. That allele is the one that is likely to have been present in the common ancestor we shared with the apes, and it is therefore the ancestral one.
We were interested in seeing how often the Neanderthals carried recent, derived alleles that are also seen among present-day humans, as this would allow us to estimate when Neanderthal ancestors split from modern human ancestors. Essentially, more derived alleles shared by modern humans and Neanderthals means that the two lines diverged more recently. During the summer of 2007, Ed looked at our new data from 454 Life Sciences and he was alarmed. Just as observed by Wall and others in the smaller test data set published in 2006, Ed saw that the longer Neanderthal DNA fragments—those of more than 50 or so nucleotides—carried more derived alleles than shorter ones. This suggested that the longer fragments were more closely related to present-day human DNA than the shorter ones, a paradoxical finding that once again could have been the result of contamination.
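In code, the polarization and counting logic is simple. The sketch below uses a handful of made-up sites rather than real data, only to show how the chimpanzee base designates the ancestral allele and how shared derived alleles are tallied.

```python
# A minimal sketch (invented example data, not the project's pipeline) of
# polarizing alleles with the chimpanzee genome as outgroup and counting how
# often the Neanderthal carries the derived allele seen in present-day humans.

# Each record: (chimp_base, human_variant_bases, neanderthal_base) at one
# position where present-day humans vary.
sites = [
    ("A", {"A", "G"}, "G"),   # chimp has A -> A is ancestral, G is derived
    ("C", {"C", "T"}, "C"),
    ("G", {"G", "T"}, "T"),
    ("T", {"C", "T"}, "T"),
]

derived_shared = 0
informative = 0
for chimp, human_alleles, neanderthal in sites:
    if chimp not in human_alleles:
        continue                      # cannot polarize this site, skip it
    ancestral = chimp
    derived = (human_alleles - {ancestral}).pop()
    if neanderthal in (ancestral, derived):
        informative += 1
        if neanderthal == derived:
            derived_shared += 1

print(f"Neanderthal carries the derived allele at {derived_shared} of "
      f"{informative} informative sites")
# The higher this fraction, the more recently the Neanderthal and modern
# human lineages diverged.
```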
Like many of the crises before it, this one dominated our Friday meetings. For weeks we discussed it endlessly, suggesting one possible explanation after another, none of which led us anywhere. In the end, I lost my patience and suggested that maybe we did have contamination, that maybe we should just give up and admit that we could not produce a reliable Neanderthal genome. I was at my wit’s end, feeling like crying like a child. I did not, but I think many in the group realized it was a real crisis nonetheless. Perhaps this gave them new energy. I noticed that Ed looked as though he had not slept at all for a few weeks. Finally he was able to puzzle it out.
Recall that a derived allele starts out as a mutation in a single individual—a fact that, by definition, makes derived alleles rare. Examined in aggregate, one person’s genome will show derived alleles at about 35 percent of the positions that vary whereas about 65 percent will carry ancestral alleles. Ed’s breakthrough came when he realized that this meant that when a Neanderthal DNA fragment carried a derived allele, it would differ from the human genome reference sequence 65 percent of the time and match it only 35 percent of the time. This, in turn, meant that a Neanderthal DNA fragment was more likely to match the correct position if it carried the ancestral allele! He also realized that short fragments with a difference to the human genome would more often go unrecognized by the mapping programs than longer fragments, because the longer fragments naturally had many more matching positions that allowed them to be correctly mapped even if they carried a difference or two. As a result, shorter fragments with derived alleles would more often be thrown out by the mapping program than longer ones, and short fragments would therefore incorrectly seem to carry fewer derived alleles than longer fragments. Ed had to explain this to me many times before I understood it. Even so, I did not trust my intuition and hoped that he could prove to us in some direct way that his idea was correct.
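A toy simulation can make Ed’s argument tangible. The acceptance rule and error rates below are illustrative stand-ins, not the actual mapping criteria, and the model assumes, artificially, that every fragment overlaps exactly one variable position; the point is only that short fragments lose a disproportionate share of the reads carrying derived alleles.

```python
# A toy simulation (stand-in parameters, not the real mapping criteria) of
# the bias Ed identified: a fragment carrying a derived allele usually
# mismatches the reference at that position, and short fragments tolerate
# that extra mismatch less well than long ones.
import random
random.seed(1)

ERROR_RATE = 0.01             # per-base sequencing/damage errors (assumed)
ALLOWED_MISMATCH_RATE = 0.03  # toy stand-in for the mapper's acceptance rule

def maps(length, carries_derived):
    """Does the fragment pass the toy acceptance rule?  The reference carries
    the ancestral allele ~65% of the time, so a fragment with the derived
    allele mismatches it at that site 65% of the time (35% if ancestral)."""
    mismatches = sum(random.random() < ERROR_RATE for _ in range(length))
    p_mismatch_at_site = 0.65 if carries_derived else 0.35
    if random.random() < p_mismatch_at_site:
        mismatches += 1
    return mismatches <= int(length * ALLOWED_MISMATCH_RATE)

def apparent_derived_fraction(length, n=100_000, true_fraction=0.35):
    """Fraction of *mapped* fragments that carry the derived allele, assuming
    (artificially) that every fragment overlaps exactly one variable site."""
    mapped_derived = mapped_total = 0
    for _ in range(n):
        carries_derived = random.random() < true_fraction
        if maps(length, carries_derived):
            mapped_total += 1
            mapped_derived += carries_derived
    return mapped_derived / mapped_total

for length in (30, 60):
    print(f"{length}-nt fragments: apparent derived-allele fraction "
          f"{apparent_derived_fraction(length):.2f}  (true fraction 0.35)")
# Short fragments lose more of the reads that carry derived alleles, so they
# appear to share fewer derived alleles with present-day humans, which is
# exactly the pattern that looked like contamination of the longer fragments.
```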
I guess Ed did not want to see me cry in the meeting, so in the end he came up with a clever experiment that proved the point. He simply took the longer DNA fragments he had mapped and cut them in half in the computer, so that they were now half as long. He then mapped them again. Like magic, the frequency with which they carried derived alleles decreased when compared to the longer ones from which they were generated (see Figure 14.1). This was because many of the fragments that carried derived alleles could not be mapped when they were shorter. Finally, we had an explanation for the pattern of apparent contamination in our data! At least some of the patterns of contamination seen in the original test data published in Nature could also now be explained. I quietly let out a sigh of relief when Ed presented his experiment. We published these insights in a highly technical paper in 2009.{55}
Ed’s findings reinforced my conviction that direct assays for contamination were necessary, and our Friday discussions again and again came back to how we could measure nuclear DNA contamination. But now I was somewhat more relaxed when these discussions came up. I felt convinced that we were on the right track.
________________________________
By early 2008, the people at 454 Life Sciences in Connecticut had performed 147 runs from the nine libraries we had prepared from the Vi-33.16 bone, yielding 39 million sequence fragments. This was a lot but still not as much as I had hoped to have by this time, and certainly far too little to make it worthwhile to begin to reconstruct the nuclear genome. Nevertheless, I was keen to test the mapping algorithms, so we undertook the much less formidable task of reconstructing the mitochondrial genome. All we or anyone else had done by that point was sequence some 800 nucleotides of the most variable parts of Neanderthal mtDNA. Now we wanted to do all 16,500 nucleotides.