Ancient DNA: Methods and Protocols


Next, we converted one multiplex PCR for each primer set for each sample (two libraries per sample) into a barcoded sequencing library using the barcoding “protocol 2” for preamplified DNA as described in Chapter 19 (11; Fig. 1). This approach directly couples the barcoding protocol and the library preparation process by including the barcode sequence in the adapter sequence. We then quantified all libraries using quantitative PCR (qPCR). According to the qPCR results, we pooled the libraries in equimolar ratios and sequenced them simultaneously on a small (1/16th) lane of a 454 FLX sequencing plate. After sequencing, we sorted all of the obtained reads according to their barcode sequence, in this case the first seven bases of the sequencing reads. In the ideal scenario, all barcodes would be represented evenly in terms of number of reads. Errors introduced during upstream steps, such as incorrect quantification in qPCR, errors in the dilution steps of the pooling procedure, or simple pipetting errors, can, however, result in an under- or overrepresentation of barcodes in the final sequencing output.

Fig. 1. Schematic overview of the combined protocol coupling first-step multiplex PCR presented in Chapter 17 directly to barcoding protocol 2 of Chapter 19.
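The read-sorting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: it assumes exact matching of the first seven bases, and the barcode sequences, sample names, and reads are hypothetical. A production workflow would typically also tolerate sequencing errors within the barcode.

```python
BARCODE_LENGTH = 7  # the text sorts reads by their first seven bases


def demultiplex(reads, barcode_to_sample, barcode_length=BARCODE_LENGTH):
    """Group reads by their leading barcode.

    Returns (reads per sample with the barcode trimmed off, unassigned reads).
    Only exact barcode matches are assigned (an assumption of this sketch).
    """
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    for read in reads:
        barcode = read[:barcode_length].upper()
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)
        else:
            by_sample[sample].append(read[barcode_length:])
    return by_sample, unassigned


def representation(by_sample):
    """Fraction of assigned reads per sample; even pooling gives equal fractions."""
    counts = {sample: len(reads) for sample, reads in by_sample.items()}
    total = sum(counts.values())
    return {s: (n / total if total else 0.0) for s, n in counts.items()}


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTACG": "bear_01", "TGCATGC": "bear_02"}
reads = [
    "ACGTACGTTTTGGGG",
    "ACGTACGAAAACCCC",
    "TGCATGCGGGGTTTT",
    "NNNNNNNAAAATTTT",  # no matching barcode, stays unassigned
]
by_sample, unassigned = demultiplex(reads, barcodes)
fractions = representation(by_sample)
```

Comparing the values in `fractions` against the expected share of 1/(number of libraries) gives a quick check for under- or overrepresented barcodes before committing to a full run.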

After verifying a fairly balanced representation of barcodes in the library pool and a sufficient enrichment of the target fragments, we sequenced four multiplex PCRs (the “odd” and “even” sets in replicates, respectively) for each of the selected cave bear samples on a full 454 FLX run. We then performed a second round of multiplex PCR in order to fill remaining gaps in the cave bear mitochondrial genome sequences. In this multiplex PCR, however, the primer sets contained only those primer pairs that flanked missing sequence data. To compensate for the reduced number of targeted fragments and to ensure amplification of the target fragments above the environmental DNA background, we increased the number of cycles in the PCR from 20 to 25.
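To give a rough sense of what five additional cycles can contribute, the sketch below uses the textbook exponential model of PCR, in which each cycle multiplies the number of target copies by (1 + efficiency). This is an idealization that ignores the plateau phase and any inhibition; the function name and the efficiency values are assumptions made for illustration, not measurements from this study.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Idealized copies produced per starting molecule: (1 + efficiency) ** cycles.

    efficiency is the per-cycle amplification efficiency, from 0 (no
    amplification) to 1 (perfect doubling). Plateau effects are ignored.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


# Extra yield from raising the cycle number from 20 to 25.
gain_perfect = fold_amplification(25) / fold_amplification(20)
gain_reduced = fold_amplification(25, 0.8) / fold_amplification(20, 0.8)
```

Under perfect doubling the five extra cycles correspond to a 2^5 = 32-fold increase; with an assumed efficiency of 0.8 the gain drops to roughly 19-fold.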

3. Results and Discussion

Fifty-six of the one hundred and ten cave bear specimens tested showed sufficiently well-preserved DNA to be used in multiplex PCR. After 20 cycles, the reactions were converted into barcoded sequencing libraries, quantified and pooled in equimolar ratios, and sequenced on a small (1/16th) 454 FLX lane. Analysis of these initial sequencing results revealed differences between the samples, either in DNA preservation or in the amount of contamination with exogenous DNA (e.g., fungal and bacterial DNA). The proportion of sequence reads that matched the target fragments varied widely among the 56 specimens used, from 1% to 100%.
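The equimolar pooling mentioned above amounts to taking a volume of each library that is inversely proportional to its qPCR-measured concentration. The following sketch shows that arithmetic under stated assumptions: the function name, the choice to let the most dilute library contribute a fixed maximum volume, and the example concentrations are all hypothetical.

```python
def equimolar_volumes(concentrations, max_volume_ul=10.0):
    """Volume (in µl) to take from each library so all contribute equal molecules.

    concentrations maps library name -> molecules per µl (any consistent unit).
    The most dilute library contributes max_volume_ul; the others are scaled
    down in proportion to their higher concentration.
    """
    if not concentrations:
        raise ValueError("no libraries given")
    if any(c <= 0 for c in concentrations.values()):
        raise ValueError("all concentrations must be positive")
    molecules_per_library = min(concentrations.values()) * max_volume_ul
    return {lib: molecules_per_library / c for lib, c in concentrations.items()}


# Hypothetical qPCR results in molecules per µl.
qpcr = {"lib_A": 2.0e8, "lib_B": 1.0e8, "lib_C": 4.0e8}
volumes = equimolar_volumes(qpcr)
```

In this example `lib_B`, the most dilute library, contributes the full 10 µl, while `lib_A` and `lib_C` contribute 5 µl and 2.5 µl. Very small volumes are hard to pipette accurately, which is one reason to dilute highly concentrated libraries before pooling.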

As only 1% of reads matching target fragments is insufficient to compile a consensus sequence, we continued to process only those samples that were the best preserved. We applied an arbitrary cutoff in which we required at least 40% of the sequencing reads to have matched the targeted fragments in order to keep a sample in the experiment. Instead of applying this cutoff, one could have chosen to re-amplify the more poorly performing samples (those showing low amounts of endogenous DNA and/or high levels of contamination with exogenous DNA), this time increasing the number of PCR cycles to 25 or up to 30 cycles. Note that increasing the number of cycles will also increase the uneven representation among the target fragments in the reaction, due to differences in amplification efficiency among primer pairs. Too few cycles, however, may be insufficient to enrich for the target fragments over the environmental background DNA. It is therefore highly recommended to determine the ratio of reads matching target fragments to reads matching environmental background DNA prior to final deep sequencing.

20 Case Study: Targeted high-Throughput Sequencing…
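The screening criterion described above is straightforward to apply once each sample has a count of reads matching the target fragments and a total read count. The sketch below assumes those counts are already available (how reads are matched to targets is outside its scope); the function names and the example counts are hypothetical, and only the 40% threshold comes from the text.

```python
MIN_ON_TARGET = 0.40  # cutoff from the text: at least 40% of reads on target


def on_target_fraction(on_target_reads, total_reads):
    """Proportion of a sample's reads that matched the target fragments."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def select_samples(read_counts, min_fraction=MIN_ON_TARGET):
    """Return the samples whose on-target fraction meets the cutoff.

    read_counts maps sample name -> (on_target_reads, total_reads).
    """
    return [
        sample
        for sample, (on_target, total) in read_counts.items()
        if on_target_fraction(on_target, total) >= min_fraction
    ]


# Hypothetical test-run counts for three samples.
counts = {
    "bear_01": (950, 1000),
    "bear_02": (10, 1000),
    "bear_03": (400, 1000),
}
kept = select_samples(counts)
```

Samples that fall below the cutoff, such as `bear_02` here, are the candidates for re-amplification with more cycles rather than for deep sequencing.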

In this case, we continued to process the 31 of the 56 samples that met our preservation criterion. Based on the obtained output, 112 of the 128 target fragments were covered by sequencing reads on average among the 31 samples. Thus, based on only one full run of the 454 FLX instrument, on average 87% of the mitochondrial genome was obtained from 31 individuals, representing more than 7 kilobases (kb) of replicated, overlapping sequence from all of the 31 individuals. With only one more round of gap filling, on average 96% of the mitochondrial genomes were covered, translating into ~10 kb of overlapping sequence from all individuals.

Phylogenetic analyses of the consensus sequences revealed a stable topology with very high statistical support, indicating strong evidence for the reciprocal monophyly of the three cave bear lineages (4).

DMPS has also been used successfully in experiments to amplify whole mitochondrial genomes from a modern polar bear and a fossil mammoth, as well as to amplify multiple nuclear loci from a modern African elephant (4). In addition to using different primer sets designed for the respective species and target loci, the only other modification to the protocol described above was, when modern samples were used, to lower the number of PCR cycles from 20 to 15.

These results show that no extensive optimization of primer sets is necessary to successfully apply DMPS to ancient or modern DNA sequencing experiments. Further, DMPS, like traditional PCR, offers full single-molecule sensitivity (10), as no pretreatment of the aDNA extract (e.g., library preparation) is necessary prior to amplification. The protocol is therefore an easy-to-implement, robust, and cost-efficient way to quickly retrieve many kb of homologous sequence data from large numbers of highly degraded samples, such as fossil remains and poorly preserved samples from museum, forensic, and medical collections.

Acknowledgments

I thank M Meyer and M Hofreiter for help throughout the research project; B Hoeffner and A Aximu for running the 454 sequencer; G Baryshnikov, H Bocherens, A Grandal d’Anglade, B Hilpert, T Kutznetsova, S Münzel, R Pinhasi, G Rabeder, W Rosendahl, and E Trinkaus for providing samples; K Finstermeier for help with the figure; and the Max Planck Society and National Science Foundation (award ANS-0909456) for financial support.


M. Stiller

References

1. Bon C, Caudy N, de Dieuleveult M, Fosse P, Philippe M, Maksud F, Beraud-Colomb E, Bouzaid E, Kefi R, Laugier C, Rousseau B, Casane D, van der Plicht J, Elalouf JM (2008) Deciphering the complete mitochondrial genome and phylogeny of the extinct cave bear in the Paleolithic painted cave of Chauvet. Proc Natl Acad Sci U S A 105:17447–17452
2. Binladen J, Gilbert MT, Bollback JP, Panitz F, Bendixen C, Nielsen R, Willerslev E (2007) The use of coded PCR primers enables high-throughput sequencing of multiple homolog amplification products by 454 parallel sequencing. PLoS One 2:e197
3. Meyer M, Stenzel U, Hofreiter M (2008) Parallel tagged sequencing on the 454 platform. Nat Protoc 3:267–278
4. Stiller M, Knapp M, Stenzel U, Hofreiter M, Meyer M (2009) Direct multiplex sequencing (DMPS): a novel method for targeted high-throughput sequencing of ancient and highly degraded DNA. Genome Res 19:1843–1848
5. Pacher M, Stuart AJ (2009) Extinction chronology and palaeobiology of the cave bear (Ursus spelaeus). Boreas 38:189–206
6. Knapp M, Rohland N, Weinstock J, Baryshnikov G, Sher A, Nagel D, Rabeder G, Pinhasi R, Schmidt HA, Hofreiter M (2009) First DNA sequences from Asian cave bear fossils reveal deep divergences and complex phylogeographic patterns. Mol Ecol 18:1225–1238
7. Hofreiter M, Rabeder G, Jaenicke-Despres V, Withalm G, Nagel D, Paunovic M, Jambresic G, Pääbo S (2004) Evidence for reproductive isolation between cave bear populations. Curr Biol 14:40–43
8. Rabeder G, Hofreiter M, Withalm G (2004) The systematic position of the Cave Bear from Potocka zijalka (Slovenia). Mitt Komm Quartärforsch Österr Akad Wiss 13:197–200
9. Rohland N, Hofreiter M (2007) Ancient DNA extraction from bones and teeth. Nat Protoc 2:1756–1762
10. Dear PH, Cook PR (1993) Happy mapping: linkage mapping using a physical analogue of meiosis. Nucleic Acids Res 21:13–20
11. Knapp M, Stiller M, Meyer M (2011) Generating barcoded libraries for multiplex high-throughput sequencing. In: Shapiro B, Hofreiter M (eds) Ancient DNA. Springer, New York
12. Fulton TL, Stiller M (2011) PCR amplification, cloning and sequencing of ancient DNA. In: Shapiro B, Hofreiter M (eds) Ancient DNA. Springer, New York
13. Stiller M, Fulton TL (2011) Multiplex PCR amplification of ancient DNA. In: Shapiro B, Hofreiter M (eds) Ancient DNA. Springer, New York

Chapter 21

Target Enrichment via DNA Hybridization Capture

Susanne Horn

Abstract

Recent advances in high-throughput DNA sequencing technologies have allowed entire nuclear genomes to be shotgun sequenced from ancient DNA (aDNA) extracts. Nonetheless, targeted analyses of specific genomic loci will remain an important tool for future aDNA studies. DNA capture via hybridization allows the efficient exploitation of current high-throughput sequencing for population genetic analyses using aDNA samples. Specifically, hybridization capture allows larger data sets to be generated for multiple target loci as well as for multiple samples in parallel. “Bait” molecules are used to select target regions from DNA libraries for sequencing. Here we present a brief overview of the currently available hybridization capture protocols using either an in-solution or a solid-phase (immobilized) approach. While it is possible to purchase ready-made kits for this purpose, I present a protocol that allows users to generate their own custom bait to be used for hybridization capture.

Key words: Ancient DNA, Target enrichment, Hybridization, DNA capture, Bait, High-throughput sequencing

1. Introduction

 

Shotgun sequencing using next-generation sequencing techniques has been used to sequence entire genomes of ancient specimens (1–3). However, this approach remains prohibitively expensive for many users, and generally provides data from only a single specimen. Analyses of ancient populations generally do not focus on complete genome sequences, but instead on selected genomic loci that can be targeted from many individuals.

In many ancient DNA (aDNA) extracts, DNA fragments representing the target loci are present at very low copy-number compared to sequences of contaminating exogenous DNA. Such experiments therefore require an enrichment step, where the amount of target DNA is increased in a library to be sequenced, relative to nontarget DNA. Enrichment is most often achieved via polymerase chain reaction (PCR). This approach, however, is currently being superseded by enrichment strategies that capture DNA by hybridization (4–7). In hybridization capture approaches, a genomic library is first prepared from an aDNA extract and DNA bait molecules representing the target sequence are added to the library. The target DNA molecules in the library will hybridize with the added bait molecules and can then be pulled down out of the library for sequencing.

DNA hybridization capture has several advantages compared to traditional PCR. First, while mismatches can prevent the binding of primers in PCR, mismatches are less detrimental for hybridization, making hybridization a useful method to enrich for DNA where the sequence of the ancient specimen is not exactly known. This can also be important when molecules with damage-induced base modifications may inhibit primer binding (8, 9). Second, hybridization is less sensitive to contamination than traditional PCR. While PCR selects for full-length amplicons and therefore tends to amplify longer molecules preferentially (which may be modern DNA contaminants), hybridization targets all lengths of starting molecules more equally. Third, nuclear mitochondrial insertions (numts) may be amplified preferentially by PCR if the primer binding conditions allow. Hybridization, however, should preferentially enrich for the most common fragment, which will be the much higher copy-number mitochondrial sequence.

One potential drawback of hybridization capture is the loss of target molecules during library preparation. This is not a problem for PCR, which is theoretically able to begin the amplification process from a single starting molecule. Therefore, it is highly recommended that not all of the aDNA extract is used in a single enrichment experiment, but that some is saved for replication if necessary.

Beth Shapiro and Michael Hofreiter (eds.), Ancient DNA: Methods and Protocols, Methods in Molecular Biology, vol. 840, DOI 10.1007/978-1-61779-516-9_21, © Springer Science+Business Media, LLC 2012

The choice of sequencing platform will determine what type of library will need to be prepared prior to enrichment (see Table 1). This choice may depend on the size of the sequence fragment to be targeted and the number of samples to be processed. Hybridization capture can be used to enrich for fragments ranging in length from a few hundred bases to many megabases (Mb) in size. When the sequencing is complete, only a fraction of the sequencing reads will map to the desired target region, and this also needs to be considered when planning the amount of sequence data that will be required. In previous work, enrichment rates for aDNA varied considerably across experiments: between 18 and 40% of reads could be mapped to a target region of a Neandertal mitochondrial genome (10); 37% of reads mapped to targeted nuclear regions of Neandertals (7); and around 20% of reads mapped to a targeted 500-base-pair (bp) region of the mitochondrial control region of beavers (
