Comparative Study
Am J Hum Genet. 2011 Apr 8;88(4):469-81.
doi: 10.1016/j.ajhg.2011.03.013.

Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research

Michael E Talkowski et al. Am J Hum Genet. 2011.

Abstract

The contribution of balanced chromosomal rearrangements to complex disorders remains unclear because they are not detected routinely by genome-wide microarrays and clinical localization is imprecise. Failure to consider these events bypasses a potentially powerful complement to single nucleotide polymorphism and copy-number association approaches to complex disorders, where much of the heritability remains unexplained. To capitalize on this genetic resource, we have applied optimized sequencing and analysis strategies to test whether these potentially high-impact variants can be mapped at reasonable cost and throughput. By using a whole-genome multiplexing strategy, rearrangement breakpoints could be delineated at a fraction of the cost of standard sequencing. For rearrangements already mapped regionally by karyotyping and fluorescence in situ hybridization, a targeted approach enabled capture and sequencing of multiple breakpoints simultaneously. Importantly, this strategy permitted capture and unique alignment of up to 97% of repeat-masked sequences in the targeted regions. Genome-wide analyses estimate that only 3.7% of bases should be routinely omitted from genomic DNA capture experiments. Illustrating the power of these approaches, the rearrangement breakpoints were rapidly defined to base pair resolution and revealed unexpected sequence complexity, such as co-occurrence of inversion and translocation as an underlying feature of karyotypically balanced alterations. These findings have implications ranging from genome annotation to de novo assemblies and could enable sequencing screens for structural variations at a cost comparable to that of microarrays in standard clinical practice.

Figures

Figure 1. Flow Diagram of Sequencing Approaches. The flow diagram provides an overview of each of the four sequencing approaches taken, the library preparation method, targeted fragment size, and each subject sequenced under a given approach. In sum, we applied three different whole-genome approaches and a CapBP approach to identify balanced rearrangement breakpoints from paired-end sequencing.
Figure 2. Translocation Sequencing Results for Subject 1. Sequencing eight lanes generated 207.2 million read pairs, yielding 10.2× physical coverage of all nucleotides after alignment of 91% of reads. Translocation breakpoints for each derivative chromosome were resolved to base pair resolution with 20 supporting read pairs, including 12 gap reads straddling the translocation breakpoint and 8 split reads crossing the translocation breakpoint sequence.
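As a rough check on the numbers in the Figure 2 caption, physical coverage can be approximated as the number of aligned read pairs times the mean fragment span, divided by the genome size. The caption states neither the fragment length nor the genome size used, so the genome size below is an assumption and the function names are illustrative only. This is a minimal sketch, not the authors' calculation.

```python
# Hypothetical helpers; names and the genome size are assumptions for illustration.
ASSUMED_GENOME_BP = 3.1e9  # approximate human genome size, not stated in the caption


def physical_coverage(read_pairs, aligned_fraction, mean_fragment_bp, genome_bp):
    """Average number of aligned fragments (reads plus the insert between them)
    spanning each base of the genome."""
    aligned_pairs = read_pairs * aligned_fraction
    return aligned_pairs * mean_fragment_bp / genome_bp


def implied_fragment_bp(coverage, read_pairs, aligned_fraction, genome_bp):
    """Mean fragment span that would produce the given physical coverage."""
    return coverage * genome_bp / (read_pairs * aligned_fraction)


if __name__ == "__main__":
    # Values taken from the caption: 207.2 million pairs, 91% aligned, 10.2x coverage.
    span = implied_fragment_bp(10.2, 207.2e6, 0.91, ASSUMED_GENOME_BP)
    print(f"Implied mean fragment span: {span:.1f} bp")
```

The implied fragment span depends directly on the assumed genome size, so it should be read as an order-of-magnitude consistency check rather than a reported value.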
Figure 3. Insert Size Distributions for Large Insert Library Methods. The left panel (red) shows the distribution of insert sizes for subject 3, created by using the published Illumina Mate-Pair kit with initial fragments size-selected at approximately 3 kb. The figure shows a bimodal distribution typical of this technique, representing fragments that cross the circularization junction (outward facing reads) and fragments of contiguous DNA that were biotinylated and retained but do not cross the circularization junction (inward facing reads). In this subject, 77.8% of all reads were outward facing. For subject 4 (not shown), only 45.4% of all aligned pairs were separated by large inserts. The proportion of outward facing reads can vary substantially based on a number of factors, including DNA quality. The right panel (blue) shows the insert size distribution for subject 5a, created by our custom method based on the mate-pair method for SOLiD sequencing (Applied Biosystems) with modifications including insertion of a 6 base subject-specific barcode. The method resulted in 99.3% of all aligned read pairs being separated by large inserts for this subject.
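The distinction between outward and inward facing pairs in the Figure 3 caption can be illustrated with a small classifier. The sketch below assumes each aligned pair is reduced to two (position, strand) records on the same chromosome. That simplified representation and the function names are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter


def classify_pair(pos1, strand1, pos2, strand2):
    """Classify a same-chromosome read pair by relative orientation.

    'inward'  : leftmost read on '+', rightmost on '-' (reads point at each other)
    'outward' : leftmost read on '-', rightmost on '+' (reads point away)
    'same_strand' : both reads on the same strand
    """
    (_, left_strand), (_, right_strand) = sorted([(pos1, strand1), (pos2, strand2)])
    if left_strand == right_strand:
        return "same_strand"
    return "inward" if left_strand == "+" else "outward"


def outward_fraction(pairs):
    """Fraction of pairs classified as outward facing."""
    counts = Counter(classify_pair(*p) for p in pairs)
    total = sum(counts.values())
    return counts["outward"] / total if total else 0.0


if __name__ == "__main__":
    # Toy data: three outward-facing pairs about 3 kb apart, one short inward pair.
    toy_pairs = [
        (1000, "-", 4000, "+"),
        (5200, "-", 8100, "+"),
        (9000, "+", 6100, "-"),
        (2000, "+", 2300, "-"),
    ]
    print(outward_fraction(toy_pairs))
```

In a mate-pair library, a low outward fraction would suggest that many retained fragments did not cross the circularization junction, which is the situation the caption describes for subject 4.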
Figure 4. Coverage of Targeted Regions in CapBP. (A) Overview of coverage for the targeted capture experiment in each of the regions. For all regions, the percentage masked represents the percentage of bases annotated as repeat sequence in the Agilent SureSelect pipeline based on RepeatMasker; percentage masked aligned is the percentage of those repeat-masked sequences we were able to align uniquely, and percentage unmasked aligned is the percentage of bases not denoted as repeat masked that were uniquely covered by sequencing reads. (B) Representative coverage for one of the samples provided in the UCSC browser (subject 11, regions 1 and 2). See Figure S2 for complete details of all subjects.
Figure 5. Theoretical and Empirical Coverage of Genomic Regions. Analysis was performed to predict capture success in a given region. (A) Sequence composition across all targeted regions in the CapBP experiment. (B) Composition of all bases that could not be uniquely aligned, indicating that capture and unique alignment were most challenging for LINE and SINE elements. (C) The fraction of all captured bases is represented on the y axis for each type of repetitive element, and blue shading indicates the proportion of bases that were uniquely aligned for each type. (D) A theoretical prediction of capture performance across each chromosome based on uniquely aligning all possible 75mers with two or fewer errors. Blue bars indicate the proportion of unaligned bases that could be recovered by a paired-end strategy in which one of the two ends could be uniquely aligned, allowing unambiguous placement of the read pair. (E) Theoretical proportion of all bases in the genome that would not be covered by either unique alignment of paired-end 75 cycle sequenced bases or the insert between paired reads if large insert sequencing was performed with varying insert sizes.
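The idea behind panel (D), predicting which bases can be covered by uniquely placeable reads, can be sketched on a toy sequence. The example below is a deliberate simplification: it counts exact k-mer matches on one strand only, whereas the caption describes 75mers aligned with up to two errors, and it is not the authors' method. The function name is hypothetical.

```python
from collections import Counter


def uniquely_covered_fraction(seq, k):
    """Fraction of bases in seq covered by at least one k-mer that occurs
    exactly once in seq (exact matches, forward strand only)."""
    n = len(seq)
    if k <= 0 or n < k:
        return 0.0
    # Count every k-mer, then mark bases spanned by k-mers seen exactly once.
    counts = Counter(seq[i:i + k] for i in range(n - k + 1))
    covered = [False] * n
    for i in range(n - k + 1):
        if counts[seq[i:i + k]] == 1:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / n


if __name__ == "__main__":
    # Toy sequence with a repeated 4-mer at both ends.
    print(uniquely_covered_fraction("ACGTACGT", 4))
```

Longer k-mers or paired-end information generally make more of a repetitive sequence uniquely placeable, which is the trend panels (D) and (E) describe for the full genome.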
