Comparative Study
Genome Biol. 2002;3(12):RESEARCH0079.
doi: 10.1186/gb-2002-3-12-research0079. Epub 2002 Dec 23.

Finishing a whole-genome shotgun: release 3 of the Drosophila melanogaster euchromatic genome sequence

Susan E Celniker et al. Genome Biol. 2002.

Abstract

Background: The Drosophila melanogaster genome was the first metazoan genome to have been sequenced by the whole-genome shotgun (WGS) method. Two issues relating to this achievement were widely debated in the genomics community: how correct is the sequence with respect to base-pair (bp) accuracy and frequency of assembly errors? And, how difficult is it to bring a WGS sequence to the accepted standard for finished sequence? We are now in a position to answer these questions.

Results: Our finishing process was designed to close gaps, improve sequence quality and validate the assembly. Sequence traces derived from the WGS and draft sequencing of individual bacterial artificial chromosomes (BACs) were assembled into BAC-sized segments. These segments were brought to high quality, and then joined to constitute the sequence of each chromosome arm. Overall assembly was verified by comparison to a physical map of fingerprinted BAC clones. In the current version of the 116.9 Mb euchromatic genome, called Release 3, the six euchromatic chromosome arms are represented by 13 scaffolds with a total of 37 sequence gaps. We compared Release 3 to Release 2; in autosomal regions of unique sequence, the error rate of Release 2 was one in 20,000 bp.
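The figures quoted in the Results can be put on more familiar scales with a little arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: it converts the reported Release 2 error rate (one in 20,000 bp, which applies to autosomal regions of unique sequence) into a Phred-style quality value, and computes the average amount of Release 3 sequence per remaining gap and per scaffold. The function and variable names are hypothetical.

```python
import math


def phred_from_error_rate(p: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality, Q = -10 * log10(p)."""
    return -10.0 * math.log10(p)


# Values taken from the abstract.
release2_error_rate = 1 / 20_000   # errors per bp, autosomal unique sequence only
euchromatin_mb = 116.9             # size of the Release 3 euchromatic sequence, in Mb
sequence_gaps = 37                 # sequence gaps remaining in Release 3
scaffolds = 13                     # scaffolds representing the six chromosome arms

# Derived quantities (simple averages; real gaps and scaffolds are not evenly sized).
release2_phred = phred_from_error_rate(release2_error_rate)
mb_per_gap = euchromatin_mb / sequence_gaps
mb_per_scaffold = euchromatin_mb / scaffolds

print(f"Release 2 error rate as Phred quality: Q{release2_phred:.1f}")
print(f"Average sequence per remaining gap: {mb_per_gap:.2f} Mb")
print(f"Average scaffold size: {mb_per_scaffold:.2f} Mb")
```

Because the reported error rate covers only autosomal unique sequence, the Phred value should not be read as a genome-wide estimate; repeats and the X chromosome are outside the scope of that figure.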

Conclusions: The WGS strategy can efficiently produce a high-quality sequence of a metazoan genome while generating the reagents required for sequence finishing. However, the initial method of repeat assembly was flawed. The sequence we report here, Release 3, is a reliable resource for molecular genetic experimentation and computational analysis.

Figures

Figure 1
Status of the Drosophila melanogaster euchromatic genome. Each chromosome arm is represented by a black horizontal line with a circle indicating its centromere. For each arm, seven tiers of information (A-G) are presented. (A) Each vertical green line represents the position of a transposable element. (B) Each vertical blue line represents the position of a 'declared' gap in Release 2. (C) Each vertical red line represents the position of an 'undeclared' gap in Release 2 greater than 20 bp, detected by comparing the Release 2 and Release 3 sequences. (D) Each vertical black line represents the position of a sequence gap that remains in Release 3. (E) The horizontal bars depict the regions of the genome assigned to LBNL (blue) or the HGSC, Baylor College of Medicine (brown) for generating Release 3. (F) The gray horizontal bar represents the status of the physical maps that supplied the initial BAC tiling paths for sequencing; presence of the gray bar indicates an available BAC contig. The sources of these BAC maps were as follows: chromosome X [12,50], chromosome arms 2L, 2R, 3L, and 3R [11] and chromosome 4 [13]. The black triangles represent the seven physical map gaps remaining in the euchromatic portion of the genome in Release 3. (G) The purple bar represents the position of cosmid, P1 or BAC clones that had been completely sequenced prior to Release 2. Those at the telomere of chromosome X were sequenced by the EDGP [51]; the other clones were sequenced by the BDGP at LBNL [1]. The numbers to the left of rows A, B, C and D are the chromosome arm totals for each category plotted. The scale in million bases (Mb) is shown at the bottom of the figure.

References

    1. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195.
    2. Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, Kravitz SA, Mobarry CM, Reinert KH, Remington KA, et al. A whole-genome assembly of Drosophila. Science. 2000;287:2196–2203.
    3. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science. 2001;291:1304–1351.
    4. Mural RJ, Adams MD, Myers EW, Smith HO, Miklos GL, Wides R, Halpern A, Li PW, Sutton GG, Nadeau J, et al. A comparison of whole-genome shotgun-derived mouse chromosome 16 and the human genome. Science. 2002;296:1661–1671.
    5. Hoskins RA, Smith CD, Carlson J, Carvalho BA, Halpern A, Kennedy C, Kaminker JS, Mungall C, Sullivan BA, Sutton G, et al. Heterochromatic sequences in a Drosophila whole genome shotgun assembly. Genome Biol. 2002;3:research0085.1–0085.16.