Gigascience. 2017 Nov 1;6(11):1-7.
doi: 10.1093/gigascience/gix097.

The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum

Aleksey V Zimin et al. Gigascience. 2017.

Abstract

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components.

Keywords: PacBio sequencing; genome assembly; hybrid assembly; plant genomes; wheat genome.
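The N50 contig size quoted in the abstract is a length-weighted median: it is the contig length L such that contigs of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that calculation, not the authors' code. The function name `n50` and the toy contig lengths are assumptions made for illustration only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the paper): total = 54 bases.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 10 + 9 + 8 = 27 reaches half of 54, so this prints 8
```

Applied to the real assembly, the same calculation over all contig lengths would yield the reported 232 659 bases.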

Figures

Figure 1:
Illustration of the merging process for the Triticum 2.0 and FALCON Trit 1.0 assemblies. If 2 contigs A and B from the FALCON assembly overlapped a Triticum 2.0 contig by at least 5000 bp, then A and B were merged together using the Triticum 2.0 contig to fill the gap.
Figure 2:
K-mer uniqueness ratios for the wheat genome (Triticum aestivum) compared to the cow, fruit fly, rice, loblolly pine, and Ae. tauschii genomes. The plot shows the percentage of each genome that is covered (y-axis) by unique sequences of length k for various values of k (x-axis).
Figure 3:
Missing 31-mers in the different assemblies of Triticum aestivum. Using the Illumina read data from a previously published assembly of the same genome, we counted all 31-mers in the reads and then plotted how many of these 31-mers are missing from each assembly. The x-axis shows how often the k-mers occur in the reads. The y-axis shows how many distinct k-mers are missing from each assembly. The FALCON Trit 1.0 assembly had the most missing k-mers, while the MaSuRCA-driven Triticum 2.0 assembly had the fewest.
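The missing k-mer analysis described for Figure 3 can be sketched in a few lines. The code below is an illustrative simplification, not the pipeline used in the paper: it counts k-mers in memory, treats each strand literally (no reverse-complement canonicalization), and does not filter sequencing errors. The function names `count_kmers` and `missing_kmer_histogram` and the toy sequences are hypothetical.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every k-mer (substring of length k) across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def missing_kmer_histogram(reads, assembly, k):
    """Map read multiplicity -> number of distinct k-mers absent from the assembly.

    The keys correspond to the x-axis of Figure 3 (how often a k-mer occurs
    in the reads) and the values to the y-axis (how many distinct k-mers
    with that multiplicity are missing).
    """
    read_counts = count_kmers(reads, k)
    assembly_kmers = set(count_kmers(assembly, k))
    histogram = Counter()
    for kmer, multiplicity in read_counts.items():
        if kmer not in assembly_kmers:
            histogram[multiplicity] += 1
    return dict(histogram)


# Hypothetical toy data with k = 3 (the paper uses k = 31).
reads = ["ACGTAC", "CGTACG", "TTTTTT"]
assembly = ["ACGTACG"]
print(missing_kmer_histogram(reads, assembly, 3))  # {4: 1}: only TTT is missing
```

For a genome of this size, an in-memory `Counter` would not be practical; a dedicated k-mer counter would be needed, but the comparison logic stays the same.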
