G3 (Bethesda). 2019 Aug 8;9(8):2377-2393.
doi: 10.1534/g3.119.400215.

De Novo Genome Sequence Assembly of Dwarf Coconut (Cocos nucifera L. 'Catigan Green Dwarf') Provides Insights into Genomic Variation Between Coconut Types and Related Palm Species

Darlon V Lantican et al. G3 (Bethesda).

Abstract

We report the first whole genome sequence (WGS) assembly and annotation of a dwarf coconut variety, 'Catigan Green Dwarf' (CATD). The genome sequence was generated using the PacBio SMRT sequencing platform at 15X coverage of the expected genome size of 2.15 Gbp, and was corrected with assembled 50X Illumina paired-end MiSeq reads of the same genome. The draft genome was improved through Chicago sequencing to generate a scaffold assembly with a total size of 2.1 Gbp, consisting of 7,998 scaffolds with an N50 of 570,487 bp. The final assembly covers around 97.6% of the estimated genome size of coconut 'CATD' based on homozygous k-mer peak analysis. A total of 34,958 high-confidence gene models were predicted and functionally associated with various economically important traits, such as pest/disease resistance, drought tolerance, coconut oil biosynthesis, and putative transcription factors. The assembled genome was used to infer the evolutionary relationship within the palm family based on genomic variations and synteny of coding gene sequences. Data show that at least three rounds of whole genome duplication occurred and are commonly shared by these members of the Arecaceae family. A total of 7,139 unique SSR markers were designed to be used as a resource in marker-based breeding. In addition, we discovered 58,503 variants in coconut by aligning the Hainan Tall (HAT) WGS reads to the non-repetitive regions of the assembled CATD genome. The gene markers and genome-wide SSR markers established here will facilitate the development of varieties with resilience to climate change, resistance to pests and diseases, and improved oil yield and quality.

Keywords: Cocos nucifera L.; Dovetail Chicago sequencing; Illumina MiSeq sequencing; PacBio SMRT sequencing; SSR and SNP markers; dwarf coconut; genome assembly; hybrid assembly.
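
The abstract summarizes assembly contiguity with a scaffold N50 of 570,487 bp. As a reminder of what that statistic measures, here is a minimal sketch of the usual N50 definition. The function name `n50` and the toy scaffold lengths are illustrative assumptions; they are not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L together
    account for at least half of the total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths in bp (not taken from the CATD assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # prints 80: 100 + 80 = 180 >= 150 (half of 300)
```

Applied to the real assembly, the same calculation over all 7,998 scaffold lengths would be expected to give the reported N50, but those lengths are not listed here.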

Figures

Figure 1
Insertion time distributions of intact LTRs in the ‘CATD’ coconut genome, estimated using the Jukes-Cantor model (Jukes and Cantor 1969) for noncoding sequences and a mutation rate of 1.3 × 10⁻⁸ mutations per site per year (Ma and Bennetzen 2004).
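
The caption for Figure 1 names a substitution model and a mutation rate but not the arithmetic. A common way to date an intact LTR retrotransposon, assumed here rather than confirmed by the caption, is to compute the Jukes-Cantor distance K between its two LTRs and divide by twice the mutation rate, T = K / (2r). The sketch below follows that assumption; the function names and the example divergence value are hypothetical.

```python
import math

# Mutation rate quoted in the Figure 1 caption (per site per year).
MUTATION_RATE = 1.3e-8


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor model")
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time_years(p, rate=MUTATION_RATE):
    """Estimated insertion time, assuming T = K / (2 * rate).

    The factor of 2 reflects that both LTRs of one element diverge
    independently after insertion.
    """
    return jukes_cantor_distance(p) / (2 * rate)


# Hypothetical example: 2.6% of sites differ between the two LTRs.
print(f"{insertion_time_years(0.026):,.0f} years")
```

Under these assumptions, an LTR pair differing at 2.6% of sites dates to roughly one million years.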
Figure 2
Syntenic dotplots between dwarf coconut var. Catigan Green Dwarf (CATD) and (a) tall coconut var. Hainan Tall, (b) date palm (P. dactylifera), and (c) oil palm (E. guineensis). Dotplot axes are in nucleotides and are drawn with a square axis relationship. The scaffolds on the y-axis of both (a) and (b) are arranged in the same manner, by order of scaffold number. Scaffolds on the y-axis of (c) are sorted based on the Syntenic Path Assembly (SPA) using oil palm pseudomolecules as reference. The figures were generated using the Legacy Version of the CoGe SynMap tool (Lyons et al. 2008).
Figure 3
Histogram depicting the synonymous rate change of syntenic gene pairs between dwarf coconut and other closely related sequenced genomes. The syntenic gene pairs were identified by DAGChainer and colored based on their synonymous substitution rate as calculated by CodeML of the CoGe SynMap tool (Lyons et al. 2008). Syntenic regions derived from speciation (orthologs) and from shared whole genome duplication events (α, β and γ) are also labeled.
Figure 4
Maximum likelihood phylogenetic tree generated using IQ-TREE from the sequence alignment of all the predicted RGAs characterized in the ‘CATD’ genome assembly. The JTT amino acid substitution model (Jones et al. 1992) with empirical codon frequencies (+F) and FreeRate (+R9) rate heterogeneity across sites (Yang 1995; Soubrier et al. 2012) was used to generate the tree, which was validated with 1000 replicates of ultrafast bootstrapping (Hoang et al. 2017) and SH-aLRT (Guindon et al. 2010) tests. Branches are colored red for TM-CC, blue for NBS-containing, and green for TX and TN resistance gene analogs.
Figure 5
Occurrence of sequence variations in the non-repeat region of coconut based on alignment of ‘HAT’ WGS reads to the assembled ‘CATD’ genome. (a) Distribution of the types of coconut SNPs (transversions and transitions) detected; (b) frequency of occurrence of each SNP and bp length of InDels identified in coconut. Negative values signify deletions and positive values signify insertions relative to the sequence of the assembled ‘CATD’ genome.
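
The two quantities plotted in Figure 5 can be illustrated with a short sketch: classifying a SNP as a transition or a transversion, and expressing an InDel as a signed length relative to the reference. The function names and example alleles below are hypothetical, and this is not the variant-calling workflow used by the authors.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be different bases from A, C, G, T")
    # Purine <-> purine or pyrimidine <-> pyrimidine is a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def indel_length(ref, alt):
    """Signed InDel length relative to the reference allele.

    Negative values are deletions and positive values are insertions,
    matching the sign convention stated in the Figure 5 caption.
    """
    return len(alt) - len(ref)


print(classify_snp("A", "G"))    # transition
print(classify_snp("A", "C"))    # transversion
print(indel_length("ATG", "A"))  # -2 (two-base deletion)
```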
