PLoS One. 2011;6(11):e26980.
doi: 10.1371/journal.pone.0026980. Epub 2011 Nov 8.

Genome-wide functional analysis of the cotton transcriptome by creating an integrated EST database

Fuliang Xie et al. PLoS One. 2011.

Abstract

A total of 28,432 unique contigs (25,371 in consensus contigs and 3,061 as singletons) were assembled from all 268,786 cotton ESTs currently available. Several in silico approaches [comparative genomics, BLAST, Gene Ontology (GO) analysis, and pathway enrichment by Kyoto Encyclopedia of Genes and Genomes (KEGG)] were employed to investigate global functions of the cotton transcriptome. Cotton EST contigs were clustered into 5,461 groups with a maximum cluster size of 196 members. A total of 27,956 indel mutants and 149,616 single nucleotide polymorphisms (SNPs) were identified from consensus contigs. Interestingly, many contigs with significantly high frequencies of indels or SNPs encode transcription factors and protein kinases. In a comparison with six model plant species, cotton ESTs show the highest overall similarity to grape. A total of 87 cotton miRNAs were identified; 59 of these have not been reported previously from experimental or bioinformatics investigations. We also predicted 3,260 genes as miRNA targets, which are associated with multiple biological functions, including stress response, metabolism, hormone signal transduction and fiber development. We identified 151 and 4,214 EST-simple sequence repeats (SSRs) from contigs and raw ESTs, respectively. To make these data widely available, and to facilitate access to EST-related genetic information, we integrated our results into a comprehensive, fully downloadable web-based cotton EST database (www.leonxie.com).
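
The abstract reports EST-SSR counts but does not state how the repeats were detected. As an illustration only, the sketch below shows one common way to scan a sequence for perfect simple sequence repeats with a regular expression. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions made for this example, not the criteria or tools used by the authors.

```python
import re

# Assumed thresholds (not from the paper): motif length -> minimum number of copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA", "ATAT"),
            # so each repeat is reported only under its shortest motif.
            is_compound = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_compound:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: (AT)x7 at offset 3 and (GAA)x5 at offset 20.
    example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TT"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Counts obtained this way depend heavily on the chosen thresholds and on whether imperfect or compound repeats are included, so they would not be expected to reproduce the figures in the abstract without the authors' exact settings.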

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Sequence size distribution of consensus contigs and singletons in cotton.
Figure 2. Schematic pipeline for cotton EST assembly, data analysis and database development.
Figure 3. Gene Ontology (GO) analysis of 28,432 cotton annotated contigs. The three GO categories are presented: cellular component (A), biological process (B), and molecular function (C).
Figure 4. Cluster size distribution of cotton contigs.
Figure 5. Homologous genomic comparison using several BLAST E-value cutoffs. A. Distribution of percent cotton contigs finding a hit in each genome. B. Distribution of cotton homologous proteins identified in other plant species. C. Comparison of number of homologs identified between cotton and Vitis vinifera with a BLASTx E-value cutoff of 1e-30. D. The same comparison between cotton and Arabidopsis thaliana.
Figure 6. A. Distribution of length of miRNAs in cotton. B. Size distribution of cotton miRNA families with more than one member.
Figure 7. Interface of cotton EST database for querying raw ESTs (A), and assembled contigs (B).
