BMC Genomics. 2015 Oct 13:16:781.
doi: 10.1186/s12864-015-2031-1.

Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants

Lua Lopez et al. BMC Genomics.

Abstract

Background: Simple Sequence Repeats (SSRs) are widely used in population genetic studies, but their classical development is costly and time-consuming. The ever-growing DNA datasets generated by high-throughput techniques offer an inexpensive alternative for SSR discovery. Expressed Sequence Tags (ESTs) have been widely used as a source of SSRs for plants of economic relevance, but their application to non-model species is still modest.

Methods: Here, we explored the use of publicly available ESTs (GenBank at the National Center for Biotechnology Information, NCBI) for SSR development in non-model plants, focusing on genera listed by the International Union for Conservation of Nature (IUCN). We also searched two model genera with fully annotated genomes, Arabidopsis and Oryza, for EST-SSRs and used them as controls for genome distribution analyses. Overall, we downloaded 16 031 555 sequences for 258 plant genera, which were mined for SSRs and their primers with the help of QDD1. Genome distribution analyses in Oryza and Arabidopsis were done by blasting the SSR-containing sequences against the Oryza sativa and Arabidopsis thaliana reference genomes using the Basic Local Alignment Search Tool (BLAST) on the NCBI website. Finally, we performed an empirical test to determine the performance of our EST-SSRs in a few individuals from four species of two eudicot genera, Trifolium and Centaurea.
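To illustrate what mining a sequence for SSRs involves, the following is a minimal Python sketch that scans a DNA string for perfect tandem repeats of 2 to 6 bp. It is not the QDD1 algorithm, which also handles redundancy filtering and primer design; the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration and are not taken from the study.

```python
import re

# Hypothetical thresholds (not from the study): minimum number of
# tandem copies required for each motif length in base pairs.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AAA" or "ATAT"), so each SSR is reported once.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Each hit records the 0-based start position, the repeat motif, and the number of tandem copies. A real pipeline would additionally need flanking sequence of sufficient length on both sides of each repeat in order to design primers.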

Results: We explored a total of 14 498 726 EST sequences from the dbEST database (NCBI) in 257 plant genera from the IUCN Red List. We identified a very large number (17 102) of ready-to-test EST-SSRs in most plant genera (193) at no cost. Overall, dinucleotide and trinucleotide repeats were the prevalent types, but the abundance of the various repeat types differed between taxonomic groups. Control genomes revealed that trinucleotide repeats were mostly located in coding regions, while dinucleotide repeats were largely associated with untranslated regions. Our results from the empirical test revealed considerable amplification success and transferability between congenerics.

Conclusions: The present work represents the first large-scale study developing SSRs from publicly accessible EST databases in threatened plants. Here we provide a very large number of ready-to-test EST-SSRs (17 102) for 193 genera. The cross-species transferability suggests that the number of possible target species would be large. Since trinucleotide repeats are abundant and mainly linked to exons, they might be useful in evolutionary and conservation studies. Altogether, our study strongly supports the use of EST databases as an extremely affordable and fast alternative for SSR development in threatened plants.

Figures

Fig. 1
Flowchart of the bioinformatics analysis used for developing the EST-SSRs. The results of the SSR mining in the control genera Oryza and Arabidopsis are indicated in green on the left side of the figure, while the results of the SSR mining in the IUCN plant genera are shown in blue on the right side (note that Oryza was used as a control genus and was also included in the IUCN analyses). The steps followed for the analysis are highlighted in bold letters in the center.
Fig. 2
Distribution of di- and trinucleotide repeats obtained using QDD1 software from Oryza and Arabidopsis EST sequences that had a positive hit in the Oryza sativa (japonica cultivar-group) and Arabidopsis thaliana reference genome databases with BLASTn (NCBI). Oryza is represented in black while Arabidopsis is displayed in grey. The different motif types are detailed on the X axis, while the number of SSRs for each class is shown on the Y axis.
Fig. 3
Distribution of EST-SSRs in 193 plant genera that include species listed as threatened by the IUCN. Bars on the X axis represent each taxonomic group investigated and the whole dataset. The Y axis represents the percentage of EST-SSRs found within each group. Colors in each bar indicate the type of repeat: dinucleotide repeats in light green, trinucleotide repeats in light blue, tetranucleotide repeats in yellow, pentanucleotide repeats in dark green and hexanucleotide repeats in dark blue.

