Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties
- PMID: 18687880
- PMCID: PMC2556271
- DOI: 10.1101/gr.078303.108
Abstract
Microsatellites are abundant in vertebrate genomes, but their sequence representation and length distributions vary greatly within each family of repeats (e.g., tetranucleotides). Biophysical studies of 82 synthetic single-stranded oligonucleotides comprising all tetra- and trinucleotide repeats revealed an inverse correlation between the stability of folded-back hairpin and quadruplex structures and the sequence representation for repeats ≥30 bp in length in nine vertebrate genomes. Alternatively, the predicted energies of base-stacking interactions correlated directly with the longest length distributions in vertebrate genomes. Genome-wide analyses indicated that unstable sequences, such as CAG:CTG and CCG:CGG, were over-represented in coding regions and that micro/minisatellites were recruited in genes involved in transcription and signaling pathways, particularly in the nervous system. Microsatellite instability (MSI) is a hallmark of cancer, and length polymorphism within genes can confer susceptibility to inherited disease. Sequences that manifest the highest MSI values also displayed the strongest base-stacking interactions; analyses of 62 tri- and tetranucleotide repeat-containing genes associated with human genetic disease revealed enrichments similar to those noted for micro/minisatellite-containing genes. We conclude that DNA structure and base-stacking determined the number and length distributions of microsatellite repeats in vertebrate genomes over evolutionary time and that micro/minisatellites have been recruited to participate in both gene and protein function.
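To make the "repeats ≥30 bp in length" criterion concrete, the following is a minimal Python sketch of how tri- and tetranucleotide tracts of that size might be located in a sequence. It is an illustration only, not the authors' pipeline, which the abstract does not describe. The function name `find_repeat_tracts`, its parameters, and the restriction to perfect (uninterrupted) repeats are all assumptions made for this example.

```python
import re


def find_repeat_tracts(seq, unit_len, min_bp=30):
    """Return (start, end, motif) for perfect tandem repeats.

    Hypothetical helper: reports tracts built from a unit_len-bp motif
    whose whole copies span at least min_bp. Assumes perfect repeats
    only; interrupted or degenerate tracts are not detected.
    """
    seq = seq.upper()
    # Smallest number of whole motif copies that reaches min_bp.
    min_copies = -(-min_bp // unit_len)
    # One captured motif followed by at least (min_copies - 1) exact copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    tracts = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip motifs that are repeats of a shorter unit (e.g. ACAC = AC x 2),
        # so dinucleotide runs are not counted as tetranucleotide repeats.
        if motif in (motif + motif)[1:-1]:
            continue
        tracts.append((match.start(), match.end(), motif))
    return tracts


# Toy sequence: a 30-bp CAG tract and a 32-bp AGAT tract.
example = "TTA" + "CAG" * 10 + "CC" + "AGAT" * 8 + "GC"
trinucleotide_tracts = find_repeat_tracts(example, unit_len=3)
tetranucleotide_tracts = find_repeat_tracts(example, unit_len=4)
```

Coordinates are 0-based and end-exclusive. Because only whole motif copies are counted, a tetranucleotide tract must reach 32 bp (eight copies) to satisfy the 30-bp threshold in this sketch.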