Genome Res. 2008 Oct;18(10):1545-53.
doi: 10.1101/gr.078303.108. Epub 2008 Aug 7.

Abundance and length of simple repeats in vertebrate genomes are determined by their structural properties

Albino Bacolla et al. Genome Res. 2008 Oct.

Abstract

Microsatellites are abundant in vertebrate genomes, but their sequence representation and length distributions vary greatly within each family of repeats (e.g., tetranucleotides). Biophysical studies of 82 synthetic single-stranded oligonucleotides comprising all tetra- and trinucleotide repeats revealed an inverse correlation between the stability of folded-back hairpin and quadruplex structures and the sequence representation for repeats ≥30 bp in length in nine vertebrate genomes. Alternatively, the predicted energies of base-stacking interactions correlated directly with the longest length distributions in vertebrate genomes. Genome-wide analyses indicated that unstable sequences, such as CAG:CTG and CCG:CGG, were over-represented in coding regions and that micro/minisatellites were recruited in genes involved in transcription and signaling pathways, particularly in the nervous system. Microsatellite instability (MSI) is a hallmark of cancer, and length polymorphism within genes can confer susceptibility to inherited disease. Sequences that manifest the highest MSI values also displayed the strongest base-stacking interactions; analyses of 62 tri- and tetranucleotide repeat-containing genes associated with human genetic disease revealed enrichments similar to those noted for micro/minisatellite-containing genes. We conclude that DNA structure and base-stacking determined the number and length distributions of microsatellite repeats in vertebrate genomes over evolutionary time and that micro/minisatellites have been recruited to participate in both gene and protein function.

Figures

Figure 1.
Correlation between Ta and tetraNR abundance in nine vertebrate genomes. Each of the 33 symbols represents one of the unique tetraNR sequences listed in Table 1 (column 2). For each sequence, the Ta (x-axis) is given by the highest value found for either the forward (column 4) or the reverse (column 7) oligonucleotide, whereas the relative abundance (y-axis) is the log of the mean fraction found for each tetraNR sequence relative to the total number of tetraNRs ≥8 units in the nine vertebrate genomes listed in Methods. The (solid line) regression was calculated only for those (solid circles) tetraNR sequences for which either the forward or reverse oligonucleotides displayed a Ta value within the measurable temperature range (10°C to 94°C). (Vertical lines) Standard errors; (blue) 99% confidence interval; (open circles) tetraNRs devoid of temperature-dependent structural transitions; (filled square) CGGG and (filled triangle) TGGG are tetraNRs with Ta values >94°C; (open square) CCGG with a Ta value of >94°C but not present in any genome.
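
The quantities plotted in Figure 1 can be sketched in code. This is a minimal Python illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, the base-10 logarithm and ordinary least squares are assumed (the caption says only "log" and "regression"), and no data from the paper are reproduced.

```python
import math

def summarize_repeat(forward_ta, reverse_ta, fractions):
    """Return (Ta, relative abundance) for one repeat sequence.

    Ta is the higher of the forward and reverse oligonucleotide values;
    None marks an oligonucleotide with no measurable transition.
    Relative abundance is the log of the mean per-genome fraction
    (base 10 is an assumption; the caption does not name the base).
    """
    measured = [t for t in (forward_ta, reverse_ta) if t is not None]
    ta = max(measured) if measured else None
    mean_fraction = sum(fractions) / len(fractions)
    return ta, math.log10(mean_fraction)

def fit_line(points, t_min=10.0, t_max=94.0):
    """Least-squares line through points whose Ta lies in [t_min, t_max].

    Points with no Ta, or with a Ta outside the measurable range,
    are left out of the fit, as the caption describes.
    """
    kept = [(x, y) for x, y in points if x is not None and t_min <= x <= t_max]
    n = len(kept)
    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A negative slope from `fit_line` would correspond to the inverse relationship between structural stability and abundance that the abstract reports.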
Figure 2.
Model for a relationship among repeat abundance, DNA structure, repeat polymorphism, and variable phenotype. (Orange) triNRs or tetraNRs. Note that overall repeat lengths are not drawn to scale and that the choice of decreased gene expression with increasing repeat length is arbitrary.
