Sci Adv. 2015 Feb 12;1(1):e1400234.
doi: 10.1126/sciadv.1400234.

A unique chromatin complex occupies young α-satellite arrays of human centromeres

Jorja G Henikoff et al. Sci Adv. 2015.

Abstract

The intractability of homogeneous α-satellite arrays has impeded understanding of human centromeres. Artificial centromeres are produced from higher-order repeats (HORs) present at centromere edges, although the exact sequences and chromatin conformations of centromere cores remain unknown. We use high-resolution chromatin immunoprecipitation (ChIP) of centromere components followed by clustering of sequence data as an unbiased approach to identify functional centromere sequences. We find that specific dimeric α-satellite units shared by multiple individuals dominate functional human centromeres. We identify two recently homogenized α-satellite dimers that are occupied by precisely positioned CENP-A (cenH3) nucleosomes with two ~100-base pair (bp) DNA wraps in tandem separated by a CENP-B/CENP-C-containing linker, whereas pericentromeric HORs show diffuse positioning. Precise positioning is largely maintained, whereas abundance decreases exponentially with divergence, which suggests that young α-satellite dimers with paired ~100-bp particles mediate evolution of functional human centromeres. Our unbiased strategy for identifying functional centromeric sequences should be generally applicable to tandem repeat arrays that dominate the centromeres of most eukaryotes.

Figures

Fig. 1. CENP-A and CENP-C enrichment decreases with α-satellite divergence in pericentric heterochromatin.
Log-ratio CENP-A, CENP-C, and H3 enrichment profiles spanning the 40-kb most proximal annotated segment of chromosome arm Xp, which spans the DXZ1 α-satellite HOR gradient (3). Dense CENP-A and CENP-C enrichment diminishes with distance from the centromere-proximal edge, and depletion of H3 diminishes ~20 kb from the edge. Diverged α-satellite occupies the Xp arm punctuated by LINE-1 and other elements where centromere protein enrichment is low.
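A log-ratio enrichment profile of the kind described in this legend can be sketched as follows. This is only an illustration, not the authors' pipeline: the function name `log_ratio_profile`, the natural-log base, the normalization to total counts, and the pseudocount of 1 are all assumptions, since the legend does not state them.

```python
import math

def log_ratio_profile(chip_counts, input_counts, pseudocount=1.0):
    """Per-bin natural-log ratio of ChIP to input.

    Each track is first converted to fractions of its own total, so that
    differences in sequencing depth do not affect the ratio. The pseudocount
    (an assumption) keeps empty bins from producing log(0) or division by zero.
    """
    if len(chip_counts) != len(input_counts):
        raise ValueError("profiles must have the same number of bins")
    n_bins = len(chip_counts)
    chip_total = sum(chip_counts) + pseudocount * n_bins
    input_total = sum(input_counts) + pseudocount * n_bins
    profile = []
    for chip, inp in zip(chip_counts, input_counts):
        chip_frac = (chip + pseudocount) / chip_total
        input_frac = (inp + pseudocount) / input_total
        profile.append(math.log(chip_frac / input_frac))
    return profile
```

With this convention, positive values mark bins where the ChIP signal exceeds input (enrichment) and negative values mark depletion, as described for H3 near the centromere-proximal edge.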
Fig. 2. Variable CENP-A, CENP-C, and H3 occupancies at annotated α-satellite arrays.
Occupancy profiles for the most centromere-proximal 5-kb regions of eight HORs and monomeric α-satellite arrays present on BAC clones that have been tested for artificial centromere function (4), and for four selected HORs from the hg38 genomic assembly (2). The DXZ1 profile represents an enlargement of the rightmost 5 kb of Xp shown in Fig. 1. HORs are classified on the basis of localization by FISH (centromeric) (10, 35) or by an artificial chromosome assay (competent or inactive) (4). Within each segment, normalized count occupancies were scaled to the maximum occupancy of CENP-A ChIP using the IGV Genome Browser (46). The number in parentheses indicates the fold enrichment of the maximum relative to that of the D19Z1 HOR, which is set at 1, such that the maximum (CENP-A) peak in the D5Z2 HOR is 1376-fold higher than the maximum (H3) peak in the D19Z1 HOR, and the maximum (CENP-A) peak in the D11Z1 HOR is 93.1-fold higher than that in the D19Z1 HOR. Significant BLAST matches to the 17-bp CENP-B box consensus sequence (CTTCGTTGGAAACGGAA) are indicated (magenta lines).
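As a rough illustration of locating CENP-B boxes, the sketch below scans both strands of a sequence for the 17-bp consensus quoted in the legend. It substitutes a simple mismatch-tolerant sliding window for the BLAST search the authors used; the function names and the `max_mismatches` threshold of 2 are assumptions chosen for the example, not values from the paper.

```python
CENP_B_BOX = "CTTCGTTGGAAACGGAA"  # 17-bp consensus quoted in the Fig. 2 legend

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_box_matches(sequence, motif=CENP_B_BOX, max_mismatches=2):
    """Scan both strands for windows within `max_mismatches` of the motif.

    Returns a sorted list of (start, strand, mismatches) tuples, where
    `start` is the 0-based position on the forward strand.
    """
    sequence = sequence.upper()
    hits = []
    for strand, query in (("+", motif), ("-", reverse_complement(motif))):
        for start in range(len(sequence) - len(query) + 1):
            window = sequence[start:start + len(query)]
            mismatches = sum(1 for a, b in zip(window, query) if a != b)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return sorted(hits)
```

Unlike BLAST, this scan does not allow indels and does not report significance, so it should be read as a minimal stand-in for the idea of marking box positions along an array.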
Fig. 3. Centromere proteins from multiple human individuals occupy the same subsets of α-satellite units.
(A) Clustering strategies for identifying the most abundant CENP-A ChIP-enriched sequences. (B) Phylogenetic tree representing the 20 ChIP and input reference sequences that were most abundantly enriched for CENP-A ChIP. Bootstrap percentages are shown for the earliest divergences, defining four branches on the basis of a 70% bootstrap threshold. The same four branches were obtained using only ChIP or only input reference sequences in the alignment. (C) Phylogeny representing the 10 most abundant CENP-A ChIP reference sequences from each of five individuals.
Fig. 4. Young α-satellite dimers are the basic units of expansion and homogenization.
(A) Phylogenetic tree of the 20 most abundantly CENP-A-enriched input sequences, numbered by decreasing abundance and color-coded by clade. (B) Top: MegaBLAST alignments of 11 reference sequences to GenBank NW_001835979.1, where gray horizontal bars represent 100% identity and vertical red lines represent mismatches. Bottom: Same as top except for one of 11 HOR units of NT_167220.1. Numbers on the left are color-coded to correspond to clades in (A). (C) Overlaps of Cen-like and annotated α-satellites for CENP-A ChIP merged pairs.
Fig. 5. Long tandem repeats of the Cen1-like consensus are detected in PacBio single sequence reads.
(A) Maps of BLASTN hits (boxes, where gray horizontal bars represent 100% identity, vertical red lines represent mismatches, and vertical black lines represent indels) in raw PacBio reads. Displayed are the 10 PacBio single sequence reads (indicated by their sequence read identifier) with the highest bit scores in a MegaBLAST search of SRR1304331 using the Cen1-like 340-bp query. Alternating hits are shown in two tiers for visual clarity. We attribute gaps in the array to the ~15% mostly indel error rate characteristic of PacBio raw data, an interpretation that is supported by the near-perfect alignment of BLAST hits to the 340-bp tiling shown as tandem black diamonds at bottom. (B) A consensus sequence was derived for each of the raw sequences indicated in (A) by automated alignment of the tandem BLAST hits, and a dendrogram was produced, rooting the tree with the Cen1-like consensus. (C) Alignment of the Cen1-like consensus (top sequence) identifies 44 ambiguous residues (indicated as “u” or “s”) and six indels (indicated as dashes) in the overall PacBio-derived consensus (bottom sequence) over the 340-bp sequence.
Fig. 6. Two 100-bp CENP-A nucleosomes are precisely positioned over young, but not old, α-satellite units.
(A) Normalized count profiles of CENP-A and CENP-C ChIP occupancies mapped to the 340-bp Cen1-like consensus. (B) Same as (A) except mapped to the most abundantly enriched 340-bp noncentromeric α-satellite dimer derived from a centromere-competent chromosome 11 HOR (Fig. 2). (C) Same as (A) except for a Cen13-like dimer. (D) Same as (A) except for a Y-chromosome dimer, which lacks a CENP-B box.
Fig. 7. Two distinct chromatin complexes occupy specific α-satellite arrays of human centromeres.
(A) Sequence divergence of selected dimeric units relative to the Cen1-like consensus dimers. (B) ChIP occupancy profiles for a composite 38-mer with dimers rank-ordered by divergence (green dots with indels indicated as triangles). (C) Same as (A) except for Cen13-like dimers. (D) Same as (C) except for a 16-mer Cen13-like composite sequence.
Fig. 8. Young α-satellite dimers precisely position ~100-bp CENP-A nucleosomes.
(A to C) Size distributions of fragments mapping to the Cen1-like (A) and Cen13-like (B) composites and the most proximal 6-kb region of DXZ1 (C). Graphs on the right are expansions of graphs on the left (indicated by brackets). The y-axis scale is for input normalized counts, and the areas under the other curves were equalized to that for input.
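The area equalization mentioned in this legend can be illustrated with a short sketch. It assumes each curve is a list of counts over equal-width fragment-length bins, so that area reduces to a plain sum; the function name `equalize_area` is hypothetical and the authors' actual procedure may differ.

```python
def equalize_area(curve, reference):
    """Rescale `curve` so that its total area matches that of `reference`.

    Both arguments are lists of counts over equal-width bins (an assumption),
    so the area under each curve is simply the sum of its values.
    """
    curve_area = sum(curve)
    if curve_area == 0:
        raise ValueError("curve has zero area and cannot be rescaled")
    factor = sum(reference) / curve_area
    return [value * factor for value in curve]
```

Rescaling in this way leaves the shape of each ChIP size distribution unchanged while placing it on the same y-axis scale as the input curve.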
Fig. 9. Satellite DNA evolution by mutation and unequal crossing over [based on (6) and (47)].
In this toy example, a three-unit tandem array undergoes an out-of-register pairing event and unequal crossing over to produce a four-unit duplication and a two-unit deletion. Because the blue mutation is close to the left edge of the array, crossing-over events are most likely to occur to its right, and it will be inherited in both the duplication and deletion daughter chromosomes, whereas the red mutation is near the middle, and so it will be duplicated and deleted with similar expected frequencies. Further unequal crossing-over events within the four-unit array will result in expansion and contraction of the array, with corresponding gains and losses of the red mutation, leading to homogenization, but without consequence for the blue mutation. Other mutations that arise near the middle of the array will undergo homogenization like the red mutation, and those that arise near the edge will accumulate without gain or loss like the blue mutation. Over evolutionary time, the edges of the array will diverge, and longer-period out-of-register pairing and crossing-over events will result in HORs encompassing multiple tandem repeat units that are diverged from one another (3). Successive mutations and homogenization events in the middle of the array will result in divergence of homogeneous satellite sequences from the ancestral repeat unit.
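The toy example in this legend can be reproduced with a small sketch. It treats the array as a list of unit labels and, as a simplifying assumption, places the crossover at a unit boundary; the function name `unequal_crossover` and its `shift` and `breakpoint` parameters are hypothetical and are not taken from the paper.

```python
def unequal_crossover(array, shift, breakpoint):
    """Return (duplication, deletion) products of one unequal crossover.

    Two identical copies of `array` pair out of register by `shift` units,
    and a crossover occurs at the boundary before unit `breakpoint` of the
    first copy. One product gains `shift` units and the other loses them.
    """
    n = len(array)
    if not 1 <= shift < n:
        raise ValueError("shift must be at least 1 and less than the array length")
    if not shift <= breakpoint <= n:
        raise ValueError("breakpoint must lie within the paired region")
    duplication = array[:breakpoint] + array[breakpoint - shift:]
    deletion = array[:breakpoint - shift] + array[breakpoint:]
    return duplication, deletion


# Three-unit array: "blue" mutation in the edge unit, "red" in the middle unit.
array = ["blue", "red", "plain"]
duplication, deletion = unequal_crossover(array, shift=1, breakpoint=2)
```

Here `duplication` has four units with two copies of the red-marked unit and `deletion` has two units with none, while the blue-marked edge unit appears exactly once in both products, matching the behavior described for the red and blue mutations.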

References

    1. Alexandrov I., Kazakov A., Tumeneva I., Shepelev V., Yurov Y., Alpha-satellite DNA of primates: Old and new families. Chromosoma 110, 253–266 (2001).
    2. Benson D. A., Clark K., Karsch-Mizrachi I., Lipman D. J., Ostell J., Sayers E. W., GenBank. Nucleic Acids Res. 42, D32–D37 (2014).
    3. Schueler M. G., Higgins A. W., Rudd M. K., Gustashaw K., Willard H. F., Genomic and genetic definition of a functional human centromere. Science 294, 109–115 (2001).
    4. Hayden K. E., Strome E. D., Merrett S. L., Lee H.-R., Rudd M. K., Willard H. F., Sequences associated with centromere competency in the human genome. Mol. Cell. Biol. 33, 763–772 (2013).
    5. Melters D. P., Bradnam K. R., Young H. A., Telis N., May M. R., Ruby J. G., Sebra R., Peluso P., Eid J., Rank D., Garcia J. F., DeRisi J. L., Smith T., Tobias C., Ross-Ibarra J., Korf I., Chan S. W. L., Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution. Genome Biol. 14, R10 (2013).