BMC Genomics. 2016 Nov 8;17(1):893.
doi: 10.1186/s12864-016-3221-1.

Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location


Agnieszka Zmienko et al. BMC Genomics.

Abstract

Background: Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted.

Results: We analyzed the natural CNVs in the region overlapping the MSH2 gene, which encodes the DNA mismatch repair protein, and the AT3G18530 and AT3G18535 genes, which encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR, we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2-14 times) in 20 accessions and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12-14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region and determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99%-identical sequences flanking these genes. The widespread geographical distribution of the deletions, supported by the SNP and linkage disequilibrium analyses of the genomic sequence, confirmed the recurrent nature of this CNV.

Conclusions: For the first time, we characterized in detail a complex multiallelic CNV in the Arabidopsis genome. The region encoding the MSH2, AT3G18530 and AT3G18535 genes shows enormous variation in copy number among natural ecotypes and is a remarkable example of high Arabidopsis genome plasticity. We provided molecular insight into the mechanism underlying the recurrent nature of the AT3G18530-AT3G18535 duplications/deletions. We also performed the first direct comparison of the two leading experimental methods suitable for assessing DNA copy number status. Our comprehensive case study provides foundational information for further analyses of CNV evolution in Arabidopsis and other plants, and of their possible use in plant breeding.

Keywords: Arabidopsis thaliana; Copy number variation (CNV); Droplet digital PCR; Genotyping; Multiallelic CNV; Multiplex ligation-dependent probe amplification (MLPA); Non-allelic homologous recombination (NAHR); Recurrent deletion.
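The abstract reports absolute gene copy numbers obtained by droplet digital PCR. As a rough illustration of how such a number is commonly derived (a minimal sketch of the standard Poisson correction, not the authors' pipeline), the fraction of positive droplets is converted to a mean number of template copies per droplet, and the target is scaled against a reference locus. The droplet counts, the function names and the assumption of a reference locus present at 2 copies per genome are all hypothetical and chosen only for illustration.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    # If templates are Poisson-distributed, P(empty droplet) = exp(-lambda).
    return -math.log(1.0 - positive / total)


def copy_number(target_pos: int, target_total: int,
                ref_pos: int, ref_total: int,
                ref_copies: float = 2.0) -> float:
    """Estimate target copy number relative to a reference locus.

    ref_copies is an assumption: the reference is taken to be present
    at 2 copies per genome.
    """
    lam_target = copies_per_droplet(target_pos, target_total)
    lam_ref = copies_per_droplet(ref_pos, ref_total)
    if lam_ref == 0:
        raise ValueError("reference assay has no positive droplets")
    return ref_copies * lam_target / lam_ref


if __name__ == "__main__":
    # Hypothetical droplet counts, not data from the study.
    print(round(copy_number(7500, 10000, 5000, 10000), 2))
```

With these made-up counts the target has twice the per-droplet concentration of the reference, so the estimate is 4 copies; a deleted locus would give a value near 0.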


Figures

Fig. 1
The structure of the Arabidopsis genomic region covered by the Ath_CNV610-611_MLPA assay. On the black axis the coordinates of the target genes (marked in orange below the axis) are given. On the brown axis the location of each MLPA probe (mlpaA-mlpaH) is indicated. Probes mlpaC-mlpaG are located within CNV regions (marked in green). The sizes of the elements and the distances between them are drawn to scale
Fig. 2
CNV patterns detected by the Ath_CNV610-611_MLPA assay. a-f, probe signal plots representative of accessions with: no copy number changes - “basic” genotype (a), deletion of AT3G18530 and AT3G18535 – “del-2” genotype (b), duplication of MSH2 – “dupl-1” genotype (c), duplication of AT3G18530 and AT3G18535 - “dupl-2” genotype (d), duplication of MSH2, AT3G18530 and AT3G18535 with equal signals from all probes - “dupl-3-a” genotype (e) or duplication of MSH2, AT3G18530 and AT3G18535 with increased signal from probe mlpaD - “dupl-3-b” (f). g, heatmaps of probe signals (rows) in accessions (columns) grouped by the CNV pattern. Color bars above the heatmaps indicate the genotypes represented by plots (a-f). Data on plots (a-f) are from accessions: Lag2-2, Vie-0, Kl-5, La-0, Uod-1 and Bak-7 respectively. See Table 1 for the description of MLPA probes
Fig. 3
Clusters of Arabidopsis accessions containing different gene copy numbers identified with the Ath_CNV610-611_MLPA assay. The scatterplots present signals of paired MLPA probes for: a MSH2 and b AT3G18530-AT3G18535. All results were calibrated using data obtained for the Col-0 accession. The clustering and copy number (CN) assignment were done manually. The data points (accessions) are colored according to the CNV patterns described in Fig. 2
Fig. 4
Concordance of MLPA-based and ddPCR-based gene copy number genotyping results in 92 accessions. On the x-axis, ddPCR-based absolute gene copy numbers are shown. On the y-axis, normalized MLPA signals are shown, generated with the MLPA probe located nearest to the ddPCR primers’ target position (Additional file 1: Figure S2). The data points (accessions) are colored according to the CNV patterns described in Fig. 2, and accessions with the highest levels of duplication are given unique symbols. The weaker data correlation observed for MSH2 in comparison with AT3G18530 and AT3G18535 is caused by the low number of accessions (12 out of the 92 presented) with an MSH2 gene copy number other than 2
Fig. 5
Geographic distribution of the CNVs for the genomic region spanning the MSH2, AT3G18530 and AT3G18535 genes. The number of accessions sampled for each region (N) is reported. Colors indicate genotypes: “basic” (green), “del-2” (purple), “dupl-1” (blue), “dupl-2” (red), “dupl-3-a” or “dupl-3-b” (yellow). a - Iberian Peninsula & Morocco, b - Western Europe, c - Alps, d - Italy, e - Northern Europe, f - Central & Southeast Europe, g - Eastern Europe, h - Western Asia & Caucasus, i - Central Asia, j - East Asia, k - Pacific Northwest, l - Midwest
Fig. 6
Haplotypes determined for the genomic regions surrounding the MSH2, AT3G18530 and AT3G18535 loci, for 154 accessions. Bi-allelic SNPs of at least 10% frequency located in 20-kb regions on both sides of the investigated CNV were analyzed. The dominant genotype at each position is marked in blue, the alternative genotype in yellow. SNP genomic coordinates are indicated at the top. CNV genotypes are marked as dark grey (“basic”), red (“del-2”), yellow (“dupl-1”), green (“dupl-2”) and dark green (“dupl-3-a” and “dupl-3-b”, collectively). The order of accessions reflects the distance-based tree generated in the SplitsTree program
Fig. 7
LCRs bordering the AT3G18530 and AT3G18535 genes and their role in mediating NAHR-based gene copy number changes. a - Hypothesized model of nonhomologous pairing between the left (Chr3:6372413..6373650, red) and right (Chr3:6377368..6378605, blue) LCR, leading to deletion (in case of intrachromatidial NAHR) or to deletion and reciprocal duplication (in case of interchromatidial/interchromosomal NAHR) of the flanked genomic DNA. The actual site of strand exchange and the length of the conversion track determine the sequence of the repeat reconstituted after recombination: identical to one of the original LCRs or chimeric. Pink arrows show the estimated locations of primers used for amplification of the breakpoint region in accessions with the “del-2” genotype. Yellow arrows show the estimated locations of primers used for amplification of the breakpoint region in accessions with the “dupl-2” genotype. RSE – region of strand exchange; b - Representative gel image of genomic DNA fragments amplified in accessions with the “del-2” genotype and in Col-0 (“basic” genotype). The expected length of the DNA amplicon is 3,421 bp (8,376 bp in case of Col-0); c - Gel image of genomic DNA fragments amplified in accessions with the “dupl-2” genotype and in Col-0 (“basic” genotype). The expected length of the DNA amplicon is 3,404 bp (no product in case of Col-0)
Fig. 8
Sequence analysis of chromosome breakpoints in accessions with “del-2” and “dupl-2” genotypes. The 1238-bp long LCRs differ at 11 positions. The left and right LCRs are colored in red and blue, respectively. The variable sites that differentiate the two LCRs are indicated; they accumulate in the right half of the repeat. The upstream and downstream sequences directly adjacent to each LCR are marked in a lighter shade of the respective color. White asterisks indicate gene conversion events. A single nucleotide substitution found at a variable site in one accession is presented on a white background
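The coordinates and lengths quoted in the captions of Fig. 7 and Fig. 8 can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the published numbers, assuming inclusive 1-based coordinates and assuming that a deletion through recombination between the two LCRs removes exactly one repeat unit (the distance between the two LCR start positions). The variable names are illustrative.

```python
# LCR coordinates on Chr3, as given in the Fig. 7 caption (assumed inclusive).
LEFT_LCR = (6372413, 6373650)
RIGHT_LCR = (6377368, 6378605)

# Values quoted in the captions of Fig. 7 and Fig. 8.
COL0_AMPLICON_BP = 8376   # "basic" genotype
DEL2_AMPLICON_BP = 3421   # "del-2" genotype
LCR_VARIABLE_SITES = 11


def interval_length(interval: tuple[int, int]) -> int:
    """Length of an inclusive genomic interval."""
    start, end = interval
    return end - start + 1


lcr_length = interval_length(LEFT_LCR)               # 1238
assert lcr_length == interval_length(RIGHT_LCR)

# One repeat unit = one LCR plus the DNA between the two LCRs.
repeat_unit = RIGHT_LCR[0] - LEFT_LCR[0]             # 4955

# Recombination between the LCRs leaves a single LCR and removes one unit.
predicted_del2_amplicon = COL0_AMPLICON_BP - repeat_unit

# Identity of the two LCRs if they differ only at the 11 variable sites.
identity_percent = 100 * (1 - LCR_VARIABLE_SITES / lcr_length)

if __name__ == "__main__":
    print(lcr_length, repeat_unit, predicted_del2_amplicon)
    print(round(identity_percent, 2))
```

Under these assumptions both LCRs are 1,238 bp long, the repeat unit is 4,955 bp, and the predicted deletion amplicon (8,376 - 4,955 = 3,421 bp) matches the expected length stated for “del-2” accessions. Eleven differences over 1,238 bp give about 99.11 % identity, consistent with the 99 % figure in the abstract.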

