BMC Genomics. 2018 Feb 5;19(1):119.
doi: 10.1186/s12864-018-4490-7.

Sequence analysis of European maize inbred line F2 provides new insights into molecular and chromosomal characteristics of presence/absence variants

Aude Darracq et al. BMC Genomics. 2018.

Abstract

Background: Maize is well known for its exceptional structural diversity, including copy number variants (CNVs) and presence/absence variants (PAVs), and there is growing evidence for the role of structural variation in maize adaptation. While PAVs have been described in this important crop species, they have rarely been characterized at the sequence level, and the extent of presence/absence variation and the relative chromosomal landscape of inbred-specific regions remain to be elucidated.

Results: De novo genome sequencing of the French F2 maize inbred line revealed 10,044 novel genomic regions larger than 1 kb, making up 88 Mb of DNA, that are present in F2 but not in B73 (PAVs). This set of maize PAV sequences allowed us to annotate PAV content and to analyze sequence breakpoints. Using PAV genotyping on a collection of 25 temperate lines, we also analyzed linkage disequilibrium in PAVs and flanking regions, and PAV frequencies within maize genetic groups.

Conclusions: We highlight the possible role of MMEJ-type double strand break repair in maize PAV formation and discover 395 new genes with transcriptional support. The pattern of linkage disequilibrium within PAVs differs strikingly from that of flanking regions and is consistent with the intuition that PAVs may recombine less than other genomic regions. We show that most PAVs are ancient, while some are found only in European Flint material, thus pinpointing structural features that may be at the origin of adaptive traits involved in the success of this material. Characterization of such PAVs will provide useful material for further association genetic studies in European and temperate maize.

Keywords: De novo assembly; Double strand break repair (DSBR); European germplasm; Genetic diversity; Linkage disequilibrium; Maize; Microhomology mediated end joining (MMEJ); Pan-genome; Presence absence variation (PAV); Structural variation.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Size distribution of microhomology stretches. a Distribution of the size of microhomology stretches in PAVs (F2-specific sequences). b Example of a 5 nt microhomology signature in an F2-specific region. Microhomology stretches are shown in bold, and the PAV is highlighted by a box
Fig. 2
Genomic distribution of B73- and F2-specific regions. PAV distributions along all chromosomes for 10 Mb sliding windows and 1 Mb steps. Each panel represents a different chromosome, with the chromosome number indicated on top. Grey boxes indicate the position of peri-centromeric regions. For each chromosome, top panel: fraction of window covered by B73 genes, number of F2/B73 SNPs per window, PAVs (F2-specific and B73-specific regions). Numbers are scaled relative to the highest value across the whole genome for each feature type. Bottom panel: fraction of window covered by B73 reads and F2 reads with mapping quality > 30 (no scaling was applied). Asterisks highlight regions of low diversity between F2 and B73. Greek letters mark regions with particular patterns (from visual inspection): Ω, abundant SNPs and scarce PAVs; ω, abundant SNPs and abundant PAVs; Δ, abundant F2-specific regions and scarce B73-specific regions; δ, scarce F2-specific regions and abundant B73-specific regions
Fig. 3
Expression profile of F2-specific genes. The number of F2-specific genes (green) expressed in none to all (12) sampled tissues is compared to the number of F2 genes shared with B73 (filtered gene set, red). As gene prediction in F2-specific regions was performed with mRNAseq support from F2 and several additional genotypes, some F2-specific genes may not be expressed in the F2 genotype in any of the 12 conditions tested
Fig. 4
Proportion of F2-present and B73-present PAVs in the core set of 23 maize lines. a Proportion of PAVs typed as the F2 allele (presence allele is typed). b Proportion of PAVs present in B73 (absence allele is typed). Only PAVs with confident genotyping in all lines are represented. Proportions are in percent. Each bar represents one inbred line, with its name indicated at the bottom. Colors highlight the 4 genetic groups represented in our core panel of 23 maize inbred lines. Inbred lines are ordered by number of shared variants with F2, from lower (left) to higher (right). B73 (0%) and F2 (100%) are not shown. Asterisks highlight inbred lines of French origin (for details on inbred line origin, see Additional file 1: Table S7)
Fig. 5
Principal Component Analysis of the 25 inbred lines. Principal Component Analysis based on genotyping of (a) F2-present PAVs, (b) B73-present PAVs and (c) B73/F2 SNPs. Colors highlight the four main maize temperate genetic groups according to [46]
Fig. 6
PAV frequencies in maize genetic groups. a Hierarchical clustering of PAV frequency (F2 allele) within maize groups. Left: F2-present PAVs (typing of presence variant). Right: B73-present PAVs (typing of absence variant is shown). Horizontal lines represent PAVs. Vertical bars represent the four maize genetic groups. Light colors highlight low frequencies and strong colors indicate high frequencies of the F2 allele. b F2-present PAV frequency (left) or B73-present PAV frequency (right) within 2 genetic groups: Corn Belt Dent and European Flint. PAVs shared by F2 and a single genetic group (green) are separated from PAVs shared in at least one individual of the 4 genetic groups analyzed in this study (red). Left: sequences present in F2 and absent in B73. Right: sequences present in B73 and absent in F2
Fig. 7
Average LD decay within PAVs and between PAVs and their flanking regions. a Average LD between PAVs and their flanking genomic regions. Flanking regions are genotyped using SNPs; PAVs are genotyped either as 0/1 (red) or using the within-PAV SNP closest to the breakpoint (khaki). For comparison, LD in random maize regions (green), random maize regions from gene space (blue), and random maize regions from inter-genic space (purple) is also plotted. b Comparison of average LD within PAVs to LD within flanking regions (upstream, red, and downstream, khaki). PAVs are separated into 3 size classes: 1–4 kb (green), 5–10 kb (blue), 10–40 kb (purple). LD is estimated by the squared correlation of allele frequencies (r²) and plotted against the distance between polymorphic sites (0 to 10 kb)
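
The r² statistic named in the Fig. 7 caption can be illustrated with a minimal sketch. It assumes fully homozygous inbred lines, so that each site is coded 0/1 per line (for a PAV, 1 = sequence present). The function name ld_r2 and the example genotype vectors are hypothetical and are not taken from the study; this is not the authors' pipeline.

```python
from typing import Sequence


def ld_r2(site_a: Sequence[int], site_b: Sequence[int]) -> float:
    """Squared allele correlation (r^2) between two biallelic sites.

    Each sequence holds one 0/1 genotype per inbred line, in the same
    line order. Returns NaN when either site is monomorphic, because
    r^2 is undefined in that case.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")

    n = len(site_a)
    p_a = sum(site_a) / n                                   # frequency of allele 1 at site A
    p_b = sum(site_b) / n                                   # frequency of allele 1 at site B
    p_ab = sum(x * y for x, y in zip(site_a, site_b)) / n   # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")
    return d * d / denom


# Hypothetical example: one PAV (0 = absent, 1 = present) and one
# flanking SNP typed in the same eight inbred lines.
pav = [1, 1, 1, 0, 0, 0, 1, 0]
snp = [1, 1, 0, 0, 0, 0, 1, 0]
print(round(ld_r2(pav, snp), 3))  # 0.6
```

Averaging such pairwise values within bins of physical distance between sites would give a decay curve of the kind described in the caption.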

References

    1. Saxena RK, Edwards D, Varshney RK. Structural variations in plant genomes. Brief Funct Genomics. 2014;13:296–307. doi: 10.1093/bfgp/elu016.
    2. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet. 2006;7:85–97. doi: 10.1038/nrg1767.
    3. Stankiewicz P, Lupski JR. Structural variation in the human genome and its role in disease. Annu Rev Med. 2010;61:437–455. doi: 10.1146/annurev-med-100708-204735.
    4. Sharp AJ, Hansen S, Selzer RR, Cheng Z, Regan R, Hurst JA, et al. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nat Genet. 2006;38:1038–1042. doi: 10.1038/ng1862.
    5. Iskow RC, Gokcumen O, Lee C. Exploring the role of copy number variants in human adaptation. Trends Genet. 2012;28:245–257. doi: 10.1016/j.tig.2012.03.002.
