Review
Immunogenetics. 2018 Jan;70(1):5-27.
doi: 10.1007/s00251-017-1017-3. Epub 2017 Jul 7.

A genomic perspective on HLA evolution

Diogo Meyer et al. Immunogenetics. 2018 Jan.

Abstract

Several decades of research have convincingly shown that classical human leukocyte antigen (HLA) loci bear signatures of natural selection. Despite this conclusion, many questions remain regarding the type of selective regime acting on these loci, the time frame at which selection acts, and the functional connections between genetic variability and natural selection. In this review, we argue that genomic datasets, in particular those generated by next-generation sequencing (NGS) at the population scale, are transforming our understanding of HLA evolution. We show that genomewide data can be used to perform robust and powerful tests for selection, capable of identifying both positive and balancing selection at HLA genes. Importantly, these tests have shown that natural selection can be identified at both recent and ancient timescales. We discuss how findings from genomewide association studies impact the evolutionary study of HLA genes, and how genomic data can be used to survey adaptive change involving interaction at multiple loci. We discuss the methodological developments that are necessary to correctly interpret genomic analyses involving the HLA region. These developments include adapting the NGS analysis framework to deal with highly polymorphic HLA data, as well as developing tools and theory to search for signatures of selection, quantify differentiation, and measure admixture within the HLA region. Finally, we show that high-throughput analysis of molecular phenotypes for HLA genes, namely transcription levels, is now a feasible approach and can add another dimension to the study of genetic variation.

Keywords: Balancing selection; Evolution; Genomics; HLA (human leukocyte antigen); MHC (major histocompatibility complex).

Figures

Fig. 1
How genotyping errors arise from the mapping of reads to a single reference genome. The left panel represents a case where sequence reads come from an individual who is heterozygous at a SNP, but the rest of the gene is similar to the reference for both haplotypes. The reads from both haplotypes can be aligned to the reference, and the SNP genotype is called correctly (i.e., determined by the analysis software). The panel on the right shows a case where one of the haplotypes is different from the reference sequence at more positions than the mismatch threshold (in this simple example, only one mismatch is allowed). Reads from this haplotype will not align to the reference sequence and the genotype will be incorrectly called as homozygous at the SNP of interest. Modified from the Genes to Genomes blog, http://genestogenomes.org/the-trouble-with-hla-diversity/
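The mechanism described in this caption can be illustrated with a toy sketch. Everything here is hypothetical and chosen only for illustration: the 10-base sequences, the function names, and the use of one full-length read per haplotype. Real aligners are far more elaborate; the sketch only shows how a mismatch threshold can turn a true heterozygote into an apparent homozygote.

```python
def mismatches(read, ref_window):
    """Count positions at which a read differs from the reference window."""
    return sum(a != b for a, b in zip(read, ref_window))


def call_genotype(reads, reference, snp_pos, max_mismatches=1):
    """Return the sorted alleles seen at snp_pos among reads that align.

    reads is a list of (start, sequence) pairs. A read is discarded when it
    has more than max_mismatches differences from the reference (toy rule).
    """
    alleles = set()
    for start, seq in reads:
        window = reference[start:start + len(seq)]
        if mismatches(seq, window) > max_mismatches:
            continue  # read fails to align and contributes no allele
        if start <= snp_pos < start + len(seq):
            alleles.add(seq[snp_pos - start])
    return sorted(alleles)


# Hypothetical sequences; the SNP of interest is at index 4.
reference = "ACGTACGTAC"
hap_ref_like = "ACGTACGTAC"   # identical to the reference
hap_one_diff = "ACGTGCGTAC"   # differs only at the SNP (A -> G)
hap_two_diff = "ACGTGCTTAC"   # differs at the SNP and at index 6

# Left panel: both haplotypes align, heterozygote called correctly.
left = call_genotype([(0, hap_ref_like), (0, hap_one_diff)], reference, 4)

# Right panel: the divergent haplotype exceeds the threshold and is lost,
# so only the reference allele is observed.
right = call_genotype([(0, hap_ref_like), (0, hap_two_diff)], reference, 4)

print(left)   # ['A', 'G']
print(right)  # ['A']
```

With the threshold of one mismatch used in the caption's example, the second call reports only the reference allele even though the individual carries a G at the SNP.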
Fig. 2
The value of ψ, a statistic that measures the proportion of deleterious variants, in three sets of SNPs. The statistic is defined by $\psi = \frac{L_S \cdot P_N}{L_N \cdot (P_S + 1)}$, where P represents the number of polymorphic sites, L represents the number of potentially mutable sites, and the S and N subscripts refer to synonymous and nonsynonymous sites. Higher values of ψ indicate a greater proportion of deleterious (or functional, in the case of the SNPs from the classical HLA genes) variants. Values are shown for exons of classical HLA genes, genes in the immediate neighborhood of the HLA genes (“peri-HLA”), and genes outside the MHC region. Values were computed for sites with a minor allele frequency (MAF) greater than 0.05, to avoid the effect of rare deleterious variants, which are overrepresented in the control set. The peri-HLA genes have higher load (ψ) than the controls.
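As a small worked illustration, the statistic can be computed directly from the four counts. This sketch assumes the formula reads ψ = (L_S · P_N) / (L_N · (P_S + 1)), which is the natural interpretation of the caption's definition; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def psi(p_n, p_s, l_n, l_s):
    """Compute psi = (L_S * P_N) / (L_N * (P_S + 1)).

    p_n, p_s: numbers of nonsynonymous and synonymous polymorphic sites.
    l_n, l_s: numbers of potentially mutable nonsynonymous and synonymous sites.
    The +1 in the denominator keeps the ratio defined when p_s is zero.
    """
    if l_n <= 0:
        raise ValueError("l_n must be positive")
    return (l_s * p_n) / (l_n * (p_s + 1))


# Hypothetical counts, for illustration only.
example = psi(p_n=30, p_s=9, l_n=300, l_s=100)
print(example)  # 1.0
```

Read this way, ψ compares nonsynonymous polymorphism per nonsynonymous site with synonymous polymorphism per synonymous site, so values above 1 would correspond to a relative excess of nonsynonymous variants.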
Fig. 3
$F_{ST}$ among pairs of populations. Each point depicts the mean $F_{ST}$ for non-HLA (x-axis) and HLA (y-axis) SNPs between pairs of populations in each continent (AFR: Africa; EAS: East Asia; EUR: Europe; SAS: South Asia). Pairs of populations from the same continent are represented by white-filled points, and pairs of populations from different continents by solid black points. SNP data were acquired from the 1000 Genomes Project phase III data (The 1000 Genomes Project Consortium 2015), and HLA SNPs were filtered according to Brandt et al. (2015) to avoid errors due to mapping bias. $F_{ST}$ values were weighted by allele frequency, so that the excess of rare variants in the non-HLA SNPs does not cause a reduction of mean $F_{ST}$ in that class. Note that HLA differentiation is higher than genomewide differentiation for population pairs from the same continent, and lower than genomewide differentiation when populations from different continents are compared.
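The caption does not state which F_ST estimator or weighting scheme was used, so the following is only a sketch of one common approach, not the authors' method. It assumes biallelic SNPs, two equally sized populations, the simple definition F_ST = (H_T - H_S) / H_T, and a ratio-of-sums average across SNPs, which lets common variants contribute more than rare ones. All names and frequencies are hypothetical.

```python
def fst_components(p1, p2):
    """Return (H_T - H_S, H_T) for one biallelic SNP in two populations.

    p1, p2: frequencies of the same allele in each population.
    Assumes equal population weights (a simplification).
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)                          # total heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2      # mean within-population
    return h_t - h_s, h_t


def mean_fst(freq_pairs):
    """Average F_ST over SNPs as a ratio of sums.

    Summing numerators and denominators separately weights each SNP by its
    total heterozygosity, so rare variants do not dominate the mean.
    """
    num = 0.0
    den = 0.0
    for p1, p2 in freq_pairs:
        n, d = fst_components(p1, p2)
        num += n
        den += d
    return num / den if den > 0 else float("nan")


# Hypothetical allele frequencies for a pair of populations.
snps = [(0.10, 0.40), (0.50, 0.60), (0.02, 0.01)]
print(mean_fst(snps))
```

Under this scheme a fixed difference gives a value of 1 and identical frequencies give 0, which is a quick sanity check on any alternative estimator substituted for this one.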
Fig. 4
Deviation from average genomewide ancestry in four admixed populations along chromosome 6. The degree to which local ancestry deviates from genomewide averages is shown for African ancestry (black lines). The MHC region is indicated by gray shading. Ancestral and admixed populations are from the 1000 Genomes Project (African and European; ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/), except for the ancestral Native American sample, which is from the HGDP-CEPH (http://www.cephb.fr/hgdp/index.php). Local ancestries were estimated using RFMix (Maples et al. 2013). The ancestry deviation measure is the difference between ancestry at a given genomic position and the genomewide average, normalized by the standard deviation of the ancestry estimate (thus providing a measure of the number of standard deviations by which each ancestry departs from its genomewide average).
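The deviation measure described here is a standardized score, which can be sketched as follows. The caption does not say how the standard deviation was obtained; this sketch assumes it is the standard deviation of the local ancestry values across positions, and the input values are hypothetical rather than RFMix output.

```python
from statistics import mean, pstdev


def ancestry_deviation(local_ancestry):
    """Standardize local ancestry proportions against their overall average.

    local_ancestry: ancestry proportion (0 to 1) at each genomic position.
    Returns, per position, the number of standard deviations by which the
    local value departs from the mean across all positions.
    """
    mu = mean(local_ancestry)
    sd = pstdev(local_ancestry)  # assumption: SD taken across positions
    if sd == 0:
        return [0.0 for _ in local_ancestry]
    return [(x - mu) / sd for x in local_ancestry]


# Hypothetical African ancestry proportions at four positions.
z = ancestry_deviation([0.2, 0.2, 0.2, 0.6])
print(z)
```

In this toy input the last position stands out with a positive score, which is the kind of signal the shaded MHC region would show if its local ancestry departed from the genomewide average.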
Fig. 5
Fold change in expression estimates obtained by kallisto (Bray et al. 2016) using a supplemented index relative to a standard reference index (y = 0). Results are presented for genotypes with different degrees of similarity to the reference genome (bar colors). We used 48 CEU individuals for which RNAseq data are available from the Geuvadis consortium (Lappalainen et al. 2013) and HLA genotypes were determined by Sanger sequencing (Gourraud et al. 2014). Genotypes at each locus were divided according to quartiles of differences from the reference allele at that locus. “Most similar” and “Most different” correspond to the first and fourth quartiles respectively (12 individuals each).
