PLoS Genet. 2022 Jul 15;18(7):e1010281.
doi: 10.1371/journal.pgen.1010281. eCollection 2022 Jul.

Estimating the timing of multiple admixture events using 3-locus linkage disequilibrium

Mason Liang et al. PLoS Genet.

Abstract

Estimating admixture histories is crucial for understanding the genetic diversity we see in present-day populations. Allele frequency or phylogeny-based methods are excellent for inferring the existence of admixture or its proportions. However, estimating admixture times requires spatial information, taken either from the local ancestry along admixed chromosomes or from the decay of admixture linkage disequilibrium (ALD). One popular method, implemented in the programs ALDER and ROLLOFF, uses two-locus ALD to infer the time of a single admixture event, but based on this summary statistic it can only estimate the time of the most recent admixture event. To address this limitation, we derive analytical expressions for the expected ALD in a three-locus system and provide a new statistical method based on these results that is able to resolve more complicated admixture histories. Using simulations, we evaluate the performance of this method on a range of different admixture histories. As an example, we apply the method to the Colombian and Mexican samples from the 1000 Genomes project. The implementation of our method is available at https://github.com/Genomics-HSE/LaNeta.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Predicted weighted LD surfaces from simulations and theory for varying admixture times.
The heat maps are from simulations and the contours are plotted from Eq 7. The two admixture probabilities were fixed at m1 = m2 = 0.2 and the times of the two admixture pulses, T1 and T2, were varied. Each square covers the range 0.5 cM < d, d′ < 20 cM. When the time of the more recent pulse is greater than half of that of the more ancient pulse, i.e. 2T2 > T1 + T2, the contours of the resulting weighted LD surface are straight, making it difficult to distinguish from the weighted LD surface produced by a one-pulse admixture scenario.
Fig 2. Predicted weighted LD surfaces from simulations and theory for varying admixture proportions.
The heat maps are from simulations and the contours are plotted from Eq 7. The two admixture times were fixed at 2 and 12 generations ago (T1 = 10 and T2 = 2) while the admixture probabilities were varied. Each square covers the range 0.5 cM < d, d′ < 20 cM. As the total admixture proportion m2 + m1(1 − m2) increases above 0.5, the contours change, reflecting that the majority of the genetic material now originates from the other population. Weighted LD surfaces for m1 > 0.5 or m2 > 0.5 are not shown, but are qualitatively similar to the surfaces on the lower and rightmost sides.
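The total admixture proportion quoted in the Fig 2 caption, m2 + m1(1 − m2), can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the (0.4, 0.4) pair are hypothetical, while (0.2, 0.2) and (0.3, 0.2) are the values quoted in the captions of Fig 1 and Fig 5.

```python
def total_admixture(m1: float, m2: float) -> float:
    """Total admixture proportion after two pulses.

    Implements m2 + m1 * (1 - m2), the expression in the Fig 2 caption.
    It is algebraically the same as 1 - (1 - m1) * (1 - m2).
    """
    return m2 + m1 * (1.0 - m2)


if __name__ == "__main__":
    # (0.4, 0.4) is a hypothetical pair chosen to cross the 0.5 threshold.
    for m1, m2 in [(0.2, 0.2), (0.3, 0.2), (0.4, 0.4)]:
        total = total_admixture(m1, m2)
        side = "above" if total > 0.5 else "at or below"
        print(f"m1={m1}, m2={m2}: total={total:.2f} ({side} 0.5)")
```

Because the expression is symmetric in m1 and m2, swapping the two pulse probabilities leaves the total unchanged; only the timing of the contributions differs.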
Fig 3. Weighted LD surfaces produced by constant admixture.
The heat maps are from simulations and the contours from analytical results for a model in which continuous admixture started 10, 20, or 40 generations ago and stopped 5 generations before the present. Each square covers the range 0.5 cM < d, d′ < 20 cM. We varied the time of the beginning of the admixture and the total admixture probability. The admixture probability for each generation was constant, and chosen so that the total admixture proportion was either 0.3 or 0.7. When the admixture is spread over 5 generations (the leftmost column), the resulting weighted LD surface is similar to a one-pulse weighted LD surface. For longer durations, the weighted LD surfaces are similar to those produced by two pulses of admixture.
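One way to see how a constant per-generation admixture probability relates to a target total proportion is sketched below. It assumes, and this is an assumption rather than something stated in the caption, that each generation independently replaces a fraction m of the population from the same source, so that the total after n generations is 1 − (1 − m)^n. The function names are hypothetical, and the durations of 15 and 35 generations are inferred from the start and stop times in the caption by the same convention that gives 5 generations for the leftmost column.

```python
def per_generation_probability(total: float, generations: int) -> float:
    """Constant per-generation probability m giving the requested total.

    Assumes total = 1 - (1 - m) ** generations (independent replacement
    from a single source each generation); this model is an assumption.
    """
    return 1.0 - (1.0 - total) ** (1.0 / generations)


def accumulated_proportion(m: float, generations: int) -> float:
    """Total admixture proportion after `generations` of constant admixture."""
    return 1.0 - (1.0 - m) ** generations


if __name__ == "__main__":
    # Durations assumed from the caption: start at 10, 20 or 40 generations
    # ago and stop at 5 generations ago.
    for duration in (5, 15, 35):
        for total in (0.3, 0.7):
            m = per_generation_probability(total, duration)
            check = accumulated_proportion(m, duration)
            print(f"duration={duration}, total={total}: m={m:.4f}, check={check:.4f}")
```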
Fig 4. Two-locus weighted LD with two admixture events and varying pulse times.
Corresponding ALDER curves for two-pulse admixture with varying pulse times. The x-axis shows genetic distance in Morgans and the y-axis shows log ALDER scores. Red lines are T2 = 2, green lines are T2 = 5, and blue lines are T2 = 10.
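For orientation, the single-pulse case is commonly modelled as an exponential decay of weighted LD with genetic distance, A·exp(−T·d) with d in Morgans, so that the log of the curve is a straight line with slope −T. The sketch below illustrates only that simplified one-pulse model on noise-free synthetic values. It is not the ALDER implementation (which also fits an affine term) and not the three-locus method of the paper; all function names and parameter values are hypothetical.

```python
import math


def weighted_ld_one_pulse(d_morgans: float, t_generations: float,
                          amplitude: float = 1.0) -> float:
    """Simplified one-pulse model: amplitude * exp(-T * d)."""
    return amplitude * math.exp(-t_generations * d_morgans)


def fit_admixture_time(distances, ld_values) -> float:
    """Least-squares fit of log(LD) = log(A) - T * d; returns T."""
    logs = [math.log(v) for v in ld_values]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_y = sum(logs) / n
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in zip(distances, logs))
    sxx = sum((d - mean_d) ** 2 for d in distances)
    return -sxy / sxx


if __name__ == "__main__":
    # Distances from 0.5 cM to 20 cM, expressed in Morgans.
    distances = [0.005 * (i + 1) for i in range(40)]
    true_t = 10.0  # hypothetical admixture time in generations
    curve = [weighted_ld_one_pulse(d, true_t, amplitude=0.02) for d in distances]
    print(f"recovered T = {fit_admixture_time(distances, curve):.3f}")
```

With two pulses the two-locus curve is no longer a single exponential, which is one way to read the abstract's point that this summary statistic alone cannot resolve the older event.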
Fig 5. Accuracy of estimates of T1 (A) and T2 (B), and ALDER estimates of admixture time (C) as a function of other parameters.
Twelve admixture scenarios, T1 ∈ {0, 5, 10, 20} and T2 ∈ {2, 5, 10}, were simulated 100 times each. The admixture probabilities were fixed at M1 = 0.3 and M2 = 0.2. The colored bars give the medians of estimates for each of these twelve cases, the boxes delimit the interquartile range, and the whiskers extend out to 1.5 times the interquartile range. As the time between the two pulses of admixture increases, the error in the estimates decreases (for this reason we do not include accuracy estimates for T1 = 0, where the results become unreasonable). Consistent with the simulations shown in Fig 1, there is limited power to estimate the time of the more ancient admixture pulse when T2 > T1. ALDER estimates a single admixture time, which corresponds to T1 = 0.
Fig 6. Effect of admixture proportion misspecification on the estimated values of T1 and T2.
Admixture proportion misspecification has a strong effect on the estimates of the time T1 between pulses of admixture. Estimates of the time T2 of the most recent admixture pulse remain stable.
Fig 7. Weighted LD surface for Mexican samples with Yoruba as the first source population reference.
The model with the best fit is two pulses from the non-Yoruba source population at T1 + T2 = 13.2 ± 1.01 and T2 = 7.9 ± 0.99 generations ago. The weighted LD surface was estimated from real data; the level lines correspond to the best-fitting model inferred by the LaNeta method.
Fig 8. Weighted LD surface for Colombian samples with Yoruba as the first source population reference.
The best-fitting two-pulse model has two pulses of non-Yoruba admixture at T1 + T2 = 14.5 ± 0.74 and T2 = 3.7 ± 0.62 generations ago. The amplitude of this weighted LD surface is approximately ten times larger than that of the Mexican samples. This is a result of the larger proportion of Yoruba ancestry in the Colombian samples. The weighted LD surface was estimated from real data; the level lines correspond to the best-fitting model inferred by the LaNeta method.

