Bioinformatics. 2024 Feb 1;40(2):btae082.
doi: 10.1093/bioinformatics/btae082.

Estimation of inbreeding and kinship coefficients via latent identity-by-descent states

Yongtao Guan et al. Bioinformatics. 2024.

Abstract

Motivation: Estimating the individual inbreeding coefficient and pairwise kinship is an important problem in human genetics (e.g. in disease mapping) and in animal and plant genetics (e.g. inbreeding design). Existing methods, such as sample correlation-based genetic relationship matrix, KING, and UKin, are either biased, or not able to estimate inbreeding coefficients, or produce a large proportion of negative estimates that are difficult to interpret. This limitation of existing methods is partly due to failure to explicitly model inbreeding. Since all humans are inbred to various degrees by virtue of shared ancestries, it is prudent to account for inbreeding when inferring kinship between individuals.

Results: We present "Kindred," an approach that estimates inbreeding and kinship by modeling latent identity-by-descent states that account for all possible allele sharing, including inbreeding, between two individuals. Kindred uses the non-negative least squares method to fit the model, which not only increases computational efficiency compared to the maximum likelihood method, but also guarantees non-negativity of the kinship estimates. Through simulation, we demonstrate the high accuracy and non-negativity of kinship estimates by Kindred. By selecting a subset of SNPs that are similar in allele frequencies across different continental populations, Kindred can accurately estimate kinship between admixed samples. In addition, we demonstrate that the realized kinship matrix estimated by Kindred is effective in reducing genomic control values via linear mixed model in genome-wide association studies. Finally, we demonstrate that Kindred produces sensible heritability estimates on an Australian height dataset.
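
To illustrate why a non-negative least squares fit cannot return negative coefficients, the following is a minimal sketch in plain Python. It is not Kindred's implementation: the function name `nnls_coordinate_descent`, the cyclic coordinate descent solver, and the toy design matrix are all assumptions made for illustration, and the abstract does not describe how Kindred builds its design matrix from the latent identity-by-descent states.

```python
def nnls_coordinate_descent(A, b, n_iter=500, tol=1e-12):
    """Minimize ||A x - b||^2 subject to x >= 0 (illustrative solver only).

    A is a list of rows, b a list of observations. Each coordinate is
    updated to its unconstrained optimum and then clipped at zero.
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    residual = list(b)  # residual = b - A x; x starts at zero
    col_sq = [sum(A[i][j] ** 2 for i in range(n_rows)) for j in range(n_cols)]

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(n_cols):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot change the fit
            # Inner product of column j with the current residual.
            grad = sum(A[i][j] * residual[i] for i in range(n_rows))
            new_xj = max(0.0, x[j] + grad / col_sq[j])  # clip at zero
            delta = new_xj - x[j]
            if delta != 0.0:
                for i in range(n_rows):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return x


# Hypothetical toy problem: ordinary least squares would give x = [1, -1],
# but the constrained fit keeps every coefficient at or above zero.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, -1.0, 0.0]
estimate = nnls_coordinate_descent(A, b)
print(estimate)
```

In this toy problem the second coefficient is held at zero rather than going negative, which is the property the abstract refers to when it says that the non-negative least squares fit guarantees non-negative kinship estimates.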

Availability and implementation: Kindred is implemented in C with multi-threading. It takes a VCF file or stream as input and works seamlessly with bcftools. Kindred is freely available at https://github.com/haplotype/kindred.

Conflict of interest statement

None declared.

Figures

Figure 1.
Comparison of kinship estimates by different methods. Results of scGRM (dark gray) were obtained with PLINK. Results of popkin (light gray) were obtained with its R package. Results of UKin (green) were obtained from a reimplementation in the software Kindred (to take advantage of its multi-threading capacity). Results of KING (blue) were obtained from its software KING. Results of Kindred (plum) were obtained from its software Kindred. Expected kinship based on pedigree and zero values were marked by horizontal lines. For each method, we showed two violin plots: under the alternative (h1) and under the null (h0). The mean was marked by . Supplementary Table S1 contains numeric comparisons with mean ± SD under h1, and the percent of negative estimates under h0.
Figure 2.
Kindred is effective for admixed samples when using SPD SNPs (plum). Kinship estimates obtained using RSC SNPs (gray) tend to be biased and to have large variation. In each panel, theoretical values and zero values were marked by horizontal lines.
Figure 3.
East Asian samples clustering pattern on Chr17. Individuals were assigned colors based on PC2 and PC3 clustering in (A), and the same color assignments were used in (A), (B), (D), and (E). (A) Pairwise plot of top six Kindred PCs. The upper triangular plots were PCs of kinship matrix inferred using African allele frequencies as reference. The lower triangular plots, East Asian allele frequencies as reference (the diagonals are plots of j-th PC from one versus j-th PC from the other). (B) Phenotypes derived from PC2 and PC3 in upper triangular plots in (A). (C) Manhattan plot of single SNP (log10) Bayes factors between common bi-allelic SNPs on Chr17 and derived phenotypes in (B). (D) Pairwise plots of top six scGRM PCs. Without coloring, samples form no distinct clusters (lower triangle plots). With coloring (upper triangle), three groups of samples aggregate. (E) Derived phenotype from PC3 in (D). (F) Manhattan plot of single SNP (log10) Bayes factors between common bi-allelic SNPs on Chr17 and derived phenotypes in (E). Note the y-axis range in (F) is half of that in (C).
Figure 4.
PC plots of 1000 Genomes samples. The upper triangle shows pairwise plots of the top five PCs inferred with SPD SNPs; the lower triangle, with RSC SNPs. The diagonals are plots of the j-th PC from SPD SNPs versus the j-th PC from RSC SNPs, with j=1,2,3,4,5. The five continental samples are Africans (in black), Americans (in red), East Asians (in green), Europeans (in blue), and South Asians (in cyan). On the diagonal, their colors are gray, plum, light green, light blue, and cyan.
