Abstract
Standard techniques from genetic epidemiology are ill-suited to formally assessing the significance of variants identified from a single case. We developed a statistical inference framework for identifying unusual functional variation from a single exome or genome, which we refer to as the 'n-of-one' problem. Using this approach, we assessed our ability to identify the causal genotypes in over 5 million simulated cases of Mendelian disease, identifying 39% of disease genotypes as the most damaging unit in a typical exome background. We applied our approach to 129 n-of-one families from the Undiagnosed Diseases Program, nominating 60% of 30 disease genes determined to be diagnostic by a standard clinical workup. Our method can currently produce well-calibrated P values when applied to single genomes, can facilitate integration of multiple data types for n-of-one analyses and, with further work, could become a widely used epidemiological method like linkage analysis or genome-wide association analysis.
Acknowledgements
We thank D. Wilson and M. Stephens for helpful comments, N. Huang for useful discussions and for providing updated versions of some annotations used in this work, K. Vigh-Conrad for assistance in preparing the figures, D. MacArthur, M. Lek and the members of the ExAC Consortium for generous prepublication sharing of their data, M. Hoffmann and the WU Kidney Translational Research Core (KTRC) for patient enrolment, and the Genome Technology Access Center (GTAC) for exome sequencing of CAKUT patients. Our work was supported by US National Institutes of Health grant R01MH101810 (to D.F.C.) and March of Dimes Foundation grant #6-FY14-430 (to S.J.).
Author information
Contributions
D.F.C. designed the study. S.Z. provided helpful conceptual guidance on modeling and interpretation. D.F.C. wrote the simulation code. A.B.W. developed the PSAP pipeline, performed spike-in analyses, and evaluated the impact of population structure on PSAP values. D.F.C., A.B.W., D.R.A. and K.R.C. performed the UDP analyses. M.K. and S.J. contributed CAKUT samples and data. A.B.W. and D.F.C. wrote the manuscript with input from all authors.
Ethics declarations
Competing interests
D.F.C. is funded by a research contract with PierianDx to develop novel methods for clinical exome analysis.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–17, Supplementary Tables 1–7 and Supplementary Note. (PDF 2925 kb)
About this article
Cite this article
Wilfert, A., Chao, K., Kaushal, M. et al. Genome-wide significance testing of variation from single case exomes. Nat Genet 48, 1455–1461 (2016). https://doi.org/10.1038/ng.3697
DOI: https://doi.org/10.1038/ng.3697
This article is cited by
- DDX3Y is likely the key spermatogenic factor in the AZFa region that contributes to human non-obstructive azoospermia. Communications Biology (2023)
- Diverse monogenic subforms of human spermatogenic failure. Nature Communications (2022)
- Calibrated rare variant genetic risk scores for complex disease prediction using large exome sequence repositories. Nature Communications (2021)
- A framework for high-resolution phenotyping of candidate male infertility mutants: from human to mouse. Human Genetics (2021)
- Disruption of human meiotic telomere complex genes TERB1, TERB2 and MAJIN in men with non-obstructive azoospermia. Human Genetics (2021)