Dissecting the genetics of complex traits using summary association statistics

Pasaniuc, Bogdan; Price, Alkes L.

doi:10.1038/nrg.2016.142

Review Article
Published: 14 November 2016

Bogdan Pasaniuc¹ & Alkes L. Price²,³

Nature Reviews Genetics volume 18, pages 117–127 (2017)

Key Points

Summary association statistics from genome-wide association studies (GWAS) are widely available in large sample sizes across hundreds of complex traits. Analyses of such data can yield important insights, motivating the development of new statistical methods in this area.
Single-variant association analysis (including meta-analysis, conditional association and imputation) can be performed effectively using summary association data. These methods often rely on linkage disequilibrium (LD) information from population reference panels.
Summary association data can be used to perform gene-based association tests to identify genes influencing complex traits. In particular, expression quantitative trait loci (eQTLs) can be integrated to identify genes whose expression levels influence complex traits, and rare variant association tests can aggregate evidence of association across multiple rare variants in a gene.
Statistical fine-mapping of the causal variant (or variants) at GWAS loci can be performed using summary association data, leveraging information on the strength of association, functional genomic annotations and differences in LD patterns across populations.
It is becoming increasingly clear that most complex traits and common diseases have a large number of causal variants with small effects. Summary association statistics can be used to understand these polygenic architectures and leverage them for polygenic risk prediction.
Summary association statistics have broad utility in cross-trait analyses, including detecting pleiotropic effects and inferring genetic correlations between traits. Pleiotropic effects can be used in Mendelian randomization analyses to draw inferences about causal relationships among traits.
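
As a minimal illustration of the single-variant analyses summarized above, the sketch below combines per-study effect sizes and standard errors for one variant using a fixed-effects, inverse-variance-weighted meta-analysis, then converts the pooled estimate to a z-score and a two-sided P value. Inverse-variance weighting is a standard choice assumed here for illustration, not a method prescribed by this article; the function names (`fixed_effects_meta`, `two_sided_p`) and the toy numbers are hypothetical.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance-weighted fixed-effects meta-analysis of one variant.

    betas, ses: per-study effect estimates and their standard errors.
    All effects are assumed to refer to the same effect allele.
    Returns (pooled_beta, pooled_se, pooled_z).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("need one standard error per effect estimate")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # z-score: effect size divided by its standard error.
    return pooled_beta, pooled_se, pooled_beta / pooled_se


def two_sided_p(z):
    """Two-sided P value for a z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy example: two hypothetical studies reporting the same effect allele.
pooled_beta, pooled_se, pooled_z = fixed_effects_meta([0.10, 0.20], [0.05, 0.05])
print(pooled_beta, pooled_se, pooled_z, two_sided_p(pooled_z))
```

Only effect sizes and standard errors enter the calculation, which is why an analysis of this kind needs no individual-level data.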

Abstract

During the past decade, genome-wide association studies (GWAS) have been used to successfully identify tens of thousands of genetic variants associated with complex traits and diseases. These studies have produced extensive repositories of genetic variation and trait measurements across large numbers of individuals, providing tremendous opportunities for further analyses. However, privacy concerns and other logistical considerations often limit access to individual-level genetic data, motivating the development of methods that analyse summary association statistics. Here, we review recent progress on statistical methods that leverage summary association data to gain insights into the genetic basis of complex traits and diseases.

**Figure 1: Illustration of summary association statistics.**

**Figure 2: TWAS using predicted expression and summary data.**

**Figure 3: Leveraging functional annotation and trans-ethnic data to improve fine-mapping.**

Acknowledgements

The authors are grateful to H. Finucane, S. Gazal, N. Mancuso and H. Shi for helpful discussions, and to G. Kichaev and R. Johnson for help with figure 3. The work of the authors is funded by US National Institutes of Health grants R01 HG006399, R01 MH101244, R01 GM105857 and R01 MH107649.

Author information

Authors and Affiliations

Departments of Human Genetics, and Pathology and Laboratory Medicine, University of California, Los Angeles, 90095, California, USA
Bogdan Pasaniuc
Departments of Epidemiology and Biostatistics, Harvard T. H. Chan School of Public Health, Boston, 02115, Massachusetts, USA
Alkes L. Price
Program in Medical and Population Genetics, Broad Institute, Cambridge, 02142, Massachusetts, USA
Alkes L. Price

Corresponding authors

Correspondence to Bogdan Pasaniuc or Alkes L. Price.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Glossary

Individual-level data: Genome-wide single nucleotide polymorphism genotypes and trait values for each individual included in a genome-wide association study.
Summary association statistics: Estimated effect sizes and their standard errors for each single nucleotide polymorphism analysed in a genome-wide association study.
z-scores: Association statistics that follow a standard normal distribution under the null model; often computed as per-allele effect sizes divided by their standard errors.
Meta-analysis: A method for combining data from different studies in which summary association statistics from each study are jointly analysed.
Mega-analysis: A method for combining data from different studies in which individual-level data from each study are merged and jointly analysed.
Summary LD information (summary linkage disequilibrium information): In-sample correlations between each pair of typed single nucleotide polymorphisms analysed in a genome-wide association study; can be restricted to proximal pairs of typed SNPs to limit the number of pairs of SNPs.
Transcriptome-wide association studies (TWAS): Studies that evaluate the association between the expression of each gene and a trait of interest; predicted expression may be used instead of measured expression to improve practicality.
Mendelian randomization: A method that uses significantly associated single nucleotide polymorphisms as instrumental variables to quantify causal relationships between two traits.
Burden tests: Gene-based rare variant tests in which all rare variants in a gene are assumed to have the same direction of effect.
Overdispersion tests: Gene-based rare variant tests in which rare variants in a gene are assumed to affect the trait in either direction.
Posterior probability of causality: The inferred probability that a single nucleotide polymorphism is causal based on association data and optional prior information.
Polygenic risk scores: A method of predicting a trait in which the estimated marginal effect of each marker that falls below a P value threshold in a training sample is multiplied by that marker's genotype in a validation sample, and the products are summed across markers.
LD score regression: A method of assessing trait polygenicity by regressing χ² association statistics against linkage disequilibrium (LD) scores for each single nucleotide polymorphism (SNP), computed as sums of squared correlations of each SNP with all SNPs including itself.
Pleiotropy: The existence of a genetic variant (or variants) that affects more than one trait.
Genetic correlation: The signed correlation across single nucleotide polymorphisms between causal effect sizes for two traits.
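
The LD score regression entry above can be made concrete with a small sketch, assuming a toy correlation matrix and toy z-scores (both hypothetical). It computes each SNP's LD score as the sum of squared correlations with all SNPs, including itself, and then regresses χ² statistics on those scores. Plain unweighted least squares is used here as a simplification; it is an assumption of this sketch and should not be read as the published estimator, and the helper names `ld_scores` and `simple_regression` are illustrative.

```python
def ld_scores(r):
    """LD score of each SNP: sum of squared correlations with all SNPs,
    including itself (so the diagonal contributes 1)."""
    return [sum(x * x for x in row) for row in r]


def simple_regression(x, y):
    """Unweighted least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x has no variation")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical 4-SNP correlation matrix and per-SNP z-scores.
R = [
    [1.0, 0.8, 0.2, 0.0],
    [0.8, 1.0, 0.3, 0.0],
    [0.2, 0.3, 1.0, 0.1],
    [0.0, 0.0, 0.1, 1.0],
]
z = [3.1, 2.9, 1.8, 0.7]

scores = ld_scores(R)
chi2 = [zi * zi for zi in z]  # chi-squared statistic with 1 degree of freedom
intercept, slope = simple_regression(scores, chi2)
print(scores, intercept, slope)
```

In this toy setting, a positive slope means that SNPs with higher LD scores tend to have larger χ² statistics, which is the pattern the glossary entry associates with polygenicity.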

About this article

Cite this article

Pasaniuc, B., Price, A. Dissecting the genetics of complex traits using summary association statistics. Nat Rev Genet 18, 117–127 (2017). https://doi.org/10.1038/nrg.2016.142

Issue Date: February 2017
DOI: https://doi.org/10.1038/nrg.2016.142