DOT: Gene-set analysis by combining decorrelated association statistics
- PMID: 32287273
- PMCID: PMC7182280
- DOI: 10.1371/journal.pcbi.1007819
Abstract
Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to access and sharing of genotype-phenotype datasets, including the cost of data management and difficulties in consolidating records across research groups. These issues make methods based on SNP-level summary statistics particularly appealing. The most common way of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic approach can be substantially improved by decorrelating scores prior to their addition, resulting in remarkable power gains in the situations most commonly encountered in practice, namely, under heterogeneity of effect sizes and diversity among pairwise LD values. In these situations, the power of the traditional test, based on the added squared scores, quickly reaches a ceiling as the number of variants increases. Thus, the traditional approach does not benefit from information potentially contained in any additional SNPs, while our decorrelation by orthogonal transformation (DOT) method yields a steady gain in power. We present theoretical and computational analyses of both approaches and reveal the causes behind the sometimes dramatic differences in their power. We showcase DOT by analyzing breast cancer and cleft lip data, in which our method strengthened levels of previously reported associations and implied the possibility of multiple new alleles that jointly confer disease risk.
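The contrast between the two statistics described in the abstract can be illustrated with a minimal sketch. It assumes the SNP-level scores are Z-scores with a known, positive-definite LD correlation matrix, and that decorrelation uses the symmetric inverse square root of that matrix obtained from its eigendecomposition. The function names dot_statistic and sum_of_squares are hypothetical and are not taken from the authors' software; the chi-square reference distribution with as many degrees of freedom as SNPs is the standard null for a sum of squared independent standard normal scores.

```python
import numpy as np
from scipy.stats import chi2


def sum_of_squares(z):
    """Traditional statistic: sum of squared (correlated) Z-scores.

    Its null distribution is a weighted sum of chi-squares and is not
    evaluated here.
    """
    z = np.asarray(z, dtype=float)
    return float(np.sum(np.square(z)))


def dot_statistic(z, ld, tol=1e-10):
    """Illustrative decorrelated statistic (assumption: Z-scores, known LD).

    Returns the statistic, its chi-square p-value, and the decorrelated scores.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    # Eigendecomposition of the symmetric LD matrix: ld = Q diag(lam) Q^T
    lam, q = np.linalg.eigh(ld)
    if lam.min() <= tol:
        raise ValueError("LD matrix must be positive definite")
    # Symmetric inverse square root: H = Q diag(lam^-1/2) Q^T
    h = q @ np.diag(lam ** -0.5) @ q.T
    x = h @ z  # decorrelated scores, uncorrelated under the null
    stat = float(x @ x)
    p_value = float(chi2.sf(stat, df=z.size))
    return stat, p_value, x


# Hypothetical two-SNP example with LD correlation 0.5
z_scores = [2.0, -1.0]
ld_matrix = [[1.0, 0.5], [0.5, 1.0]]
stat, p_value, decorrelated = dot_statistic(z_scores, ld_matrix)
```

In this toy example the two scores have opposite signs despite positive LD, so the decorrelated statistic is larger than the plain sum of squares. Heterogeneous effects of this kind are the setting in which the abstract reports the largest power gains.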
Conflict of interest statement
The authors have declared that no competing interests exist.