Am J Hum Genet. 2012 Sep 7;91(3):478-88.
doi: 10.1016/j.ajhg.2012.08.004.

HYST: a hybrid set-based test for genome-wide association studies, with application to protein-protein interaction-based association analysis

Miao-Xin Li et al. Am J Hum Genet. 2012.

Abstract

The extended Simes' test (known as GATES) and the scaled chi-square test were proposed to combine a set of dependent genome-wide association signals at multiple single-nucleotide polymorphisms (SNPs) for assessing the overall significance of association at the gene or pathway level. The two tests use different strategies to combine association p values, and each can outperform the other depending on the number of SNPs and the linkage disequilibrium between them. In this paper, we introduce a hybrid set-based test (HYST) combining the two tests for genome-wide association studies (GWASs). We describe how HYST can be used to evaluate statistical significance for association at the protein-protein interaction (PPI) level in order to increase power for detecting disease-susceptibility genes of moderate effect size. Computer simulations demonstrated that HYST had a reasonable type 1 error rate and was generally more powerful than its two parent tests and other alternative tests at detecting a PPI pair where both genes are associated with the disease of interest. We applied the method to three complex disease GWAS data sets in the public domain; the method detected a number of highly connected significant PPI pairs involving multiple confirmed disease-susceptibility genes not found in the SNP- and gene-based association analyses. These results indicate that HYST can be effectively used to examine a collection of predefined SNP sets based on prior biological knowledge for revealing additional disease-predisposing genes of modest effects in GWASs.


Figures

Figure 1
Diagram Illustrating How Statistical Significance for a Set of SNPs Is Calculated in HYST. Vertical bars denote SNPs. When two blocks are on different chromosomes or far away on the same chromosome, the blocks can be assumed to be independent, that is, LD r² between the two key SNPs of the two blocks equals zero. Step 1: define n blocks of SNPs according to LD and/or gene information; step 2: compute a block-based p value in each block by GATES and mark the SNPs from which the block-based p value is derived (the broad-brush in the plot); step 3: combine the n block-based p values by using the scaled chi-square test, correcting for LD among the n key SNPs.
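The three steps in the Figure 1 caption can be sketched in Python under strong simplifying assumptions. This is not the authors' implementation. The block-level step uses the plain Simes procedure rather than GATES, which additionally adjusts for the effective number of independent SNPs within a block. The combining step uses Fisher's method, which a scaled chi-square approximation is generally expected to match only when the key SNPs are independent (LD r² = 0), the case the caption describes for distant blocks. The function names and the p values are hypothetical.

```python
import math


def simes_block(pvalues):
    """Plain Simes p value for one block, plus the index of the key SNP.

    Assumption: a stand-in for GATES, without the effective-number-of-tests
    correction for LD inside the block. p values must lie in (0, 1].
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    best_p, key = 1.0, order[0]
    for rank, idx in enumerate(order, start=1):
        adjusted = min(1.0, m * pvalues[idx] / rank)
        if adjusted < best_p:
            best_p, key = adjusted, idx
    return best_p, key


def fisher_combine(block_pvalues):
    """Fisher's combination of n independent p values.

    Uses the closed-form chi-square tail for 2n degrees of freedom:
    P(X > x) = exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!
    Assumption: key SNPs are independent (r² = 0); no LD scaling is applied.
    """
    n = len(block_pvalues)
    half_stat = -sum(math.log(p) for p in block_pvalues)  # equals x / 2
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half_stat / k
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def hyst_sketch(blocks):
    """Step 1 is the caller's choice of blocks; steps 2 and 3 run here."""
    results = [simes_block(block) for block in blocks]
    block_ps = [p for p, _ in results]
    key_snps = [key for _, key in results]
    return fisher_combine(block_ps), block_ps, key_snps


if __name__ == "__main__":
    # Hypothetical SNP-level p values for two distant (independent) blocks.
    blocks = [[0.01, 0.5], [0.03, 0.2, 0.04]]
    set_p, block_ps, key_snps = hyst_sketch(blocks)
    print(set_p, block_ps, key_snps)
```

When the key SNPs of different blocks are in LD, the combining step would need the scaling correction named in the caption; that correction is deliberately left out of this sketch.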
Figure 2
Box-Plots of Empirical Type 1 Errors and Statistical Powers of Various PPI-Based Association Tests. (A–C) Empirical type 1 error (A), power under alternative hypothesis A (B), and power under alternative hypothesis B (C) for the various PPI-based association tests. Detailed descriptions of the two alternative hypotheses are available in the Material and Methods section. Note that Fisher’s and Stouffer’s methods were excluded from the power comparison because of inflated type 1 error rates. Fisher: Fisher’s combination test combining two gene-based p values calculated by GATES; Stouffer: Stouffer’s Z transform method combining two gene-based p values calculated by GATES. The following abbreviations are used: W.HYST, HYST with arbitrary weights (1:5 for SLC3A1:CAMKMT); ScaleChi, scaled chi-square test; LKM, logistic kernel machine test.
Figure 3
Quantile-Quantile Plots of the PPI-Based p Values Calculated by HYST in a GWAS Simulated under the Null Hypothesis. (A and B) The LD information was calculated from the actual genotypes of the subjects (A) or taken from the ancestry-matched HapMap population (CHB) LD data (B). Higgins’s I² ≤ 0.5 was used to remove PPI pairs where two genes are significantly different in effect. The straight line represents the distribution of p values under the null hypothesis, and the dotted lines represent estimated 95% confidence bands.
Figure 4
Network Views of Significant PPI Pairs in the Applications of HYST to Three GWAS Data Sets. (A–C) Results of the PPI-based association analyses are shown for the GWAS data sets of (A) CD, (B) T2D, and (C) RA. Higgins’s I² ≤ 0.5 was used to remove PPI pairs where two genes are significantly different in effect. Each node and edge represents a gene (protein) and a PPI, respectively. Genes significant in the SNP-based analysis, the gene-based analysis, or both are colored gray.
