PLoS One. 2013 Oct 25;8(10):e78577.
doi: 10.1371/journal.pone.0078577. eCollection 2013.

Integrative pathway-based approach for genome-wide association studies: identification of new pathways for rheumatoid arthritis and type 1 diabetes

Finja Büchel et al. PLoS One. 2013.

Abstract

Genome-wide association studies (GWAS) have led to the identification of numerous novel loci for a number of complex diseases. Pathway-based approaches using genotypic data provide tangible leads that cannot be identified by the single-marker approaches implemented in GWAS. The available pathway analysis approaches differ mainly in the databases they employ and in the statistics they apply to determine the significance of the associated disease markers. So far, pathway-based approaches using GWAS data have failed to consider the overlap of genes among different pathways or the influence of protein interactions. We performed a multistage integrative pathway (MIP) analysis on three common diseases, Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D), incorporating genotypic, pathway, protein-interaction and domain-interaction data to identify novel associations between these diseases and pathways. Additionally, we assessed the sensitivity of our method by removing the most significant SNPs and comparing the resulting pathway analysis results with the original ones. Apart from confirming many previously published associations between pathways and RA, CD and T1D, our MIP approach identified three new associations between disease phenotypes and pathways: a relation between the influenza-A pathway and RA, and relations between T1D and the phagosome and toxoplasmosis pathways. These results provide new leads for understanding the molecular underpinnings of these diseases. The software developed for and used in this study is available at http://www.cogsys.cs.uni-tuebingen.de/software/GWASPathwayIdentifier/index.htm.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Consistency of pathway sets generated with the proposed analysis methods.
The procedure described in this manuscript includes multiple analysis methods to identify significant pathways that are related to the phenotype of a given GWAS. This diagram shows which analysis methods determined the consistent pathway sets listed in Table 2. On average, 61% of the sets are determined with the interaction methods. In contrast, the characteristic interaction methods identified only less significant pathway sets. In conclusion, it is more important to focus on the interaction-based methods when identifying important SNP sets.
Figure 2. Established analysis pipeline for a multistage integrative pathway analysis.
The analysis consists of three steps: 1) construction of the SNP data structure, 2) creation of the pathway sets, and 3) evaluation of these sets and determination of the best ones. The data structure is built by mapping all SNPs to their corresponding genes. Depending on the domain interactions of the gene's proteins, each gene is assigned to a gene interaction class, which describes the number of interactions and the interaction confidence of the encoded proteins. For this purpose, information from UniProt, Pfam, DOMINE, and KEGG is used. Additionally, the KEGG pathways of the genes are determined. In the second step, four different pathway sets are built (for more details see Figure 3). In the final step, these sets are statistically evaluated with a variation of Fisher's test statistic. Since several pathway sets are built for each pathway, a best list is determined that contains exactly one set per pathway, and only sets with a p-value less than or equal to 0.05.
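The final selection step can be illustrated with a minimal sketch. The caption states only that the best list holds one set per pathway with a p-value of at most 0.05; it does not say how that set is chosen when several qualify. The sketch below assumes the set with the smallest p-value wins. The function name `best_list`, the tuple layout, and the example values are hypothetical and are not taken from the authors' software.

```python
def best_list(results, alpha=0.05):
    """Pick one pathway set per pathway from already computed p-values.

    results: iterable of (pathway, set_name, p_value) tuples.
    Returns a dict mapping pathway -> (set_name, p_value).
    Assumption: among the sets with p_value <= alpha, the one with
    the smallest p-value is kept (the source does not state the rule).
    """
    best = {}
    for pathway, set_name, p_value in results:
        if p_value > alpha:
            continue  # set does not reach the significance threshold
        if pathway not in best or p_value < best[pathway][1]:
            best[pathway] = (set_name, p_value)
    return best


# Hypothetical p-values for two pathways and several set types.
example_results = [
    ("pathway_x", "ultra", 0.01),
    ("pathway_x", "low", 0.03),
    ("pathway_y", "high", 0.20),
]
selected = best_list(example_results)
```

Pathways whose sets all exceed the threshold, such as `pathway_y` in this made-up input, do not appear in the result.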
Figure 3. Definition of the interaction pathway sets.
We built four interaction pathway sets, ultra (yellow), high (red), medium (blue) and low (green), depending on the interaction classes of the genes. The interaction classes are EV (experimentally validated), HC (high interaction confidence), MC (medium interaction confidence) and LC (low interaction confidence). The low interaction set is the superset of all interaction sets because it includes genes of all interaction classes. In contrast, the ultra set is the smallest and contains only the genes of the EV class.
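The nesting of the four interaction sets can be sketched as cumulative thresholds over the class order EV, HC, MC, LC. The captions confirm that ultra holds only EV genes, that high holds EV and HC genes, and that low holds all classes; that medium holds EV, HC and MC is an assumption made here by analogy. The names `CLASS_ORDER`, `SET_THRESHOLD` and `interaction_sets`, and the gene labels, are hypothetical.

```python
# Interaction classes ordered from strongest to weakest evidence.
CLASS_ORDER = ["EV", "HC", "MC", "LC"]

# Weakest class admitted by each set. The "medium" entry is an
# assumption; the other three follow the figure captions.
SET_THRESHOLD = {"ultra": "EV", "high": "HC", "medium": "MC", "low": "LC"}


def interaction_sets(gene_classes):
    """Build the nested interaction sets for one pathway.

    gene_classes: dict mapping gene -> interaction class.
    Returns a dict mapping set name -> set of genes.
    """
    sets = {}
    for name, threshold in SET_THRESHOLD.items():
        # Admit every class up to and including the threshold class.
        allowed = set(CLASS_ORDER[: CLASS_ORDER.index(threshold) + 1])
        sets[name] = {g for g, c in gene_classes.items() if c in allowed}
    return sets


# Hypothetical genes, one per interaction class.
example_classes = {"geneA": "EV", "geneB": "HC", "geneC": "MC", "geneD": "LC"}
example_sets = interaction_sets(example_classes)
```

With this construction each set is a subset of the next weaker one, so the low set is automatically the superset of the others.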
Figure 4. Pathway set creation example.
This example shows how the different pathway sets are built for a given pathway x. The pathway x is depicted as a blue rectangle and the genes 1 to 5 as orange rectangles. For the pathway x, six different pathway sets are created. First, the pathway set containing the SNPs of all genes occurring in pathway x. Second, the characteristic pathway set, which contains only the SNPs of those genes occurring exclusively in pathway x, i.e., genes 2, 3, 4 and 5. Since gene 1 also shows up in pathways y and z, the SNPs of this gene are not considered in the characteristic pathway set. Third, two pathway interaction sets are created: first, the ultra set for the SNPs of the genes assigned to the EV interaction class and second, the high set with SNPs of genes assigned to the HC and EV classes. Finally, the characteristic interaction pathway sets are generated. These sets are built similarly to the interaction pathway sets, but they contain only the SNPs of genes occurring exclusively in pathway x. In contrast to the ultra interaction set, the ultra characteristic interaction set does not include the SNPs 1, 2 and 3 because the corresponding gene also occurs in pathways y and z.
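The difference between the full pathway set and the characteristic pathway set can be sketched as follows. The example mirrors the caption: gene 1 carries SNPs 1, 2 and 3 and also occurs in pathways y and z, while genes 2 to 5 occur only in pathway x. The function name `pathway_snp_sets`, the data layout, and the SNP labels for genes 2 to 5 are hypothetical.

```python
from collections import Counter


def pathway_snp_sets(pathway, pathway_genes, gene_snps):
    """Return (full, characteristic) SNP sets for one pathway.

    pathway_genes: dict mapping pathway -> collection of genes.
    gene_snps: dict mapping gene -> collection of SNP identifiers.
    The characteristic set keeps only SNPs of genes that occur in
    no pathway other than the given one.
    """
    # Count in how many pathways each gene occurs (once per pathway).
    counts = Counter(g for genes in pathway_genes.values() for g in set(genes))
    genes = set(pathway_genes[pathway])
    exclusive = {g for g in genes if counts[g] == 1}
    full = {s for g in genes for s in gene_snps.get(g, ())}
    characteristic = {s for g in exclusive for s in gene_snps.get(g, ())}
    return full, characteristic


# Layout from the caption; SNP labels 4 to 7 are made up.
example_pathways = {
    "x": {"gene1", "gene2", "gene3", "gene4", "gene5"},
    "y": {"gene1"},
    "z": {"gene1"},
}
example_snps = {
    "gene1": {"snp1", "snp2", "snp3"},
    "gene2": {"snp4"},
    "gene3": {"snp5"},
    "gene4": {"snp6"},
    "gene5": {"snp7"},
}
full_x, characteristic_x = pathway_snp_sets("x", example_pathways, example_snps)
```

The characteristic interaction sets described in the caption would apply the same exclusivity filter before grouping genes by interaction class.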
