PLoS One. 2019 Jul 1;14(7):e0219195.
doi: 10.1371/journal.pone.0219195. eCollection 2019.

GAIL: An interactive webserver for inference and dynamic visualization of gene-gene associations based on gene ontology guided mining of biomedical literature

Daniel Couch et al. PLoS One. 2019.

Abstract

In systems biology, inference of functional associations among genes is compelling because the construction of functional association networks facilitates biomarker discovery. Specifically, such gene associations in humans can help identify putative biomarkers that can be used as diagnostic tools in treating patients. Although biomedical literature is considered a valuable data source for this task, only a limited number of webservers are currently available for mining gene-gene associations from the vast amount of biomedical literature using text mining techniques. Moreover, these webservers often have limited coverage of biomedical literature and lack efficient, user-friendly tools to interpret and visualize mined relationships among genes. To address these limitations, we developed GAIL (Gene-gene Association Inference based on biomedical Literature), an interactive webserver that infers human gene-gene associations from Gene Ontology (GO) guided biomedical literature mining and provides dynamic visualization of the resulting association networks, along with various gene set enrichment analysis tools. We evaluate the utility and performance of GAIL with applications to gene signatures associated with systemic lupus erythematosus (SLE) and breast cancer. Results show that GAIL allows effective interrogation and visualization of gene-gene networks and their subnetworks, which facilitates biological understanding of gene-gene associations. GAIL is available at http://chunglab.io/GAIL/.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1. The analysis workflow of GAIL gene-gene association network query.
Because the network query currently supports only HGNC IDs, users can first map other gene symbols or synonyms to HGNC IDs using the ID Mapper (Step 1) and copy them to the clipboard (Step 2). Next, users enter these HGNC IDs and query the association network (Step 3). The gene-gene association network is then displayed (Step 4). Users can perform various types of network analysis and detect subgroups in the gene-gene association network (Step 5).
Fig 2
Fig 2. Distribution of cosine similarity values in the GAIL database.
The cosine similarity values corresponding to 50th, 95th, and 99th percentiles are also provided.
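As a rough illustration of what such a distribution summarizes, the sketch below computes pairwise cosine similarities and reads off the 50th, 95th, and 99th percentile cutoffs. It assumes each gene is represented by a numeric vector of GO-term scores; the caption does not say how GAIL builds these vectors, and the gene names, values, and helper functions here are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def percentile(values, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(values)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


# Toy GO-term score vectors for hypothetical genes (not GAIL data).
profiles = {
    "GENE_A": [3.0, 0.0, 1.0],
    "GENE_B": [2.5, 0.1, 0.9],
    "GENE_C": [0.0, 4.0, 0.2],
    "GENE_D": [0.1, 3.5, 0.0],
}

similarities = {
    (g1, g2): cosine_similarity(profiles[g1], profiles[g2])
    for g1, g2 in combinations(sorted(profiles), 2)
}
cutoffs = {q: percentile(list(similarities.values()), q) for q in (50, 95, 99)}
```

With a real database the percentile cutoffs would be computed over all gene pairs, which is what makes them usable as confidence thresholds.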
Fig 3
Fig 3. The gene-gene association networks produced by GAIL.
The networks produced by GAIL for the gene signatures associated with SLE (A: high confidence, B: moderate confidence) and breast cancer (C: high confidence, D: moderate confidence). Colors indicate clusters identified using the ‘Community Detection’ function in GAIL. Here, high and moderate confidence refer to networks constructed from the top 1% and top 5% most confident edges, respectively.
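The idea of keeping only the most confident edges can be sketched as follows. This is an illustration under assumptions, not GAIL's implementation: edges are assumed to carry a single confidence score, the gene names and scores are made up, and connected components are used as a crude stand-in for community detection because the caption does not name the algorithm behind the ‘Community Detection’ function.

```python
def top_fraction_edges(weighted_edges, fraction):
    """Keep the highest-scoring fraction of (gene_a, gene_b, score) edges."""
    ranked = sorted(weighted_edges, key=lambda edge: edge[2], reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def connected_components(edges):
    """Group genes that are linked by the retained edges (stand-in for
    community detection, which would normally split groups further)."""
    adjacency = {}
    for gene_a, gene_b, _score in edges:
        adjacency.setdefault(gene_a, set()).add(gene_b)
        adjacency.setdefault(gene_b, set()).add(gene_a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy scored edges between hypothetical genes (not GAIL output).
edges = [
    ("G1", "G2", 0.95), ("G2", "G3", 0.90), ("G4", "G5", 0.85),
    ("G1", "G3", 0.40), ("G3", "G4", 0.10), ("G5", "G6", 0.05),
]

# A 50% cutoff is used only because the toy network is tiny;
# the figure uses the top 1% and 5% of edges.
retained = top_fraction_edges(edges, 0.5)
groups = connected_components(retained)
```

A stricter cutoff yields a sparser network with more, smaller groups, which is the trade-off between the high-confidence and moderate-confidence panels.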
Fig 4
Fig 4. Downstream analysis using the lower-level data available in GAIL.
(A) K-means clustering result for the gene signatures associated with SLE and breast cancer, using the lower-level hypergeometric test p-values downloaded from the GAIL webserver. The clustering result is displayed on the first two principal components. Red circles, blue triangles, and black squares indicate cluster memberships predicted by the k-means clustering algorithm (assuming 3 clusters), while red and black text indicates the gene signatures associated with SLE and breast cancer, respectively. (B) Dendrogram of the genes associated with SLE, constructed by applying the hierarchical clustering algorithm to the lower-level hypergeometric test p-values downloaded from the GAIL webserver. Colors of gene names indicate the cluster memberships identified from the dendrogram when the number of clusters was set to five.
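A minimal sketch of this kind of downstream analysis is shown below, assuming the downloaded data form a gene-by-GO-term matrix of p-values. The matrix layout, the -log10 transform, the first-k initialization, and all gene names and values are assumptions made for illustration; the caption does not specify them, and the projection onto principal components is omitted.

```python
import math


def neg_log10(p_value, floor=1e-300):
    """Map a p-value to -log10 scale, guarding against log(0)."""
    return -math.log10(max(p_value, floor))


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm, initialized from the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it was
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy p-values for hypothetical genes across three GO terms (not GAIL data).
p_values = {
    "GENE_A": [1e-8, 0.5, 0.9],
    "GENE_C": [0.7, 1e-6, 0.6],
    "GENE_B": [1e-7, 0.4, 0.8],
    "GENE_D": [0.9, 1e-9, 0.5],
}
genes = list(p_values)
features = [[neg_log10(p) for p in p_values[g]] for g in genes]
labels, _centroids = kmeans(features, k=2)
membership = dict(zip(genes, labels))
```

In practice a library implementation with multiple random restarts would be preferable, since the first-k initialization used here is sensitive to row order.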
Fig 5
Fig 5. Flowchart of the term co-occurrence-based text mining approach used in GAIL.
Gene names, symbols, and aliases were integrated from HUGO, NCBI GenBank, and Ensembl. GO names and terms were collected from the Gene Ontology Consortium.
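To make the co-occurrence idea concrete, the sketch below counts abstracts that mention both a gene and a GO term and scores the overlap with a one-sided hypergeometric test. It is an illustration under assumptions: the naive substring matching, the choice of contingency counts, and the example abstracts are hypothetical and are not taken from the GAIL pipeline.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N items, K marked, n drawn)."""
    total = comb(N, n)
    favorable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favorable / total


def cooccurrence_pvalue(abstracts, gene_aliases, go_term):
    """Count gene/GO-term co-mentions and test them for enrichment.

    Matching is a case-insensitive substring search, which is far
    cruder than real entity recognition.
    """
    texts = [abstract.lower() for abstract in abstracts]
    has_gene = [any(alias.lower() in t for alias in gene_aliases) for t in texts]
    has_term = [go_term.lower() in t for t in texts]

    N = len(texts)                                    # all abstracts
    K = sum(has_term)                                 # abstracts with the GO term
    n = sum(has_gene)                                 # abstracts with the gene
    k = sum(g and t for g, t in zip(has_gene, has_term))  # abstracts with both
    return k, hypergeom_upper_tail(k, N, K, n)


# Made-up abstracts for illustration only.
abstracts = [
    "BRCA1 regulates DNA repair",
    "DNA repair in yeast",
    "BRCA1 and cell cycle control",
    "An unrelated abstract",
]
count, p_value = cooccurrence_pvalue(abstracts, ["BRCA1"], "DNA repair")
```

A small p-value suggests the gene and the GO term are mentioned together more often than chance would predict; a vector of such p-values across GO terms is one plausible input for the gene-gene similarity computation.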
