BMC Genomics. 2016 Aug 18;17 Suppl 4(Suppl 4):543.
doi: 10.1186/s12864-016-2792-1.

EpiTracer - an algorithm for identifying epicenters in condition-specific biological networks

Narmada Sambaturu et al. BMC Genomics. 2016.

Abstract

Background: In biological systems, diseases are caused by small perturbations in a complex network of interactions between proteins. Perturbations typically affect only a small number of proteins, which go on to disturb a larger part of the network. To counteract this, a stress-response is launched, resulting in a complex pattern of variations in the cell. Identifying the key players involved in either spreading the perturbation or responding to it can give us important insights.

Results: We develop an algorithm, EpiTracer, which identifies the key proteins, or epicenters, from which a large number of changes in the protein-protein interaction (PPI) network ripple out. We propose a new centrality measure, ripple centrality, which measures how effectively a change at a particular node can ripple across the network. It does so by identifying the highest activity paths specific to the condition of interest, obtained by mapping gene expression profiles to the PPI network. We demonstrate the algorithm using an overexpression study and a knockdown study. In the overexpression study, the gene that was overexpressed (PARK2) was highlighted as the most important epicenter specific to the perturbation. The other top-ranked epicenters were involved in either supporting the activity of PARK2 or counteracting it. In addition, 5 of the identified epicenters showed no significant differential expression, showing that our method can find information that simple differential expression analysis cannot. In the second dataset (SP1 knockdown), alternative regulators of SP1 targets were highlighted as epicenters, and the gene that was knocked down (SP1) was picked up as an epicenter specific to the control condition. Sensitivity analysis showed that the genes identified as epicenters remain largely unaffected by small changes to the network.

Conclusions: We develop an algorithm, EpiTracer, to find epicenters in condition-specific biological networks, given the PPI network and gene expression levels. EpiTracer includes programs that extract the immediate influence zone of epicenters and provide a summary of dysregulated genes, facilitating quick biological analysis. We demonstrate its efficacy on two datasets with differing characteristics, highlighting its general applicability. We also show that EpiTracer is not sensitive to minor changes in the network. The source code for EpiTracer is available on GitHub ( https://github.com/narmada26/EpiTracer ).

Keywords: Condition-specific network; Influential nodes; Network mining; Perturbation analysis; Ripple centrality.

Figures

Fig. 1
The EpiTracer workflow. a Gene expression profiles of each condition are mapped onto the base PPI network. b Highest activity paths are calculated for each condition, and common paths are discarded, giving condition-specific highest activity paths (CSHAPs). c The networks induced by the CSHAPs form the condition-specific highest activity networks (CSHANs). d Nodes in the perturbed highest activity network are ranked according to ripple centrality. e The ranked list of nodes is split into two lists based on overlaps with the control highest activity network. The top 10 nodes in the list unique to the perturbed condition form the epicenters specific to the perturbation.
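Steps a to c of the workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the EpiTracer implementation: the edge cost `1 / (expr[u] * expr[v])` is an assumed way of making highly expressed interactions cheap, each least-cost source-to-target path is treated as a "highest activity path" without any score threshold, and all function names are hypothetical.

```python
import heapq

def edge_costs(edges, expr):
    """Map expression onto directed PPI edges.
    Assumed form: higher expression at both ends gives a lower cost."""
    return {(u, v): 1.0 / (expr[u] * expr[v]) for u, v in edges}

def least_cost_paths(costs, source):
    """Dijkstra from `source`; returns {target: path as a tuple of nodes}."""
    adj = {}
    for (u, v), c in costs.items():
        adj.setdefault(u, []).append((v, c))
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, c in adj.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    paths = {}
    for target in dist:
        if target == source:
            continue
        path = [target]
        while path[-1] != source:
            path.append(prev[path[-1]])
        paths[target] = tuple(reversed(path))
    return paths

def all_paths(edges, expr):
    """Set of least-cost paths between every reachable ordered node pair."""
    costs = edge_costs(edges, expr)
    nodes = {n for e in edges for n in e}
    return {p for s in nodes for p in least_cost_paths(costs, s).values()}

def condition_specific_paths(edges, expr_perturbed, expr_control):
    """Discard paths common to both conditions (step b of the workflow)."""
    return all_paths(edges, expr_perturbed) - all_paths(edges, expr_control)

# Toy data (invented): raising the expression of B reroutes A -> C through B.
edges = [("A", "B"), ("B", "C"), ("A", "C")]
control = {"A": 1.0, "B": 1.0, "C": 1.0}
perturbed = {"A": 1.0, "B": 10.0, "C": 1.0}
cshaps = condition_specific_paths(edges, perturbed, control)
```

The network induced by the surviving paths (the union of their edges) would then play the role of the condition-specific highest activity network in step c.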
Fig. 2
Illustration of ripple centrality. a Node Acl is the source of highly active paths and has high closeness centrality. However, it can reach only 4 nodes and is therefore not a good epicenter. b Node Aor can reach 14 nodes, but paths originating at Aor have low activity, so it is not a good epicenter either. c Node Arc is the source of highly active paths and can reach a large number of nodes (7), making it the best candidate for an epicenter. Hexagons represent candidate epicenters.
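The trade-off in this figure can be illustrated with a small sketch. The abstract does not state the formula for ripple centrality, so the product below (closeness over reachable nodes multiplied by the fraction of the network reached) is an assumption chosen to match the caption, and the edge costs in the toy graph are invented. Only the reach counts (4, 14, 7) come from the caption.

```python
import heapq

def distances_from(adj, source):
    """Dijkstra over a dict-of-dicts graph; returns costs to reachable nodes."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in adj.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    del dist[source]
    return dist

def ripple_centrality(adj, node, n_nodes):
    """Assumed form: closeness to reachable nodes times fraction of nodes reached."""
    dist = distances_from(adj, node)
    if not dist:
        return 0.0
    closeness = len(dist) / sum(dist.values())  # low path cost = high activity
    reach = len(dist) / (n_nodes - 1)
    return closeness * reach

# Toy star graphs: low cost stands for high activity (costs are invented).
adj = {
    "Acl": {f"c{i}": 1.0 for i in range(4)},    # active paths, reaches 4 nodes
    "Aor": {f"o{i}": 10.0 for i in range(14)},  # reaches 14 nodes, low activity
    "Arc": {f"r{i}": 1.0 for i in range(7)},    # active paths, reaches 7 nodes
}
n_nodes = 3 + 4 + 14 + 7
scores = {n: ripple_centrality(adj, n, n_nodes) for n in ("Acl", "Aor", "Arc")}
best = max(scores, key=scores.get)
```

Under these assumptions `best` is `"Arc"`: neither short paths alone (Acl) nor wide reach alone (Aor) is enough to score highest.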
Fig. 3
Case study 1 (PARK2 overexpression in human glioma cell line). Data corresponding to overexpression of PARK2 in human glioma cell line (U251). a Human PPI network comprising 10,306 nodes and 74,404 edges. Nodes colored red are upregulated upon perturbation, and nodes colored green are downregulated. (a1) Table of network properties. b Perturbation-specific *HAN (highest activity network), with network properties in table b1. c The 5 epicenters that were differentially expressed, along with their immediate neighbors. d List of epicenters specific to the perturbation as well as global epicenters.
Fig. 4
PARK2 influence zone. Detailed biological interpretation of the PARK2 influence zone. a The PARK2 influence zone consists of 118 nodes and 119 edges. Red nodes correspond to upregulated genes, and green nodes to downregulated genes. The epicenter is depicted using a hexagon. b GO enrichment of genes in the PARK2 influence zone shows that most genes are involved in cell-cycle regulation. c Nodes downstream of PARK2. d Mechanistic insights into cell-cycle dysregulation upon PARK2 overexpression.
Fig. 5
Case study 2 (SP1 knockdown in HeLa cell line). The influence zone of the top 10 epicenters was constructed from the condition-specific highest activity network and enriched with the targets of SP1 and their immediate neighbors. This network was pruned to retain only epicenters, SP1 targets, differentially expressed genes, and the genes connecting them. Nodes with a hexagonal shape represent epicenters, a golden border around a node indicates an SP1 target, and a pink border indicates a mediator gene. The rank of each epicenter is written next to it in red. a SP1 knockdown condition. 14 genes occur in the list of top 10 epicenters (5 genes correspond to rank 5). b Control condition. 50 genes correspond to the top 10 epicenters. 30 genes correspond to rank 7 and regulate MYC, a target of SP1. Similarly, 9 genes correspond to rank 2 and regulate CEBPB.
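The gene counts in this caption (14 genes within the top 10 ranks) suggest that nodes with equal scores share a rank. A minimal sketch of the final ranking and splitting step, assuming dense ranking with ties and assuming that nodes also present in the control network go into a separate list, might look like this. The function name and the toy scores are hypothetical.

```python
def split_epicenters(scores, control_nodes, top_ranks=10):
    """Rank nodes by score, split by overlap with the control network,
    and keep every node whose (tie-aware) rank is within `top_ranks`."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    specific = [n for n in ranked if n not in control_nodes]
    shared = [n for n in ranked if n in control_nodes]

    def top(nodes):
        out, rank, last = [], 0, None
        for n in nodes:
            if scores[n] != last:  # a new score value starts a new rank
                rank += 1
                last = scores[n]
            if rank > top_ranks:
                break
            out.append((rank, n))
        return out

    return top(specific), top(shared)

# Toy scores (invented): Q and R tie, so both receive rank 2.
scores = {"P": 0.9, "Q": 0.5, "R": 0.5, "S": 0.2, "T": 0.9}
specific, shared = split_epicenters(scores, control_nodes={"S", "T"}, top_ranks=2)
```

With ties handled this way, the number of genes returned can exceed `top_ranks`, which is consistent with the counts reported in the caption.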
