Review
Brief Bioinform. 2012 Jul;13(4):446-59.
doi: 10.1093/bib/bbr075. Epub 2012 Mar 5.

Network biology methods integrating biological data for translational science

Gurkan Bebek et al. Brief Bioinform. 2012 Jul.

Abstract

The explosion of biomedical data, genomic and proteomic as well as clinical, will require complex integration and analysis to provide new molecular variables for understanding the molecular basis of phenotype. Currently, much of these data exist in silos and are not analyzed in frameworks where all data are brought to bear on the development of biomarkers and novel functional targets. This is beginning to change. Network biology approaches, which emphasize the interactions between genes, proteins and metabolites, provide a framework for data integration such that genome, proteome, metabolome and other -omics data can be jointly analyzed to understand and predict disease phenotypes. This review identifies recent advances in network biology approaches and their results. A common theme is the potential for network analysis to provide multiplexed and functionally connected biomarkers for analyzing the molecular basis of disease, thus changing our approaches to analyzing and modeling genome- and proteome-wide data.

Figures

Figure 1:
Workflow for high-throughput data integration to help understand the molecular basis of cancer. The integrative -omics workflow for identifying signaling networks begins with processing tissue-specific data (instrument outputs). Microarray data are normalized so that expression levels can be compared, then transformed to select genes for further analysis. Genome-wide genotyping signals are analyzed to identify regions (and hence regional genes) for both tumor and normal tissue (or non-cancerous cells). Next, genomic regions with significant aberrations are merged with their corresponding microarray probes to create expression profiles. In this analysis step, the expression profiles are used to calculate Pearson coexpression correlations among gene pairs. These results are fed into the Pathway Analysis Framework. Pathways are predicted and filtered by integrating gene–gene coexpression values, annotations from GO, known signaling pathways, protein sequence information, PPI networks and protein subcellular co-localization data. Significant pathway subnetworks are merged to form signaling networks connecting genes of interest. The identified networks and genomic alterations are combined into a descriptive functional network that provides a molecular basis for the cancer studied. This type of workflow, which we utilized, can be applied in integrative systems biology approaches to study cancer and other pathologies [8].
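The coexpression step in this caption can be illustrated with a minimal sketch. It assumes each gene already has a normalized expression profile across the same samples; the gene names, the values and the function names `pearson` and `coexpression` are hypothetical and are not taken from the reviewed workflow.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression(profiles):
    """Map each unordered gene pair to the correlation of its two profiles."""
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }


# Hypothetical normalized expression values across four samples.
profiles = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}
scores = coexpression(profiles)
```

In this toy input GENE_A and GENE_B rise together while GENE_C falls, so the pairwise values sit at the two extremes of the correlation range. A profile with zero variance would need separate handling, which the sketch omits.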
Figure 2:
A data integration framework for using disparate -omic data sets together to identify functional sub-networks in complex phenotypes. Data and experimental procedures are shown on the upper panel. On the lower panel, inferred information is shown in solid boxes and computational algorithms are shown in dashed boxes; solid lines show the pipeline for identification of disease-associated sub-networks. This framework was used to identify PPI sub-networks dysregulated in late-stage colorectal cancer, revealing novel targets that are dysregulated at the post-translational level but were not captured by untargeted proteomic analysis [45].
Figure 3:
Network-based prioritization of candidate disease genes. (A) Flow chart for network-based prioritization algorithms: -omic data are shown by green ellipses (top left), clinical data are shown by purple ellipses (top right), intermediary data are shown by cyan ellipses (left and right bottom two ellipses), computational algorithms and statistical analyses are shown by boxes, and the overall outcome of the framework is shown by a red ellipse (top middle). (B) Key principles employed by prioritization algorithms: each panel shows part of a hypothetical PPI network, in which blue nodes (light grey) represent products of seed genes and red nodes (dark grey) represent products of candidate genes. Connectivity-based algorithms rank candidate genes based on their products' direct interactions with products of seed genes; information-flow-based algorithms rank candidate genes based on the multiplicity of network paths between their products and products of seed genes; topological-similarity-based algorithms rank candidate genes based on the similarity of their products' location in the PPI network to that of the products of seed genes.
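Two of the principles in panel (B) can be sketched on a toy network. The example below is a minimal illustration, not any specific published algorithm: the network, node names and function names are hypothetical, and a random walk with restart is used as one common stand-in for information-flow scoring. Topological similarity is not shown.

```python
def connectivity_scores(network, seeds, candidates):
    """Count each candidate's direct interactions with seed gene products."""
    return {c: len(network[c] & seeds) for c in candidates}


def random_walk_with_restart(network, seeds, restart=0.3, iterations=100):
    """Approximate the steady-state visit probabilities of a walker that
    returns to a uniformly chosen seed with probability `restart`."""
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in network}
    prob = dict(start)
    for _ in range(iterations):
        nxt = {n: restart * start[n] for n in network}
        for node, neighbors in network.items():
            # Each node passes its remaining mass equally to its neighbors.
            share = (1.0 - restart) * prob[node] / len(neighbors)
            for nb in neighbors:
                nxt[nb] += share
        prob = nxt
    return prob


# Hypothetical undirected PPI network stored as adjacency sets.
network = {
    "S1": {"C1", "C2"},
    "S2": {"C1"},
    "C1": {"S1", "S2"},
    "C2": {"S1", "C3"},
    "C3": {"C2"},
}
seeds = {"S1", "S2"}
candidates = ["C1", "C2", "C3"]

connectivity = connectivity_scores(network, seeds, candidates)
flow = random_walk_with_restart(network, seeds)
ranked = sorted(candidates, key=lambda c: flow[c], reverse=True)
```

Here C3 has no direct seed partner, so connectivity alone gives it a score of zero, while the random walk still assigns it a small score through its path via C2. The sketch assumes every node has at least one neighbor.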
Figure 4:
Workflow for network detection. Networks are identified by jActiveModules using P-values from a GWAS (see text). Briefly, each gene is assigned a P-value based on the P-value of each SNP from the GWAS; the gene P-values are then superimposed on the human PPI interactome derived from HPRD; finally, Cytoscape and jActiveModules are used to identify the network that is enriched with significant P-values. Node color represents the P-value, and gray nodes indicate that the P-value is missing from the GWAS.
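The scoring idea behind this workflow can be sketched as follows. The caption does not say how SNP P-values are combined into a gene P-value, so taking the minimum over a gene's SNPs is an assumption made here for illustration. The aggregate score (sum of per-gene z-scores divided by the square root of the number of scored genes) follows the general form used for active-module scoring; the subnetwork search and any background correction are omitted. All identifiers and values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def gene_p_values(snp_p, snp_to_gene):
    """Assign each gene the smallest P-value among its SNPs (assumed rule)."""
    genes = {}
    for snp, p in snp_p.items():
        gene = snp_to_gene[snp]
        genes[gene] = min(p, genes.get(gene, 1.0))
    return genes


def aggregate_z(subnetwork, gene_p):
    """Sum of per-gene z-scores over sqrt(k); genes without a P-value are
    skipped. P-values must lie strictly between 0 and 1."""
    z = [NormalDist().inv_cdf(1.0 - gene_p[g]) for g in subnetwork if g in gene_p]
    return sum(z) / sqrt(len(z)) if z else 0.0


# Hypothetical GWAS results and SNP-to-gene mapping.
snp_p = {"snp1": 0.001, "snp2": 0.04, "snp3": 0.5, "snp4": 0.9}
snp_to_gene = {"snp1": "G1", "snp2": "G1", "snp3": "G2", "snp4": "G3"}

gene_p = gene_p_values(snp_p, snp_to_gene)

# G4 has no GWAS P-value, like the gray nodes in the figure.
hit_score = aggregate_z(["G1", "G2", "G4"], gene_p)
other_score = aggregate_z(["G2", "G3"], gene_p)
```

A subnetwork containing the strongly associated gene G1 scores higher than one made up of weakly associated genes, which is the property a module search would exploit.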
Figure 5:
Biochemical networks for personalized medicine. Biochemical reaction networks are rooted in the mechanistic interactions that comprise biological pathways; as such, these networks are carefully constructed from a wealth of genomic and metabolic databases, as well as from detailed experimental and literature data. Once networks have been constructed and curated—to ensure mass and charge balance, and to minimize gaps in connectivity—they serve as a powerful platform for interpreting high-throughput data. Not only can the network provide functional pathway context for genetic, transcriptomic or other perturbations, but through constraint-based modeling, these perturbations can be directly related to emergent phenotypes. Incorporating information from transcriptional regulation and intracellular signaling can lead to improved ability of the model to replicate in vitro and in vivo conditions. The iterative process of generating and experimentally testing simulated predictions leads to a refined and accurate model that holds great promise for facilitating personalized medicine.
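The caption does not specify a constraint-based formulation. As one common instance, flux balance analysis can be written as the linear program below. The symbols are assumptions introduced for illustration: $S$ is the stoichiometric matrix of the curated network, $v$ the vector of reaction fluxes, $c$ the weights of the objective (for example, a biomass reaction), and $v_{\min}$, $v_{\max}$ the flux bounds.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
```

Under this formulation, a perturbation such as a gene knockout is typically represented by tightening the bounds of the affected reactions, for instance by setting both bounds to zero, and re-solving to see how the predicted phenotype changes.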

References

    1. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245–6.
    2. Gavin AC, Bosche M, Krause R, et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002;415(6868):141–7.
    3. Ito T, Chiba T, Ozawa R, et al. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA. 2001;98(8):4569–74.
    4. Hartman JLt, Garvik B, Hartwell L. Principles for the buffering of genetic variation. Science. 2001;291(5506):1001–4.
    5. Ho Y, Gruhler A, Heilbut A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002;415(6868):180–3.
