Nucleic Acids Res. 2011 Nov 1;39(20):8677-88.
doi: 10.1093/nar/gkr593. Epub 2011 Jul 23.

Transcriptional gene network inference from a massive dataset elucidates transcriptome organization and gene function

Vincenzo Belcastro et al. Nucleic Acids Res. 2011.

Abstract

We collected a massive and heterogeneous dataset of 20 255 gene expression profiles (GEPs) from a variety of human samples and experimental conditions, as well as 8895 GEPs from mouse samples. We developed a mutual information (MI) reverse-engineering approach to quantify the extent to which the mRNA levels of two genes are related to each other across the dataset. The resulting networks consist of 4 817 629 connections among 20 255 transcripts in human and 14 461 095 connections among 45 101 transcripts in mouse, with an inter-species conservation of 12%. The inferred connections were compared against known interactions to assess their biological significance. We experimentally validated a subset of previously undescribed protein-protein interactions. We discovered co-expressed modules within the networks, consisting of genes strongly connected to each other, which carry out specific biological functions and tend to be in physical proximity at the chromatin level in the nucleus. We show that the network can be used to predict the biological function and subcellular localization of a protein, and to elucidate the function of a disease gene. We experimentally verified that the granulin precursor (GRN) gene, whose mutations cause frontotemporal lobar degeneration, is involved in lysosome function. We have developed an online tool to explore the human and mouse gene networks.
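As a rough illustration of what an MI score between two genes measures, the sketch below computes a plug-in mutual information estimate from two expression profiles after equal-width binning. The binning scheme, the number of bins, and the function names `discretize` and `mutual_information` are assumptions made for this example; the abstract does not specify the estimator the authors used.

```python
import math
from collections import Counter


def discretize(values, n_bins=3):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant profile carries no information; put everything in bin 0.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=3):
    """Plug-in MI estimate (in bits) between two equally long profiles."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of samples")
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    mi = 0.0
    for (i, j), count in pxy.items():
        # p(i, j) * log2( p(i, j) / (p(i) * p(j)) ), written with raw counts.
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi


# Toy profiles (made-up values): a gene compared with itself, and two
# profiles that are statistically independent after binning.
same = mutual_information([0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1], n_bins=2)
```

In this toy setting a profile compared with itself yields the entropy of its binned values, while independent profiles yield zero. A real analysis would also need a significance threshold for deciding which MI values count as connections.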

Figures

Figure 1.
Subnetworks obtained by collecting the top 1000 connections with the highest MI within the network. Subnetwork (a) contains genes that encode the ‘ribosomal protein complex’ (‘Translation’, P < 1.0 × 10^−113). Subnetwork (b) is enriched for genes involved in the ‘spindle checkpoint’ (‘Nuclear division’, P < 3.1 × 10^−7); for clarity, only a subset of interactions is shown. Subnetwork (c) is enriched for ‘metallothionein’ genes, a family of low-molecular-weight, heavy-metal-binding proteins. Subnetwork (d) contains major histocompatibility complex proteins (‘Antigen processing and presentation’, P < 7.5 × 10^−16). Pairs of genes are connected if their MI (a probabilistic measure of relatedness) indicates significant co-regulation.
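The selection described in this caption (keep the highest-MI connections, then look at the subnetworks they form) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `top_k_subnetworks`, the edge-list format, and the use of connected components to define a subnetwork are all assumptions, and the example edges use placeholder gene labels and made-up MI values.

```python
import heapq
from collections import defaultdict


def top_k_subnetworks(edges, k):
    """Keep the k edges with the highest MI and return the connected
    components they form, largest first.

    edges: iterable of (gene_a, gene_b, mi) tuples.
    """
    top = heapq.nlargest(k, edges, key=lambda edge: edge[2])

    # Build an undirected adjacency map from the retained edges only.
    adjacency = defaultdict(set)
    for gene_a, gene_b, _ in top:
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)

    # Depth-first search to collect each connected component.
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(adjacency[gene] - component)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Placeholder gene labels and made-up MI values, for illustration only.
toy_edges = [("A", "B", 0.9), ("B", "C", 0.8), ("D", "E", 0.7), ("E", "F", 0.1)]
subnetworks = top_k_subnetworks(toy_edges, k=3)
```

With `k=3` the weakest edge is dropped, so gene `F` does not appear in any subnetwork. In the caption's setting `k` would be 1000.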
Figure 2.
Modular structure of the network. Adjacency matrix of the network before (a) and after (b) the hierarchical clustering procedure used to identify communities. Each dot represents a connection between two genes, that is, a matrix entry whose MI is greater than the significance threshold. (a) Genes are sorted according to their chromosomal location. Numbers on the x and y axes indicate chromosomes. (b) Genes are sorted according to the community they belong to. Square dimensions are proportional to the number of genes in each community. The inset shows an enlargement of an area of the adjacency matrix where single dots are visible.
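To make the two panels concrete, the sketch below thresholds an MI matrix into an adjacency matrix and then reorders genes by community label, which is what makes community blocks appear along the diagonal. It assumes the community labels are already available; the hierarchical clustering step itself is not reproduced here. The function names and the toy matrix are hypothetical.

```python
def adjacency_from_mi(mi, threshold):
    """Return a 0/1 adjacency matrix: 1 where MI exceeds the threshold.

    The diagonal is set to 0 so that a gene is never connected to itself.
    """
    n = len(mi)
    return [
        [1 if i != j and mi[i][j] > threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def reorder_by_community(adjacency, communities):
    """Sort genes by community label and permute rows and columns to match.

    communities[i] is the (precomputed) community label of gene i.
    Returns the new gene order and the permuted adjacency matrix.
    """
    order = sorted(range(len(communities)), key=lambda i: (communities[i], i))
    permuted = [[adjacency[i][j] for j in order] for i in order]
    return order, permuted


# Toy MI matrix (made-up values): genes 0 and 2 are related, as are 1 and 3.
toy_mi = [
    [0.0, 0.1, 0.9, 0.1],
    [0.1, 0.0, 0.1, 0.8],
    [0.9, 0.1, 0.0, 0.1],
    [0.1, 0.8, 0.1, 0.0],
]
adjacency = adjacency_from_mi(toy_mi, threshold=0.5)
order, blocks = reorder_by_community(adjacency, communities=[0, 1, 0, 1])
```

After reordering, the connected pairs sit next to each other, so the matrix becomes block-diagonal, analogous to the squares visible in panel (b).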
Figure 3.
Community-wise network. Each node is a community. A color and a number identify each rich-club (i.e. a group of highly interconnected communities). The width of each edge reflects the IS between communities. ‘Exemplars’ are indicated by triangles. Examples of rich clubs: (a) communities of genes involved in ‘intracellular trafficking’; (b) communities involved in the ‘extracellular matrix maintenance, and cell mobility’; (c) communities involved in ‘immune response’; (d) communities of genes involved in house-keeping functions: ‘gene expression’ (rich-club 197), ‘translation’ (rich-club 246), ‘RNA processing’ (rich-club 2) and ‘cell cycle’ (rich-club 7).
Figure 4.
Genes that are co-regulated tend to be physically close at the 3D chromatin level. (a) Connection tendency matrix of chromosome 19. Grey stripes highlight regions with no probes designed for the microarray model HG-U133A. A red color indicates two different 1 Mb loci whose genes are strongly connected to each other. (b) Physical contact matrix of chromosome 19. Grey stripes highlight chromosomal regions where centromeres are located, plus unalignable regions. A red color indicates two different 1 Mb loci that are physically close to each other at the chromatin level. Physically close regions may also contain genes that are not co-expressed and vice versa: region (I) in (a) has an opposite tendency with respect to the corresponding region (I) in (b). This means that regions that are not in physical contact may contain genes that are co-expressed. The opposite can also be true: for example, region (II) shows that loci physically interacting with each other do not necessarily contain genes that are co-expressed.
Figure 5.
GRN is involved in lysosomal function. (a) Expression levels of GRN and CTSD increase in HeLa cells following treatment with sucrose, a known inducer of lysosomes, as measured by real-time qPCR. (b) Expression levels of GRN and CTSD increase in HeLa cells following over-expression of TFEB, a transcription factor known to regulate lysosome biogenesis. (c) Immunofluorescence with anti-LAMP1 and anti-LAMP2 antibodies, used as lysosomal markers, of transfected HeLa cells over-expressing GRN or EGFP, and of HeLa cells grown in medium collected after GRN or EGFP over-expression.
Figure 6.
Electron microscopy of HeLa cells overexpressing GRN reveals an increase in lysosome size. Electron microscopy of HeLa cells after GRN or EGFP over-expression. Lysosomes (indicated in EM images by arrows) appear to be larger in cells that overexpress GRN. Morphometric analysis of lysosome diameter (mean ± SD; n = 0 cells) confirms the increase in lysosome size in GRN-expressing cells.
