CPT Pharmacometrics Syst Pharmacol. 2015 Oct;4(10):576-84.
doi: 10.1002/psp4.12009. Epub 2015 Sep 29.

Relating Chemical Structure to Cellular Response: An Integrative Analysis of Gene Expression, Bioactivity, and Structural Data Across 11,000 Compounds

B Chen et al. CPT Pharmacometrics Syst Pharmacol. 2015 Oct.

Abstract

A central premise in systems pharmacology is that structurally similar compounds have similar cellular responses; however, this principle often does not hold. One of the most widely used measures of cellular response is gene expression. By integrating gene expression data from the Library of Integrated Network-based Cellular Signatures (LINCS) with chemical structure and bioactivity data from PubChem, we performed a large-scale correlation analysis of the chemical structures and gene expression profiles of over 11,000 compounds, taking into account confounding factors such as biological conditions (e.g., cell line, dose) and bioactivities. We found that structurally similar compounds do indeed yield similar gene expression profiles. There is an ∼20% chance that two structurally similar compounds (Tanimoto coefficient ≥ 0.85) share significantly similar gene expression profiles. Regardless of structural similarity, two compounds tend to share similar gene expression profiles in a cell line when they are administered at a higher dose or when the cell line is sensitive to both compounds.
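
The two similarity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the fingerprints are hypothetical sets of "on" bit indices (in practice they would come from a cheminformatics toolkit), the expression vectors are made-up numbers, and the function names `tanimoto` and `pearson` are chosen here for illustration. Only the 0.85 threshold is taken from the abstract.

```python
from math import sqrt


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical fingerprints (on-bit indices) and expression profiles.
fp_a = {1, 2, 3, 4, 5, 6, 7}
fp_b = {1, 2, 3, 4, 5, 6, 8}
expr_a = [0.2, -1.1, 0.8, 1.5, -0.3]
expr_b = [0.1, -0.9, 1.0, 1.2, -0.5]

structural = tanimoto(fp_a, fp_b)      # 6 shared bits out of 8 distinct bits
transcriptomic = pearson(expr_a, expr_b)
is_structurally_similar = structural >= 0.85  # threshold quoted in the abstract
```

Representing a fingerprint as a set of on-bit indices keeps the Tanimoto coefficient a direct ratio of intersection size to union size, independent of fingerprint length.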

Figures

Figure 1
(a) Workflow of comparing transcriptomic similarity and structural similarity; (b) an example of comparing structural similarity and transcriptomic similarity between testosterone and medroxyprogesterone; (c) similarity measures.
Figure 2
Structural similarity vs. transcriptomic similarity. The pairs are grouped according to their structural similarities. (a) The mean transcriptomic similarity within each group is plotted. Groups with fewer than 30 pairs are excluded. (b) The distribution of transcriptomic similarity within each group is plotted. ECFP4 and Pearson landmark are used for measuring structural and transcriptomic similarity, respectively.
Figure 3
Structural similarity vs. transcriptomic similarity across (a) cell lines, (b) doses, and (c) treatment durations. ECFP4 and Pearson landmark are used for measuring structural and transcriptomic similarity, respectively.
Figure 4
(a) Expression of landmark genes under different chemical perturbations in MCF7 after 6 hours of treatment at a 10 μM concentration. In the heatmap, each row is one landmark gene and each column is one compound, colored by bioactivity in MCF7. Bioactivity is measured by the growth inhibition rate in MCF7. Green represents active compounds and blue represents inactive compounds. Within the heatmap, red shows high expression and blue shows low expression. (b) Variation of gene expression for active and inactive compounds in MCF7, PC3, and A549. Variation is measured as the interquartile range of expression of landmark genes. (c) Transcriptomic similarity of pairs consisting of two active compounds, pairs consisting of two inactive compounds, and pairs consisting of one inactive and one active compound. Three cell lines (MCF7, PC3, and A549) are used.
Figure 5
Gene expression of “unexpected” pairs: (a) pancuronium and vecuronium; (b) testosterone and norethindrone; (c) vincristine and vindesine; and (d) idarubicin and doxorubicin. In each plot, transcriptomic similarity (Cor), structural similarity (ECFP4), cell line, dose, treatment duration, and a few highly differentially expressed genes are annotated.
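
The grouping described in the Figure 2 caption (pairs grouped by structural similarity, mean transcriptomic similarity per group, small groups left out) might be sketched as follows. The number of bins is an assumption made here, since the caption does not state the group boundaries; the function name `mean_by_structural_bin` and the input data are likewise hypothetical. Only the minimum of 30 pairs per group comes from the caption.

```python
from collections import defaultdict


def mean_by_structural_bin(pairs, n_bins=10, min_pairs=30):
    """Group (structural, transcriptomic) similarity pairs into equal-width
    structural-similarity bins on [0, 1] and return the mean transcriptomic
    similarity per bin index, skipping bins with fewer than min_pairs pairs."""
    groups = defaultdict(list)
    for structural, transcriptomic in pairs:
        if not 0.0 <= structural <= 1.0:
            raise ValueError("structural similarity must lie in [0, 1]")
        # A similarity of exactly 1.0 is placed in the last bin.
        index = min(int(structural * n_bins), n_bins - 1)
        groups[index].append(transcriptomic)
    return {
        index: sum(values) / len(values)
        for index, values in sorted(groups.items())
        if len(values) >= min_pairs
    }


# Hypothetical data: 30 pairs in the top bin, 29 pairs in the bottom bin.
pairs = [(0.95, 0.5)] * 30 + [(0.05, 0.25)] * 29
means = mean_by_structural_bin(pairs)
# Only bin 9 (structural similarity 0.9 to 1.0) has enough pairs to be reported.
```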
