PLoS One. 2012;7(10):e44631.
doi: 10.1371/journal.pone.0044631. Epub 2012 Oct 2.

Integrating constitutive gene expression and chemoactivity: mining the NCI60 anticancer screen

David G Covell. PLoS One. 2012.

Abstract

Studies into the genetic origins of tumor cell chemoactivity pose significant challenges to bioinformatic mining efforts. Connections between measures of gene expression and chemoactivity have the potential to identify clinical biomarkers of compound response, cellular pathways important to efficacy, and potential toxicities, all of which are vital to anticancer drug development. An investigation has been conducted that jointly explores tumor-cell constitutive NCI60 gene expression profiles and small-molecule NCI60 growth inhibition chemoactivity profiles, viewed from novel applications of self-organizing maps (SOMs) and pathway-centric analyses of gene expressions, to identify subsets of over- and under-expressed pathway genes that discriminate chemo-sensitive and chemo-insensitive tumor cell types. Linear Discriminant Analysis (LDA) is used to quantify the accuracy of discriminating genes to predict tumor cell chemoactivity. LDA results find 15% higher prediction accuracies, using ∼30% fewer genes, for pathway-derived discriminating genes when compared to genes derived using conventional gene expression-chemoactivity correlations. The proposed pathway-centric data mining procedure was used to derive discriminating genes for ten well-known compounds. Discriminating genes were further evaluated using gene set enrichment analysis (GSEA) to reveal a cellular genetic landscape, composed of small numbers of key over- and under-expressed on- and off-target pathway genes, as important for a compound's tumor cell chemoactivity. Literature-based validations are provided as support for chemo-important pathways derived from this procedure. Qualitatively similar results are found when using gene expression measurements derived from different microarray platforms. The data used in this analysis are available at http://pubchem.ncbi.nlm.nih.gov/ and http://www.ncbi.nlm.nih.gov/projects/geo (GPL96, GSE32474).
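
As a rough illustration of how LDA can score the accuracy of a set of discriminating genes, the sketch below reduces the problem to a single feature per tumor cell. With one feature, equal class priors, and a pooled variance, two-class LDA reduces to a threshold at the midpoint between the class means. The feature values, the labels, the function names, and the use of leave-one-out validation are all assumptions made for this example; the abstract does not state the validation scheme or the exact features used.

```python
from statistics import mean

def lda_1d_predict(train_x, train_y, x):
    """Two-class LDA on one feature, assuming equal priors and a pooled
    variance: the decision boundary is the midpoint of the class means."""
    m_sens = mean(v for v, y in zip(train_x, train_y) if y == "sensitive")
    m_ins = mean(v for v, y in zip(train_x, train_y) if y == "insensitive")
    midpoint = (m_sens + m_ins) / 2
    # Assign x to the class whose mean lies on the same side of the midpoint.
    if (x - midpoint) * (m_sens - midpoint) > 0:
        return "sensitive"
    return "insensitive"

def leave_one_out_accuracy(xs, ys):
    """Fraction of cells classified correctly when each is held out in turn."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += lda_1d_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Hypothetical per-cell feature, e.g. mean expression of discriminating genes.
toy_scores = [0.1, 0.3, 0.2, 1.1, 1.3, 1.2]
toy_labels = ["insensitive"] * 3 + ["sensitive"] * 3
toy_accuracy = leave_one_out_accuracy(toy_scores, toy_labels)
```

A real analysis would use many genes per cell and a multivariate LDA; the one-feature case is shown only because it makes the decision rule easy to inspect.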

Conflict of interest statement

Competing Interests: The author has declared that no competing interests exist.

Figures

Figure 1. Schematic for calculating H-scores.
For each pathway, P, select genes in P (ovals at left) and not in P (squares in middle). Calculate a Pearson correlation coefficient between all gene expressions and each SOM GI50 profile (designated as ‘drug’ in the central square). Use a Kruskal-Wallis (K-W) statistic to determine if the rankings of the pathway genes (ovals at left) versus non-pathway genes (squares in the middle) are significantly skewed towards extreme values. The H-score for each pathway is derived from the K-W statistic. Cases where the average of correlations associated with pathway genes is greater than the average of correlations for non-pathway genes characterize positive H-scores. Positive and significant (p < 0.05) H-scores reflect pathways with coordinated (as opposed to random) gene expression-chemoactivity correlations. H-scores for each pathway (n = 2160) are determined for all SOM GI50 profiles (n = 1998).
Figure 2. SOM GI50 projection of H-scores for the GO:proteasome pathway (GO:0005839, left panel) and the GO:binding pathway (GO:0005488, right panel).
The SOM GI50 is represented as a two-dimensional map of 54 rows by 37 columns, corresponding to 1998 clusters. Pathway H-scores are projected spectrally on the SOM GI50 (red: best H-score, blue: worst H-score). The lower left region of the leftmost SOM GI50 has the cluster containing camptothecin (CPT). This corresponds to the node with the highest H-score for the GO:proteasome pathway. The GO:binding pathway consists of genes associated primarily with organic acid transport, particularly into the mitochondrion. The majority of NCI60 screened compounds associated with the GO:0005488 SOM GI50 nodes having the highest H-scores contain multiple carboxylate groups.
Figure 3. Illustration of the steps for trimming datasets.
Upper left panel: row-ordered (gene expressions) and column-ordered (tumor cells) gene set derived from the topmost 10th percentile of H-scores for this test example. Upper right panel: ordered chemoactivity profile (insensitive and sensitive tumor cells appear as + and − responses, respectively). Middle left panel: ordered gene expressions for the trimmed set of discriminating genes. Middle right panel: group-averaged differential in gene expressions for chemo-sensitive and chemo-insensitive tumor cells. Lower left panel: group-averaged differential in gene expressions for over- and under-expressed discriminating genes. Lower right panel: Pearson correlation values (light: positive, dark: negative) for discriminating gene expressions. Letters correspond to response classes; A: over-expressed/insensitive, B: under-expressed/sensitive, C: under-expressed/insensitive and D: over-expressed/sensitive.
Figure 4. Illustrations of Taxol’s discriminating genes.
The top panel displays Taxol’s chemoactivity profile (+: sensitive, −: insensitive) for its 25 discriminating genes, derived from the U133A dataset. The middle panel displays the relative gene expressions (light: over, dark: under) for these same tumor cells. The lower panel displays the Pearson correlation scores (positive: light, negative: dark) for Taxol’s discriminating genes.
Figure 5. Discriminating gene expressions for MALME-3 (top panel) and SN12C (bottom panel).
Over- and under-expressed genes correspond to upward and downward directed bars, respectively. Genes are divided from left to right into groups of 15 and 50 genes, respectively. An average over-expression of genes in the first group and under-expression of genes in the second group corresponds to MALME-3 chemo-insensitivity. The converse holds for SN12C, where an average under-expression of genes in the first group and over-expression of genes in the second group corresponds to chemo-sensitivity.

Grants and funding

The author has no support or funding to report.