2021 Jun 18;12(6):933.
doi: 10.3390/genes12060933.

A Methodological Framework to Discover Pharmacogenomic Interactions Based on Random Forests


Salvatore Fasola et al. Genes (Basel).

Abstract

The identification of genomic alterations in tumor tissues, including somatic mutations, deletions, and gene amplifications, produces large amounts of data, which can be correlated with a diversity of therapeutic responses. We aimed to provide a methodological framework to discover pharmacogenomic interactions based on Random Forests. We matched two databases, one from the Cancer Cell Line Encyclopedia (CCLE) project and one from the Genomics of Drug Sensitivity in Cancer (GDSC) project. For a total of 648 shared cell lines, we considered 48,270 gene alterations from CCLE as input features and the area under the dose-response curve (AUC) for 265 drugs from GDSC as the outcomes. A three-step reduction to 501 alterations was performed: selecting known driver genes, excluding very frequent or very infrequent alterations, and excluding redundant ones. For each model, we used the concordance correlation coefficient (CCC) to assess predictive performance and permutation importance to assess the contribution of each alteration. In a reasonable computational time (56 min), we identified 12 compounds whose response was at least fairly sensitive (CCC > 20) to the alteration profiles. Some differences were found among the sets of influential alterations, providing clues to discover significant drug-gene interactions. The proposed methodological framework can be helpful for mining pharmacogenomic interactions.
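The CCC used above to score each model can be illustrated with a minimal sketch. It assumes Lin's standard definition with population (1/n) variances, and it assumes that the threshold "CCC > 20" is expressed on a percent scale; neither detail is confirmed by the abstract. The function name and the AUC values are hypothetical and serve only as an illustration, not as the authors' implementation.

```python
from statistics import fmean


def concordance_correlation(observed, predicted):
    """Lin's concordance correlation coefficient for two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_obs = fmean(observed)
    mean_pred = fmean(predicted)
    # Population (1/n) variances and covariance, an assumption of this sketch.
    var_obs = sum((x - mean_obs) ** 2 for x in observed) / n
    var_pred = sum((y - mean_pred) ** 2 for y in predicted) / n
    cov = sum(
        (x - mean_obs) * (y - mean_pred) for x, y in zip(observed, predicted)
    ) / n
    # The squared mean difference penalises systematic shifts between the
    # two sequences, which a plain Pearson correlation would ignore.
    denom = var_obs + var_pred + (mean_obs - mean_pred) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are constant and equal")
    return 2 * cov / denom


# Hypothetical observed and predicted AUCs for five cell lines (not from the paper).
observed_auc = [0.90, 0.75, 0.60, 0.95, 0.80]
predicted_auc = [0.85, 0.80, 0.70, 0.90, 0.80]

# Percent scale, assumed here to match the "CCC > 20" threshold in the abstract.
ccc_percent = 100 * concordance_correlation(observed_auc, predicted_auc)
is_fairly_sensitive = ccc_percent > 20
```

Unlike a plain correlation, this coefficient drops below 1 when predictions are perfectly correlated with the observations but shifted or rescaled, which makes it a measure of agreement and not only of association.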

Keywords: Random Forests; cancer; cell lines; drug response; genomic alterations; pharmacogenomic interactions.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
(A) Alteration dataset with the 48,270 rows (alteration types, reported on the x-axis) and the 648 columns (cell lines, reported on the y-axis) in increasing order of alteration frequency. Black dots indicate altered cells. Frequencies (percentages) above the plot indicate the row positions at which those alteration frequencies are first reached; (B) Response dataset with the 265 rows (compounds, reported on the x-axis) in increasing order of sample size, and the 648 columns (cell lines, reported on the y-axis) in increasing order of alteration frequency. Grey dots indicate missing AUCs. The two frequencies (percentages) above the plot indicate the largest and smallest number of missing AUCs, respectively.
Figure 2
(A) Concordance correlation coefficient (CCC) between alteration variances/squared correlations before and after missing data removal, as a function of sample size; (B) Frequency of variances/correlations violating the thresholds set, as a function of sample size.
Figure 3
(A) Computational times elapsed as a function of sample size. The p-value is from linear regression (red line); (B) Distribution of the stability indicator through the 265 models.
Figure 4
(A) Concordance correlation coefficient (CCC) distribution through the 265 Random Forests; (B) CCC as a function of sample size; (C) CCC as a function of average compound AUC. The p-values are from linear regressions (red lines). Dashed lines correspond to the thresholds of no concordance (CCC = 0) and fair concordance (CCC = 20).
Figure 5
Graphical inspection of a drug-gene interaction involving the two compounds PLX4720 and Nutlin-3a, and the two alterations BRAF.V600E_MUT and TP53_MUT. Boxplots represent the median (central line), the mean (square), 25th–75th percentiles (box), and min-max non-outlier values (whiskers); p-values are from the t-test.
Figure 6
Graphical inspection of a drug-gene interaction involving the two compounds Dabrafenib and Afatinib (rescreen), and the two alterations BRAF.V600E_MUT and IKZF3_AMP. Boxplots represent the median (central line), the mean (square), 25th–75th percentiles (box), and min-max non-outlier values (whiskers); p-values are from the t-test.


