Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Jul;16(7):e9405.
doi: 10.15252/msb.20199405.

Drug mechanism-of-action discovery through the integration of pharmacological and CRISPR screens

Affiliations

Drug mechanism-of-action discovery through the integration of pharmacological and CRISPR screens

Emanuel Gonçalves et al. Mol Syst Biol. 2020 Jul.

Abstract

Low success rates during drug development are due, in part, to the difficulty of defining drug mechanism-of-action and molecular markers of therapeutic activity. Here, we integrated 199,219 drug sensitivity measurements for 397 unique anti-cancer drugs with genome-wide CRISPR loss-of-function screens in 484 cell lines to systematically investigate cellular drug mechanism-of-action. We observed an enrichment for positive associations between the profile of drug sensitivity and knockout of a drug's nominal target, and by leveraging protein-protein networks, we identified pathways underpinning drug sensitivity. This revealed an unappreciated positive association between mitochondrial E3 ubiquitin-protein ligase MARCH5 dependency and sensitivity to MCL1 inhibitors in breast cancer cell lines. We also estimated drug on-target and off-target activity, informing on specificity, potency and toxicity. Linking drug and gene dependency together with genomic data sets uncovered contexts in which molecular networks when perturbed mediate cancer cell loss-of-fitness and thereby provide independent and orthogonal evidence of biomarkers for drug development. This study illustrates how integrating cell line drug sensitivity with CRISPR loss-of-function screens can elucidate mechanism-of-action to advance drug development.

Keywords: CRISPR-Cas9; drug mechanism-of-action; protein networks.

PubMed Disclaimer

Conflict of interest statement

This work was funded in part by AstraZeneca. M.J.G. receives funding from AstraZeneca. M.J.G., F.I. and A.R.L. receive funding from Open Targets, a public–private initiative involving academia and industry. F.I. performs consulting for the CRUK‐AstraZeneca Functional Genomics Centre. J.T.L., B.S., C.C. and S.F. are current employees of AstraZeneca and hold stock in AstraZeneca.

Figures

Figure EV1
Figure EV1. Overview of the drug sensitivity data sets
  1. Histogram of the number of IC50 values measured per drug.

  2. Number of drugs measured per cell line in each pharmacological data set.

  3. PCA analysis of the drug response measurements separated by GDSC1 and GDSC2.

  4. Pearson correlation coefficient between each principal component (PC) and cell lines growth rate.

  5. Top absolutely correlated drugs with growth rate.

Figure EV2
Figure EV2. Overview of the CRISPR‐Cas9 data sets
  1. PCA analysis of the samples in the CRISPR‐Cas9 screens, samples institute of origin is highlighted.

  2. Correlation coefficients between all top 10 PCs and growth rate.

  3. Correlation between cell lines growth rate and PC3 (Pearson correlation coefficient reported in the top left).

Figure 1
Figure 1. Integration of drug and CRISPR gene dependencies in cancer cell lines
  1. Linear models were used to integrate drug sensitivity (IC50 values) and gene fitness measurements.

  2. Volcano plot showing the effect sizes and the P‐value for statistically significant associations, Benjamini–Hochberg false discovery rate (FDR)‐adjusted likelihood‐ratio test P‐value < 10%. Drug–gene associated pairs are coloured according to their shortest distance in a protein–protein interaction network of the gene to any of the nominal target of the drug.

  3. Percentage of the 358 drugs with significant associations and their shortest distance in the PPI network to the drug nominal targets. T represents drugs that have a significant association with at least one of their canonical targets, “−” represents no link was found, and X are those which have no significant association.

  4. Examples of the top drug response correlations with target gene fitness. Each point represents an individual cell line. MCL1_1284 and venetoclax are MCL1 and BCL2 selective inhibitors, respectively. Gene fitness log2 fold changes (FC) are scaled by using previously defined sets of essential (median scaled log2 FC = −1) and non‐essential (median scaled log2 FC = 0) genes. Drug response IC50 measurements are represented using the natural log (ln IC50).

  5. Kinobead affinity is significantly higher (lower pK d) for compounds with a significant association with their target (n = 20, Mann–Whitney P‐value = 3.1e‐07). Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

Figure EV3
Figure EV3. Drug response and gene fitness associations
  1. Total number of drugs utilised in the study and the different levels of information available: “All” represents all the drugs including replicates screened with different technologies (GDSC1 and GDSC2); “Unique” counts the number of unique drug names; “Annotated” shows the number of unique drugs with manual annotation of nominal targets; and “Target tested” represents the number of unique drugs, with target information, for which the target has been knocked‐out in the CRISPR‐Cas9 screens.

  2. Histogram of the drug–gene associations effect sizes (beta) highlighting drug–target associations.

  3. Distribution of the shortest path lengths between all the tested drug–gene pairs. For drugs with multiple targets the smallest shortest path of all the targets was taken. T represents the drug target and “−” represents no link was found.

  4. Distribution of number of drug targets found in an unsupervised way using the ChEMBL database. Drugs are grouped by having significant drug–target associations. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  5. Contingency matrix of significant drug associations with CRISPR fold changes and binarised event matrix of genomic features, i.e. mutations and copy number gain or loss.

  6. PLK1 inhibitors drug response correlation with PLK1 knockout log2 fold change (FC) gene fitness effects. The dashed grey line indicates the dose response highest drug concentration.

  7. Drug–target associations split by significance (FDR‐adjusted likelihood‐ratio test P‐value < 10%) plotted against the standard deviation of the drug IC50 (ln) measurements of the respective pair (significant “Yes” n = 129, significant “No” n = 684). Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  8. Similar to (F), correlation of olaparib drug response and both targets PARP1 and PARP2 gene fitness effects.

  9. Correlation heatmap of the drug–gene effect size across all the genes. Drugs are coloured according to their targets.

Figure 2
Figure 2. Drug response protein–protein networks
  1. A

    Distribution of the FDR‐adjusted P‐values (top) and count (bottom) of the significant (FDR‐adjusted likelihood‐ratio test P‐value < 10%) drug–gene (CRISPR) associations according to their distance between the gene and corresponding drug targets in the protein–protein interaction network. T represents drugs that have a significant association with at least one of their canonical targets and “−” represents no link was found. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  2. B

    Similar to (A), but instead gene expression (GExp) was tested to identify associations with drug response. T represents drugs that have a significant association with at least one of their canonical targets and “−” represents no link was found. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  3. C, D

    (C), Representative example, i.e. cetuximab—EGFR inhibitor, of the associations and (D), networks that can be obtained from the integrative analysis. Edges in the network are weighted with the Pearson correlation coefficient obtained between the fitness profiles of interacting nodes. For representation purposes only edges with the highest correlation coefficient were represented, R 2 > 0.3. Nodes with orange borders represent significant associations with drug response, cetuximab.

  4. E

    Drug–target associations grouped by statistical significance (FDR‐adjusted likelihood‐ratio test P‐value < 10%) and plotted against the standard deviation of the drug–target CRISPR fold changes (significant “Yes” n = 129, significant “No” n = 684). Upper and lower dashed lines represent the standard deviations of essential and non‐essential genes, respectively. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

Figure 3
Figure 3. Representative examples of top most significantly associated drug classes
Each bar plot group represents a unique drug where genes are ranked by statistical significance (likelihood‐ratio test P‐value) of their association. Effect sizes of the associations are reported under the bars along the x axis. Shortest distance (number of interactions) in a protein–protein interaction network between the gene and the drug nominal target(s) is represented on the top of the bars, where T and orange bar represent the target and ““−” represents no link was found.
Figure 4
Figure 4. MCL1 inhibitors associations
  1. Top three CRISPR associations with all MCL1 inhibitors screened. Each bar represents the likelihood‐ratio test P‐value of each drug–gene (CRISPR) association and effect sizes reported under the bars. Shortest distance (number of interactions) in a protein–protein interaction network between the gene and the drug nominal target(s) is represented on the top of the bars, where T and orange bar represent the target and “−” represents no link was found.

  2. Association between the gene fitness profiles of MCL1 and MARCH5.

  3. Stratification of the MCL1 inhibitor sensitivity according to the essentiality profile of MCL1 and MARCH5, where MCL1 + MARCH5 represents a cell line that is independently dependent on both genes. Dashed orange line (left) represents the mean IC50 in acute myeloid leukaemia cell lines. Grey dashed line (right) represents the maximum concentration used in the dosage response curve. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  4. BCL2, BCL2L1 and MCL1 inhibitors and the respective association with their targets, on the x axis with CRISPR gene fitness and on the y axis with gene expression. The statistical significance (FDR‐adjusted likelihood‐ratio test P‐value < 10%) of the association is represented with a backward slash for CRISPR and forward slash for GEXP.

  5. Regularised multilinear regression to predict drug response of all MCL1 inhibitors using gene expression, fitness or both of known regulators of the BCL2 family and MARCH5. Predictive performance is estimated using R 2 metric represented in the x axis.

  6. Effect size of each feature used in each MCL1 inhibitor model.

Figure EV4
Figure EV4. MCL1 inhibitors
  1. Correlation of all MCL1 inhibitor IC50 values against MCL1 and MARCH5 gene fitness profiles. Effect sizes (b) and FDR (P) of the association are reported on the bottom.

  2. Stratification of the MCL1 inhibitors drug response measurements according to the cell line dependency on MARCH5 and/or MCL1. Gene vulnerabilities are independent from each other, meaning knockouts were introduced independently and not at the same time. Responses are then split according to the cancer type of the cell lines. Number of measurements per boxplot are: acute myeloid leukaemia (MCL1 = 3, MCL1 + MARCH5 = 2, None = 1), breast carcinoma (MARCH5 = 2, MCL1 = 9, MCL1 + MARCH5 = 4, None = 14) and Other (MARCH5 = 9, MCL1 = 52, MCL1 + MARCH5 = 21, None = 236). Vulnerable cell lines to MARCH5 and MCL1 knockout were defined as those with a depletion of at least 50% of that visible for essential genes on the particular cell line (scaled log2 fold change < −0.5). Grey dashed line represents the maximum concentration used in the dosage response curve of the respective drug. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  3. Representative example of a MCL1 inhibitor and their relation with MCL1 gene fitness, with cell lines containing copy number amplification of MCL1 highlighted in orange. Copy number amplified cells were defined taking into consideration their ploidy status, cells with (ploidy ≤ 2.7 and copy number ≥ 5) or (ploidy > 2.7 and copy number ≥ 9) were considered as having MCL1 amplified. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

Figure 5
Figure 5. Robust pharmacological associations
  1. Diagram representing how genomic and gene expression data sets are integrated to identify significant associations with drug–gene pairs that were previously found to be significantly correlated.

  2. Number of drug–gene pairs with at least one significant association with drug response, gene fitness or both, considering either genomic or gene expression profiles.

  3. Canonical examples of robust pharmacological associations. Box‐and‐whisker plots show 1.5× interquartile ranges and 5–95th percentiles, centres indicate medians.

  4. Representative example of a BIRC2/BIRC3 inhibitor, IAP_5620, showing the significant associations (FDR‐adjusted likelihood‐ratio test P‐value < 10%) with CRISPR gene fitness profiles and their location in a representation of the TNF pathway. Bar plot is defined similarly to Fig 3.

Figure EV5
Figure EV5. Robust pharmacological associations
  1. A

    Most frequent genomic alterations across the cancer cell lines.

  2. B, C

    Most significant associations between (B), genomic features and (C), gene expression profiles with drug response and gene fitness.

  3. D

    Number of significant drug–gene pairs across the different types of interactions. Drug–gene pairs were categorised considering the shortest path length between the drug target(s) and the associated gene. T represents the drug target and “−” represents no link was found.

  4. E

    Robust pharmacological association between the expression of BCL2L1 and the significantly correlated pair of MCL1_1284 drug and MCL1 gene fitness profile.

  5. F

    Similarly to (E), but instead it represents a robust pharmacological association between BAX and MDM2 and Nutlin‐3a.

Comment in

Similar articles

Cited by

References

    1. Aguirre AJ, Meyers RM, Weir BA, Vazquez F, Zhang C‐Z, Ben‐David U, Cook A, Ha G, Harrington WF, Doshi MB et al (2016) Genomic copy number dictates a gene‐independent cell response to CRISPR/Cas9 targeting. Cancer Discov 6: 914–929 - PMC - PubMed
    1. Antolin AA, Ameratunga M, Banerji U, Clarke PA, Workman P, Al‐Lazikani B (2020) The kinase polypharmacology landscape of clinical PARP inhibitors. Sci Rep 10: 2585 - PMC - PubMed
    1. Bantscheff M, Eberhard D, Abraham Y, Bastuck S, Boesche M, Hobson S, Mathieson T, Perrin J, Raida M, Rau C et al (2007) Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat Biotechnol 25: 1035–1044 - PubMed
    1. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D et al (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483: 603–607 - PMC - PubMed
    1. Behan FM, Iorio F, Picco G, Gonçalves E, Beaver CM, Migliardi G, Santos R, Rao Y, Sassi F, Pinnelli M et al (2019) Prioritization of cancer therapeutic targets using CRISPR‐Cas9 screens. Nature 568: 511–516 - PubMed

Publication types

MeSH terms