Cell Rep Methods. 2022 Feb 17;2(2):100171.
doi: 10.1016/j.crmeth.2022.100171. eCollection 2022 Feb 28.

Extracting functional insights from loss-of-function screens using deep link prediction

Pieter-Paul Strybol et al. Cell Rep Methods.

Abstract

We present deep link prediction (DLP), a method for the interpretation of loss-of-function screens. Our approach uses representation-based link prediction to reprioritize phenotypic readouts by integrating screening experiments with gene-gene interaction networks. We validate on 2 different loss-of-function technologies, RNAi and CRISPR, using datasets obtained from DepMap. Extensive benchmarking shows that DLP-DeepWalk outperforms other methods in recovering cell-specific dependencies, achieving an average precision well above 90% across 7 different cancer types and on both RNAi and CRISPR data. We show that the genes ranked highest by DLP-DeepWalk are appreciably more enriched in drug targets compared to the ranking based on original screening scores. Interestingly, this enrichment is more pronounced on RNAi data compared to CRISPR data, consistent with the greater inherent noise of RNAi screens. Finally, we demonstrate how DLP-DeepWalk can infer the molecular mechanism through which putative targets trigger cell line mortality.
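The abstract reports performance as average precision (AP) over a ranked list of predictions. As a rough illustration of that metric only, the sketch below computes AP for a small ranking in plain Python. The function name `average_precision` and the toy scores and labels are assumptions made for this example and do not come from the paper or its code.

```python
def average_precision(scores, labels):
    """Average precision of a ranking (illustrative sketch).

    scores: predicted scores, higher means more likely a true dependency.
    labels: 1 for a true cell line-gene dependency, 0 otherwise.
    """
    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at the rank of each true positive.
            precision_sum += hits / rank
    # With no positives the metric is undefined; return 0.0 by convention here.
    return precision_sum / hits if hits else 0.0


# Hypothetical toy data: four candidate dependencies, two of them true.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
print(round(average_precision(toy_scores, toy_labels), 3))  # 0.833
```

In this toy ranking the true dependencies sit at ranks 1 and 3, so AP is the mean of 1/1 and 2/3.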

Keywords: CRISPR screening; PPI networks; bioinformatics; cancer cell lines; deep learning; drug targets; functional screening; link prediction; machine learning; systems biology.

Conflict of interest statement

The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Performance benchmark of several state-of-the-art link prediction (LP) methods on retrieving cancer dependencies from different cancer types. (A and B) Average precision (AP) of LP methods in predicting cell line-gene interactions, based on (A) RNAi- or (B) CRISPR-derived screening scores. The cancer types are listed in ascending order of the number of available cell lines per cancer type. The final column shows the AP obtained when training on all cancer types combined.
Figure 2
Discrepancy between RNAi dependency and drug sensitivity scores. (A and B) Distribution of drug sensitivity scores for each RNAi dependency type, specific to (A) lung and (B) bladder cancer. The x axis shows, for all known drug targets, the cell line-gene interactions binned into 3 categories according to the RNAi dependency score: extremely weak, intermediary, and extremely strong (see STAR Methods). The y axis shows drug sensitivities in the same cell lines in which the dependencies occur. Lower drug sensitivity scores correspond to a stronger effect. For each category, the number of targets is indicated.
Figure 3
Relation between drug target retrieval and dependency prediction performance for all LP methods. (A and B) AP of each method for cell line-gene dependency predictions and drug target retrieval using (A) RNAi or (B) CRISPR screening data. x axis: AP for correctly predicting a gene dependency in a cell line. y axis: AP for correctly labeling a gene as a drug target. The horizontal dashed line represents the performance of the ranking based on original RNAi (black) or CRISPR (blue) screening scores in correctly retrieving a drug target. Each method is run 3 times using a different train and test set, and each repeat is shown as a separate dot.
Figure 4
Performance of each method in recovering benchmark drug targets in the top 100 prioritized genes per cell line as compared to random. (A and B) Number of cell lines in which each method retrieves the targets (1) significantly better than random; (2) better than random, but not significantly; and (3) worse than random. The expected results were obtained by randomly sampling genes from the input graph, either (A) uniformly, so that each gene has an equal chance of being selected, or (B) with a selection probability equal to the gene's relative degree in the gene-gene interaction scaffold.
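The two random baselines in the Figure 4 caption, uniform sampling and sampling in proportion to a gene's degree, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy degree table, and the choice to sample without replacement are all hypothetical, and the paper's actual procedure is described in its STAR Methods.

```python
import random


def sample_uniform(genes, k, rng):
    """Draw k distinct genes, each with an equal chance of selection."""
    return rng.sample(list(genes), k)


def sample_by_degree(degree, k, rng):
    """Draw k distinct genes with probability proportional to their degree.

    degree: dict mapping gene -> number of neighbors (must be positive).
    Assumption: sampling is without replacement, so weights are
    renormalized over the remaining genes after each draw.
    Requires k <= len(degree).
    """
    remaining = dict(degree)
    chosen = []
    for _ in range(k):
        genes = list(remaining)
        weights = [remaining[g] for g in genes]
        pick = rng.choices(genes, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]
    return chosen


# Hypothetical toy scaffold: gene -> degree in the interaction network.
toy_degree = {"GENE_A": 5, "GENE_B": 1, "GENE_C": 1, "GENE_D": 3}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
print(sample_uniform(toy_degree, 2, rng))
print(sample_by_degree(toy_degree, 2, rng))
```

Repeating either draw many times and counting how many sampled genes are benchmark targets gives an empirical expectation against which a method's top 100 list could be compared. The degree-weighted version guards against a method looking good merely because it favors highly connected genes.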
Figure 5
Distribution of the percentage of retrieved sensitive drug targets in each of the 88 lung cancer cell lines. Methods that retrieve significantly (Wilcoxon signed-rank test, FDR-corrected p value < 0.05) more benchmark drug targets (DLP-DeepWalk and GraRep) as compared to the original RNAi screening score are highlighted. The whiskers capture all data within 1.5 times the interquartile range.
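The Figure 5 caption mentions FDR-corrected p values but does not name the correction procedure. As an illustration only, the sketch below implements the Benjamini-Hochberg adjustment, which is one common FDR procedure; that choice is an assumption, not something the caption confirms. The function name and the toy p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values (illustrative sketch).

    Assumption: the source only says "FDR corrected"; BH is used here
    as a common example of such a correction.
    """
    m = len(pvalues)
    # Indices of the p values in ascending order.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p values, one per method compared to the baseline.
toy_pvalues = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(toy_pvalues)
significant = [p < 0.05 for p in adjusted]
print([round(p, 4) for p in adjusted])
print(significant)
```

In this toy example only the smallest raw p value stays below 0.05 after adjustment, which shows why a corrected threshold can highlight fewer methods than the raw test results would.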
Figure 6
Genes that have more neighboring genes in the heterogeneous graph that are RNAi dependencies (x axis) are more likely to be found by DLP-DeepWalk than by the original RNAi data. The y axis represents the difference between DLP-DeepWalk and DepMap in the number of cell lines in which a gene is correctly recovered as a target. The orange squares denote drug targets that are recovered in more cell lines by ranking on the original RNAi screening score, while the blue dots denote drug targets that are recovered in more cell lines by ranking on the probabilities provided by DLP-DeepWalk.
Figure 7
Subnetwork around known drug target KIF11 proxying the molecular mechanism through which it affects cell lines. Genes connected by green edges are all first-order neighbors of KIF11 in the original STRING interaction network.
