[Preprint]. 2024 Mar 8:2024.03.06.24303800.
doi: 10.1101/2024.03.06.24303800.

Predicting drug outcome of population via clinical knowledge graph

Maria Brbić et al. medRxiv. 2024.

Abstract

Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and the characteristics of the patients to whom the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models. Our framework is applicable to any sub-population, any drug as well as drug combinations, any disease, and a wide range of pharmacological tasks. We apply the PlaNet framework to reason about outcomes of clinical trials: PlaNet predicts drug efficacy and adverse events, even for experimental drugs and their combinations that have never been seen by the model. Furthermore, PlaNet can estimate the effect of changing the population on the trial outcome, with direct implications for patient stratification in clinical trials. PlaNet takes fundamental steps towards AI-guided clinical trial design, offering valuable guidance for realizing the vision of precision medicine using AI.

Figures

Figure 1: Overview of the PlaNet framework.
PlaNet is built as a massive clinical knowledge graph (KG) that captures treatment information as well as underlying biology and chemistry. (a) The core of the PlaNet framework is a clinical KG that represents knowledge in the form of (drug, disease, population) triplets. These entities are then linked to external knowledge bases: diseases to the Medical Subject Headings (MeSH) vocabulary [59], treatments to the DrugBank database [19], and population properties to Unified Medical Language System (UMLS) terms [60]. (b) We integrate 11 biological and chemical databases to capture knowledge of disease biology and drug chemistry, such as databases of drug structural similarities, drug targets, disease-perturbed proteins, protein interactions and protein functional relations (Methods). These databases are integrated with the UMLS graph that captures population relations. (c) Instantiation of the PlaNet framework on the clinical trials data. We parse and standardize the clinical trials database and extract information about diseases, drug treatments, eligibility criteria terms and primary outcomes. (d) The final KG is obtained by integrating the clinical KG (c) with the biological and chemical networks (b).
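The structure described in this caption can be illustrated with a minimal sketch. It is not PlaNet's implementation: the class, the relation names ("arm-drug", "drug-protein", and so on) and every identifier such as "drug:A" are hypothetical placeholders chosen for illustration, and a real graph would use MeSH, DrugBank and UMLS identifiers. The sketch only shows how a trial arm's (drug, disease, population) triplet can share a graph with biological edges.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    head: str
    relation: str
    tail: str


class KnowledgeGraph:
    """Toy heterogeneous graph; all relation names are illustrative assumptions."""

    def __init__(self):
        self.edges = set()
        self.neighbors = defaultdict(set)

    def add_edge(self, head, relation, tail):
        self.edges.add(Edge(head, relation, tail))
        # Store both directions so lookups work from either endpoint.
        self.neighbors[head].add((relation, tail))
        self.neighbors[tail].add((relation, head))

    def add_trial_arm(self, arm_id, drugs, diseases, population_terms):
        # A trial arm ties together the (drug, disease, population) triplet.
        for drug in drugs:
            self.add_edge(arm_id, "arm-drug", drug)
        for disease in diseases:
            self.add_edge(arm_id, "arm-disease", disease)
        for term in population_terms:
            self.add_edge(arm_id, "arm-population", term)

    def shared_neighbors(self, a, b):
        # Nodes adjacent to both a and b, regardless of relation type.
        return {n for _, n in self.neighbors[a]} & {n for _, n in self.neighbors[b]}


kg = KnowledgeGraph()
# Clinical layer: one hypothetical trial arm.
kg.add_trial_arm(
    "arm:example-1",
    drugs=["drug:A"],
    diseases=["disease:X"],
    population_terms=["population:adult"],
)
# Biological layer: the drug targets a protein that the disease perturbs.
kg.add_edge("drug:A", "drug-protein", "protein:P1")
kg.add_edge("disease:X", "disease-protein", "protein:P1")

shared = kg.shared_neighbors("drug:A", "disease:X")
```

In this toy graph the drug and the disease are connected both through the trial arm and through the shared protein, which is the kind of path a graph-based model can exploit once the clinical and biological layers are merged.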
Figure 2: PlaNet reasons about efficacy of drugs in clinical trials even for experimental drugs that have never been tested before.
(a) UMAP space of all trial arm embeddings in the clinical trials database obtained by pretraining PlaNet on the self-supervised task (Methods). Arms are colored according to disease information. Only major disease groups according to the MeSH hierarchy [59] are shown. Grey color denotes minor disease groups. The arm embeddings learned by PlaNet cluster according to disease groups. (b) Given embeddings of two trial arms to which different drug treatments were applied, PlaNet predicts which of the treatments is more effective. Methodologically, the geometric deep learning model is fine-tuned on the efficacy prediction task by using information about drug efficacy from completed clinical trials. (c) Performance comparison of PlaNet with a disease-drug-outcome (DDO) classifier and the transformer-based language model BERT [24, 25]. PlaNetLM is obtained by augmenting PlaNet with the text embedding of the trial arm protocol [29] (Methods). Performance is measured as the mean area under the receiver operating characteristic curve (AUROC) score across 10 runs of each model on different test data samples. Error bars are 95% bootstrap confidence intervals. (d) Effect of the training set size on the performance. PlaNet's performance improves substantially with more training data, which strongly suggests that further improvements can be expected from a larger training set. Performance is measured as the mean AUROC score across 10 runs on different test data samples. Error bars are 95% bootstrap confidence intervals. (e) PlaNet predicts efficacy of novel, experimental drugs that have never been seen in a clinical trial before. Bars represent the mean AUROC score for drugs that have been seen in the labeled training data (left; blue color), and never-before-seen drugs (right; grey color). Mean performance is computed across 10 runs on different test data samples and error bars are 95% bootstrap confidence intervals. (f, g) Examples of correct predictions. PlaNet outputs probabilities that a particular treatment will lead to higher overall survival of the population. (f) PlaNet correctly predicted higher overall survival of melanoma patients in the paclitaxel arm compared to the tasisulam-sodium arm. The model had never before seen any effect (labeled example) of the tasisulam-sodium drug. (g) PlaNet correctly predicted higher progression-free survival of melanoma patients given the combination of dabrafenib and trametinib compared to trametinib alone. The model had never before seen any effect of the dabrafenib or trametinib drugs.
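The evaluation metric used in panels (c) to (e) can be sketched as follows. This is a generic illustration, not the authors' evaluation code: AUROC is computed from its rank interpretation, and the confidence interval is a percentile bootstrap over per-run scores. The caption does not say how the bootstrap was set up, so resampling the per-run AUROC values is an assumption, as are the function names and the toy labels and scores.

```python
import random
from statistics import mean


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy example: label 1 means the first arm of a pair was the more effective one.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.5, 0.2]
toy_auroc = auroc(toy_labels, toy_scores)

# Made-up per-run AUROC values standing in for 10 evaluation runs.
toy_runs = [0.71, 0.74, 0.69, 0.73, 0.72, 0.70, 0.75, 0.72, 0.71, 0.73]
ci_low, ci_high = bootstrap_ci(toy_runs)
```

In the toy example, 8 of the 9 positive-negative pairs are ordered correctly, so the AUROC is 8/9. For larger datasets a library routine such as scikit-learn's `roc_auc_score` would be the usual choice; the pairwise loop here is quadratic and meant only to make the definition explicit.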
Figure 3: PlaNet reasons about safety of clinical trials.
(a) Given a trial arm embedding, PlaNet predicts (b) whether a serious adverse event will occur and (c) what adverse event will happen. Methodologically, the geometric deep learning model is fine-tuned on the safety task by using information about drug safety from completed clinical trials. (b) Performance of PlaNet on predicting occurrence of serious adverse events. PlaNet achieves an AUROC score of 0.79 on predicting whether a serious adverse event will occur. The green curve shows performance on all trials, while the orange curve shows performance on trials that do not investigate cancer diseases. (c) Performance of PlaNet on predicting the exact category of adverse events, measured as AUROC score. We consider 554 adverse events defined as preferred terms (PT) in the MedDRA hierarchy [39] and group them according to organ-level categories. We consider organ-level categories with at least 20 PT terms. The boxes show the quartiles of the performance distribution across different adverse events. Whiskers show the rest of the distribution. (d) Performance of PlaNet on predicting adverse events of future clinical trials. PlaNet achieves similar performance on predicting outcomes of future clinical trials when compared to trials that are randomly split into train and test datasets independent of the year in which they were conducted. The performance is measured using AUROC and boxes show quartiles of the AUROC distribution across different adverse events. Whiskers show the rest of the distribution. (e, f) Examples of individual predictions of adverse events. The model assigns a probability that an adverse event will be enriched in a given arm compared to a no-treatment arm (Methods). (e) In an everolimus safety trial for tuberous sclerosis complex with refractory partial-onset seizures, PlaNet correctly predicted pneumonia as an adverse event with high confidence. Although pneumonia is a very rare adverse event of everolimus [40], in this trial pneumonia was reported as a very common adverse event with one patient dying from pneumonia, which was suspected to be treatment-related [41]. (f) In a lenvatinib safety trial for thyroid cancer patients, PlaNet correctly predicted uncontrolled hypertension as an adverse event. Uncontrolled hypertension was reported as the most frequent adverse event in that trial [42].
Figure 4: PlaNet identifies characteristics of populations that are at risk of developing adverse events.
(a) We match clinical trials that study the same drug and the same disease and have the same primary outcome (PO), but differ in the characteristics of the eligible population and result in different adverse events, i.e., an adverse event was observed in one trial but not in the other. For pairs of such clinical trials, we assess whether the model correctly adjusted its prediction of an adverse event and predicted a higher probability of the adverse event in one trial compared to the other. (b) Percentage of matched trials on which PlaNet correctly adjusted the probability of an adverse event (orange color; left) and percentage on which the adjustment was wrong (green color; right). PlaNet makes 10 times more correct adjustments than wrong ones. We count pairs only if the difference between the probabilities of adverse event occurrence in the two matched trials is at least 0.2. (c) The effect of the probability difference threshold on the ratio of correct to wrong probability adjustments. Even with a smaller difference in probabilities (at least 0.05), the number of correct adjustments is more than 4 times higher than the number of wrong adjustments. With a difference of at least 0.4, the number of correct adjustments is 90 times higher than the number of wrong adjustments. For each probability threshold p, we count matched trials as correct or wrong only if the difference between probabilities is at least p. (d) PlaNet identifies population characteristics whose exclusion can reduce the probability of adverse events. Given a population property, we estimate the prior probability of an adverse event when the population with that property is included in the trial. We then change the trial by excluding the population with that property, and observe the change in adverse event probability Δ. By ranking terms according to this change, we can identify population properties whose exclusion can increase the safety of clinical trials. (e) Use case of (d) for a trial that tests the exemestane drug for breast neoplasms and in which breathing difficulty was observed as an adverse event. PlaNet finds the population properties that have the highest effect on causing breathing difficulty. By excluding that population from the trial, PlaNet suggests that the probability of breathing difficulty can be significantly reduced. We rank terms that belong to the drug, disease and procedure categories.
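The counting rule behind panels (b) and (c) can be written out as a short sketch. It follows the caption's description only: a matched pair is counted when the two predicted probabilities differ by at least a threshold p, and it is correct when the higher probability belongs to the trial in which the adverse event was actually observed. The type and function names are hypothetical, and the probabilities below are toy values, not results from the paper.

```python
from typing import NamedTuple


class MatchedPair(NamedTuple):
    # Predicted probability for the trial in which the adverse event was observed.
    prob_with_event: float
    # Predicted probability for the matched trial in which it was not observed.
    prob_without_event: float


def count_adjustments(pairs, threshold):
    """Return (correct, wrong) counts among pairs whose probability gap is >= threshold."""
    correct = wrong = 0
    for pair in pairs:
        diff = pair.prob_with_event - pair.prob_without_event
        if abs(diff) < threshold:
            continue  # gap too small: the pair is not counted at this threshold
        if diff > 0:
            correct += 1
        elif diff < 0:
            wrong += 1
    return correct, wrong


# Toy predictions for five hypothetical matched trial pairs.
toy_pairs = [
    MatchedPair(0.8, 0.3),
    MatchedPair(0.6, 0.5),
    MatchedPair(0.2, 0.5),
    MatchedPair(0.9, 0.4),
    MatchedPair(0.4, 0.5),
]

strict = count_adjustments(toy_pairs, threshold=0.2)
loose = count_adjustments(toy_pairs, threshold=0.05)
```

Raising the threshold keeps only the pairs where the model makes a confident distinction, which is why the ratio of correct to wrong adjustments in panel (c) can grow as p increases while the number of counted pairs shrinks.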

References

    1. Ramamoorthy A., Pacanowski M., Bull J. & Zhang L. Racial/ethnic differences in drug disposition and response: review of recently approved drugs. Clinical Pharmacology & Therapeutics 97, 263–273 (2015).
    2. Schork N. J. Personalized medicine: time for one-person trials. Nature 520, 609–611 (2015).
    3. Charles H., Good C. B., Hanusa B. H., Chang C.-C. H. & Whittle J. Racial differences in adherence to cardiac medications. Journal of the National Medical Association 95, 17 (2003).
    4. Siegel K., Karus D. & Schrimshaw E. Racial differences in attitudes toward protease inhibitors among older HIV-infected men. AIDS Care 12, 423–434 (2000).
    5. Liu K. A. & Dipietro Mager N. A. Women’s involvement in clinical trials: historical perspective and future implications. Pharmacy Practice (Granada) 14, 0–0 (2016).