BMC Bioinformatics. 2017 Mar 1;18(1):140.
doi: 10.1186/s12859-017-1546-7.

Positive-Unlabeled Learning for inferring drug interactions based on heterogeneous attributes

Pathima Nusrath Hameed et al. BMC Bioinformatics. 2017.

Abstract

Background: Investigating and understanding drug-drug interactions (DDIs) is important for improving the effectiveness of clinical care. DDIs can occur when two or more drugs are administered together. Experimental DDI detection methods are costly and time-consuming. Hence, there is great interest in developing efficient and useful computational methods for inferring potential DDIs. Standard binary classifiers require both positives and negatives for training. In a DDI context, drug pairs that are known to interact can serve as positives for predictive methods. However, negatives, that is, drug pairs confirmed to have no interaction, are scarce. To address this lack of negatives, we introduce a Positive-Unlabeled Learning method for inferring potential DDIs.
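The data setting described above can be sketched in a few lines of Python. This is only an illustration of the positive-unlabeled setup, not the authors' code: the function name `build_pu_dataset` and the drug identifiers are hypothetical. Every unordered drug pair that is not a known interaction is treated as unlabeled rather than as a negative.

```python
from itertools import combinations


def build_pu_dataset(drugs, known_interactions):
    """Split all unordered drug pairs into known positives and unlabeled pairs."""
    # frozenset makes the lookup independent of the order within a pair.
    known = {frozenset(pair) for pair in known_interactions}
    positives, unlabeled = [], []
    for a, b in combinations(sorted(drugs), 2):
        if frozenset((a, b)) in known:
            positives.append((a, b))
        else:
            # Unlabeled does NOT mean "no interaction"; it means "unknown".
            unlabeled.append((a, b))
    return positives, unlabeled


# Hypothetical identifiers, made up for illustration only.
drugs = ["drug_a", "drug_b", "drug_c", "drug_d"]
known = [("drug_b", "drug_a"), ("drug_c", "drug_d")]
positives, unlabeled = build_pu_dataset(drugs, known)
```

With four drugs there are six unordered pairs, so two known interactions leave four unlabeled pairs. At realistic scale the unlabeled set is far larger than the positive set, which is why choosing reliable negatives from it matters.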

Results: The proposed method consists of three steps: i) applying Growing Self Organizing Maps to infer negatives from the unlabeled dataset; ii) using a pairwise similarity function to quantify the overlap between individual features of drugs; and iii) using a support vector machine classifier to infer DDIs. We obtained 6036 DDIs from the DrugBank database. Using the proposed approach, we inferred 589 drug pairs that are likely not to interact with each other; these drug pairs are used as representative data for the negative class in binary classification for DDI prediction. Moreover, because Cytochrome P450 (CYP) enzymes are particularly important in clinically significant interaction effects, we classify the predicted DDIs as CYP-Dependent or CYP-Independent interactions based on their locations on the Growing Self Organizing Map. Further, we provide a case study on three predicted CYP-Dependent DDIs to evaluate the clinical relevance of this study.
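As a rough illustration of step ii, the sketch below computes the Jaccard Index, which a figure caption on this page names as the frequently used way to measure feature overlap between two drugs. The proposed Individual Similarity function is not defined in this abstract, so it is not reproduced here. The function name and the feature sets are assumptions made for the example.

```python
def jaccard_index(features_a, features_b):
    """Jaccard Index: size of the intersection divided by size of the union."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty feature sets score 0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical feature sets for two drugs (illustrative values only).
drug_x_features = {"CYP3A4", "CYP2D6", "P-gp"}
drug_y_features = {"CYP3A4", "CYP2D6", "CYP2C9"}
similarity = jaccard_index(drug_x_features, drug_y_features)
```

Here the two drugs share two of four distinct features, so the score is 0.5. One such score per feature type would give a fixed-length vector for each drug pair, which is the kind of input a support vector machine expects.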

Conclusion: Depending on the choice of similarity function, our proposed approach showed an absolute improvement in F1-score of 14% and 38% over the method that randomly selects unlabeled data points as likely negatives. We inferred 5300 possible CYP-Dependent DDIs and 592 CYP-Independent DDIs with the highest posterior probabilities. Our discoveries can be used to improve clinical care as well as the research outcomes of drug development.
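To make the phrase "absolute improvement in F1-score" concrete, the sketch below computes F1 from confusion counts and contrasts an absolute difference (in percentage points) with a relative one. The counts are invented for illustration and are not results from the paper; the function name is likewise an assumption.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Made-up confusion counts, NOT taken from the paper.
f1_random_negatives = f1_score(tp=50, fp=50, fn=50)
f1_inferred_negatives = f1_score(tp=80, fp=20, fn=20)

# Absolute improvement: difference between the two scores, in percentage points.
absolute_gain_points = 100 * (f1_inferred_negatives - f1_random_negatives)
# Relative improvement: the same difference as a fraction of the baseline.
relative_gain_percent = 100 * (f1_inferred_negatives / f1_random_negatives - 1)
```

In this toy case the absolute gain is about 30 percentage points while the relative gain is about 60%, which shows why stating "absolute" matters when reading the figures above.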

Keywords: CYP isoforms; Drug-drug interaction; Growing self organizing map (GSOM); PU learning; Pairwise drug similarity.

Figures

Fig. 1
The main idea behind Positive-Unlabeled Learning: (a) available data; (b) goal
Fig. 2
The proposed methodology and our three main contributions for inferring DDIs, integrating Similarity Feature Representation 1 (SFR1) and Similarity Feature Representation 2 (SFR2)
Fig. 3
Pseudo-code for profiling GSOM nodes as ‘positive’, ‘negative’ or ‘ambiguous’ nodes
Fig. 4
Example of deriving similarity metrics for drug association. The Jaccard Index is the frequently used approach, while the Individual Similarity function is the proposed function
Fig. 5
(a) The average within-cluster distance (AWCD) and (b) the variation in the number of GSOM nodes, both for Similarity Feature Representation 1
Fig. 6
GSOM maps for DDI data: (a) the GSOM map for Similarity Feature Representation 1 (SFR1) with Spread Factor = 0.1, containing 919 nodes; (b) the GSOM map for Similarity Feature Representation 2 (SFR2) with Spread Factor = 10^−15, containing 922 nodes. Nodes shown in blue are the proposed negative nodes, which have only unlabeled instances; nodes shown in grey contain both initial positives and unlabeled instances; and nodes shown in red contain only initial positives
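
The node colouring in the Fig. 6 caption suggests a simple profiling rule, sketched below. This is a guess at the logic, not the pseudo-code of Fig. 3, which is not reproduced on this page. It assumes that grey nodes correspond to "ambiguous" nodes, and it treats each map node as a plain list of labelled drug pairs. All names and data are hypothetical.

```python
def profile_node(labels):
    """Profile one map node from the labels of the drug pairs mapped to it.

    labels: iterable of "P" (initial positive) or "U" (unlabeled).
    """
    labels = list(labels)
    has_positive = "P" in labels
    has_unlabeled = "U" in labels
    if has_positive and has_unlabeled:
        return "ambiguous"  # grey in the caption (assumed mapping)
    if has_positive:
        return "positive"   # red: only initial positives
    if has_unlabeled:
        return "negative"   # blue: only unlabeled instances
    return "empty"


def infer_negatives(node_members):
    """Collect unlabeled pairs that sit in nodes holding no initial positives.

    node_members: dict of node id -> list of (pair, label) tuples.
    """
    negatives = []
    for members in node_members.values():
        if profile_node(label for _, label in members) == "negative":
            negatives.extend(pair for pair, _ in members)
    return negatives


# Hypothetical toy map with three nodes.
toy_map = {
    "n1": [(("drug_a", "drug_b"), "P"), (("drug_a", "drug_c"), "P")],
    "n2": [(("drug_a", "drug_d"), "P"), (("drug_b", "drug_c"), "U")],
    "n3": [(("drug_b", "drug_d"), "U"), (("drug_c", "drug_d"), "U")],
}
profiles = {node: profile_node(l for _, l in m) for node, m in toy_map.items()}
inferred = infer_negatives(toy_map)
```

Only the pairs in `n3` are returned as likely negatives. The unlabeled pair in `n2` shares a node with a known interaction, so it is left out under this rule.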
