J Chem Inf Model. 2017 Feb 27;57(2):115-121.
doi: 10.1021/acs.jcim.6b00686. Epub 2017 Feb 14.

3D-e-Chem-VM: Structural Cheminformatics Research Infrastructure in a Freely Available Virtual Machine

Ross McGuire et al. J Chem Inf Model. 2017.

Abstract

3D-e-Chem-VM is an open source, freely available Virtual Machine (http://3d-e-chem.github.io/3D-e-Chem-VM/) that integrates cheminformatics and bioinformatics tools for the analysis of protein-ligand interaction data. 3D-e-Chem-VM consists of software libraries, and database and workflow tools that can analyze and combine small molecule and protein structural information in a graphical programming environment. New chemical and biological data analytics tools and workflows have been developed for the efficient exploitation of structural and pharmacological protein-ligand interaction data from proteome-wide databases (e.g., ChEMBLdb and PDB), as well as customized information systems focused on, e.g., G protein-coupled receptors (GPCRdb) and protein kinases (KLIFS). The integrated structural cheminformatics research infrastructure compiled in the 3D-e-Chem-VM enables the design of new approaches in virtual ligand screening (Chemdb4VS), ligand-based metabolism prediction (SyGMa), and structure-based protein binding site comparison and bioisosteric replacement for ligand design (KRIPOdb).

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
KNIME workflows to exploit cheminformatics and bioinformatics information on GPCRs (GPCRdb nodes) and protein kinases (KLIFS nodes). In the GPCRdb workflow, KNIME nodes are used to enable the extraction and combination of protein information, sequence, alternative numbering schemes, mutagenesis data, and experimental structures for a selected receptor from GPCRdb. The lower branch of the workflow returns all sequence identities and similarities of the TM domain for the selected receptors and can be used for further structural chemogenomics analyses using, e.g., structural and structure-based sequence alignments of the ligand binding site residues of crystallized aminergic receptors (available in the VM as a PyMOL session). In the KLIFS workflow, KNIME nodes enable the integrated analysis of structural kinase–ligand interactions from all structures for a specific kinase in KLIFS (human MAPK in the example). Kinase–ligand complexes with a specific hydrogen bond interaction pattern between the ligand and residues in the hinge region of the kinase (stacked bar chart) are selected for an all-against-all comparison of their structural kinase–ligand interaction fingerprints (heat map). The ligands from the selected structures are compared, and the ligand pair with the lowest chemical similarity and a high interaction fingerprint similarity is retrieved from KLIFS for binding mode comparison. Meta nodes in the workflows in panels A and B are indicated with a star (*). The full workflows are provided in the Supporting Information, Figures S2 and S3.
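The pair selection at the end of the KLIFS workflow can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the KNIME implementation: the structure IDs, the bit sets, and the 0.8 cutoff for a "high" interaction fingerprint similarity are hypothetical, and Tanimoto similarity on sets of on-bits is assumed for both the ligand and the interaction fingerprint comparison.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bits (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pick_pair(ligand_fps, interaction_fps, ifp_cutoff=0.8):
    """Among structure pairs whose interaction fingerprints reach ifp_cutoff,
    return (chemical_similarity, id_1, id_2) for the least similar ligand pair.
    Returns None if no pair passes the cutoff."""
    best = None
    for x, y in combinations(sorted(ligand_fps), 2):
        # Keep only pairs with a similar binding mode (assumed cutoff).
        if tanimoto(interaction_fps[x], interaction_fps[y]) < ifp_cutoff:
            continue
        chem = tanimoto(ligand_fps[x], ligand_fps[y])
        if best is None or chem < best[0]:
            best = (chem, x, y)
    return best


# Hypothetical toy data: structure ID -> set of on-bits.
ligand_fps = {
    "s1": {1, 2, 3, 4},
    "s2": {1, 2, 3, 5},
    "s3": {1, 7, 8, 9},
}
interaction_fps = {
    "s1": {10, 11, 12, 13},
    "s2": {10, 11, 20, 21},
    "s3": {10, 11, 12, 13},
}

result = pick_pair(ligand_fps, interaction_fps)
print(result)
```

With this toy input only the pair (s1, s3) passes the interaction fingerprint cutoff, so it is returned together with its ligand similarity.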
Figure 2
KRIPO binding site similarity based bioisosteric replacement and SyGMa metabolite prediction workflows. Ligands in KRIPOdb that share a chemical (sub)structure with a specified molecule (doxepin in the example) are identified and defined as query fragment(s). Ligand (fragment) binding site hits that share pharmacophore fingerprint similarity with the binding site(s) associated with the query fragment(s) (e.g., the doxepin binding site of the histamine H1 receptor) are identified and ranked according to Tanimoto similarity score. The occurrence of protein targets in the top hit list is analyzed. The pharmacophore overlay underlying the similarity value is shown for an example hit (histamine methyltransferase, PDB ID: 2aot; available in the VM as a PyMOL session). The full workflow is provided in the Supporting Information (Figure S4). In the SyGMa workflow, SMILES strings of clozapine and dasatinib are converted into RDKit molecules for the prediction of metabolites using the SyGMa Metabolites node, filtered based on a SyGMa_score threshold of 0.1. The two tables are subsections of the resulting table, showing the top ranked metabolites of clozapine and dasatinib, consistent with experimental metabolism data. Meta nodes are indicated with a star (*).
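Two steps in this caption, ranking binding site hits by Tanimoto similarity and filtering metabolites on a score threshold, can be sketched in plain Python. This is an illustration only and does not use the KRIPO or SyGMa code: the fingerprints are modeled as sets of on-bits, the hit names and metabolite rows are hypothetical placeholders, and keeping rows with a score greater than or equal to the threshold is an assumption about how the 0.1 cutoff is applied.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two sets of on-bits (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_hits(query_fp, hit_fps):
    """Return (score, name) tuples sorted by descending Tanimoto similarity
    to the query fingerprint; ties are broken by name."""
    scored = [(tanimoto(query_fp, fp), name) for name, fp in hit_fps.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


def filter_metabolites(rows, threshold=0.1):
    """Keep rows whose SyGMa_score is at least the threshold (assumed >=)."""
    return [row for row in rows if row["SyGMa_score"] >= threshold]


# Hypothetical pharmacophore fingerprints: name -> set of on-bits.
query_fp = {1, 2, 3, 4}
hit_fps = {
    "site_A": {1, 2, 3, 4, 5},
    "site_B": {1, 2},
    "site_C": {6, 7},
}
ranked = rank_hits(query_fp, hit_fps)

# Hypothetical metabolite table rows (placeholder names and scores).
rows = [
    {"metabolite": "M1", "SyGMa_score": 0.35},
    {"metabolite": "M2", "SyGMa_score": 0.10},
    {"metabolite": "M3", "SyGMa_score": 0.02},
]
kept = filter_metabolites(rows)

print(ranked)
print([row["metabolite"] for row in kept])
```

Representing fingerprints as sets keeps the sketch dependency-free; in the VM itself these steps are carried out by the corresponding KNIME nodes.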
Figure 3
Schematic diagram of possible interactions of the 3D-e-Chem-VM virtual machine elements: KLIFS and GPCRdb web service connector nodes, KRIPOdb, KRIPO, and SyGMa nodes, and the Chemdb4VS workflow (full workflow presented in the Supporting Information, Figure S6) integrated in a GPCR-kinase cross-reactivity prediction workflow.

