J Comput Chem. 2017 Nov 15;38(30):2641-2663.
doi: 10.1002/jcc.25052. Epub 2017 Sep 22.

Customizable de novo design strategies for DOCK: Application to HIVgp41 and other therapeutic targets

William J Allen et al. J Comput Chem.

Abstract

De novo design can be used to explore vast areas of chemical space in computational lead discovery. As a complement to virtual screening, from-scratch construction of molecules is not limited to compounds in pre-existing vendor catalogs. Here, we present an iterative fragment growth method, integrated into the program DOCK, in which new molecules are built using rules for allowable connections based on known molecules. The method leverages DOCK's advanced scoring and pruning approaches and users can define very specific criteria in terms of properties or features to customize growth toward a particular region of chemical space. The code was validated using three increasingly difficult classes of calculations: (1) Rebuilding known X-ray ligands taken from 663 complexes using only their component parts (focused libraries), (2) construction of new ligands in 57 drug target sites using a library derived from ∼13M drug-like compounds (generic libraries), and (3) application to a challenging protein-protein interface on the viral drug target HIVgp41. The computational testing confirms that the de novo DOCK routines are robust and working as envisioned, and the compelling results highlight the potential utility for designing new molecules against a wide variety of important protein targets. © 2017 Wiley Periodicals, Inc.
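The iterative growth idea in the abstract (attach fragments layer by layer, score, prune, and write out completed molecules) can be pictured with a toy sketch. This is not the DOCK implementation: the fragment library, the attachment-point counts, the `toy_score` function, and the `keep` and `max_fragments` cutoffs are all hypothetical stand-ins, and molecules are tracked only as ordered fragment names rather than 3D structures.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy library: fragment name -> number of attachment points
# (1 = sidechain, 2 = linker, 3 = scaffold). Not taken from the paper.
LIBRARY: Dict[str, int] = {"methyl": 1, "hydroxyl": 1, "amide": 2, "benzene": 3}

# Hypothetical per-fragment score contributions; lower is better,
# mimicking the sign convention of an energy score.
TOY_ENERGY: Dict[str, float] = {
    "methyl": -1.0, "hydroxyl": -2.0, "amide": -3.0, "benzene": -4.0,
}

# A partial molecule: (fragments used so far, open attachment points).
Partial = Tuple[Tuple[str, ...], int]


def toy_score(fragments: Tuple[str, ...]) -> float:
    """Sum the made-up fragment contributions."""
    return sum(TOY_ENERGY[f] for f in fragments)


def grow(
    anchors: List[str],
    score: Callable[[Tuple[str, ...]], float] = toy_score,
    max_layers: int = 5,
    keep: int = 50,
    max_fragments: int = 4,
) -> List[Tuple[str, ...]]:
    """Grow anchors layer by layer; return completed molecules, best first."""
    complete: List[Tuple[str, ...]] = []
    frontier: List[Partial] = [((a,), LIBRARY[a]) for a in anchors]
    for _ in range(max_layers):
        candidates: List[Partial] = []
        for frags, open_sites in frontier:
            for name, sites in LIBRARY.items():
                new_frags = frags + (name,)
                # Property filter: a size cap standing in for e.g. molecular weight.
                if len(new_frags) > max_fragments:
                    continue
                # A new bond consumes one open site on each partner.
                candidates.append((new_frags, open_sites + sites - 2))
        # Molecules with no open sites left have finished growing.
        complete.extend(f for f, open_sites in candidates if open_sites == 0)
        # Score-based pruning: keep only the best-scoring partial molecules.
        growing = [c for c in candidates if c[1] > 0]
        growing.sort(key=lambda c: score(c[0]))
        frontier = growing[:keep]
        if not frontier:
            break
    return sorted(complete, key=score)
```

Calling `grow(["benzene"])` seeds growth from a single scaffold anchor; lowering `keep` makes the pruning more aggressive and can discard partial molecules before they finish growing.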

Keywords: DOCK; ZINC; chemical space; de novo design; drug discovery; footprint similarity; fragment libraries; scoring functions; structure-based design.

Figures

Figure 1
Illustration of docking/VS (top) versus de novo design (bottom) approaches. In VS, ligands originate from existing chemical libraries or catalogs. In de novo design, fragments are building blocks used to construct new ligands. Receptors are typically treated the same in both methods.
Figure 2
Small molecule 2-deoxy-2,3-didehydro-N-acetylneuraminic acid (DANA, left) is deconstructed into fragments (right) around rotatable bonds. Redundant fragments are removed. Attachment points are marked as dummy atoms (Du).
Figure 3
Flowchart of the iterative growth process for de novo DOCK. Input includes grids and parameters according to a standard DOCK run and scoring functions, and fragment libraries. Anchors are oriented to the binding site. Retained anchors are grown outward in a layer-by-layer approach analogous to anchor-and-grow. Complete molecules are written to file.
Figure 4
Scoring drives targeted growth (vertical blue arrow) and pruning/filtering (horizontal red arrow) in de novo design. Examples of targeted growth scoring functions include the DOCK grid energy score, footprint similarity score, and pharmacophore score. Examples of pruning/filtering criteria include Hungarian RMSD, molecular weight, number of rotatable bonds, scaffolds per layer, etc.
Figure 5
Scores for all molecules constructed in all 663 receptors using SGE (left), FPS (center), and MGE+FPS (right) directed ensembles (y-axis) and as a function of layer in which each molecule completed growth (x-axis).
Figure 6
Comparison between de novo built molecules (gray) and crystallographic references (orange) for results generated using the MGE+FPS scoring function for focused libraries. Molecules are organized by increasing number of distinct fragments. Shown are the PDB code, number of distinct fragments, number of rotatable bonds in parentheses, Tanimoto coefficient between the constructed molecule and reference, and Hungarian RMSD between common heavy atoms. Underlined RMSD values represent systems with Tanimoto not equal to 1.
Figure 7
The 14 most frequently observed (a) sidechains, (b) linkers, and (c) scaffolds from the generic fragment library. The number below each fragment in parentheses indicates the number of occurrences in ~13M ZINC drug-like molecules. Purple atoms with labels indicate attachment points and their accompanying Sybyl mol2 bond types. Other atoms are colored as: white=hydrogen, tan=carbon, red=oxygen, blue=nitrogen, yellow=sulfur, green=fluorine, bright green=chlorine. Double and aromatic bonds are not shown. The entries marked with * denote the 15 most common fragments and were used as anchors to seed de novo growth.
Figure 8
Scatter plot of MGE+FPS score vs re-docking RMSD for molecules constructed from generic fragment libraries. (Blue dots N=486,723) All molecules – 57 receptors, average ensemble size = 8,589. (Green dots N=2850) Top-scoring molecules – 57 receptors, top 50 molecules from each ensemble.
Figure 9
Molecular weight, number of rotatable bonds, and formal charge histograms for de novo designed molecules using generic libraries. Complete ensembles are shown in blue and the top 50 best-scoring molecules in green.
Figure 10
Comparison of (a) molecular weight, (b) LogP, (c) Rsynth synthetic feasibility, (d) Lipinski violations, (e) number of H-bond acceptors, and (f) number of H-bond donors between de novo unique compounds (N=489,573) constructed from generic libraries (blue) and 500K purchasable drug-like molecules from the ZINC database (red).
Figure 11
Comparison of molecular makeup between de novo built (~500K) and purchasable (~13M) molecules from ZINC. (a) Correlation plot of the three fragment types per molecule broken up by number of rotatable bonds. (b) Relative frequency of top-50 occurring fragments in ZINC (red) and de novo (blue) molecules. (c) Scatter plot of relative frequencies for the top-50 occurring fragments (black) and all other fragments (gray).
Figure 12
(a) Co-crystallized cyclic urea inhibitor shown making key interactions (purple) with HIV Protease (PDB: 1DMP). (b,c) Top scoring de novo designed congeneric series of molecules.
Figure 13
Example final outcomes for de novo construction of new molecules in important drug targets. Targets shown include (a) neuraminidase, (b) HIV protease, (c) HIV reverse transcriptase, (d) IGF-1R, (e) COX-1, and (f) acetylcholinesterase. PDB IDs are shown in parentheses next to the target name. Cognate crystal ligands are shown as blue sticks, candidate designed molecules are shown as orange sticks, and protein residues are shown as gray sticks. Key interactions, including hydrogen bonds, are shown as purple springs. The top three panels (a–c) show multiple similar outcomes (5 overlaid orange molecules each). The bottom three panels (d–f) show a single outcome overlaid with the crystal structure.
Figure 14
Examples of molecules constructed in the HIV gp41 hydrophobic pocket through de novo design using the MGE+FPS scoring function and rescored using five different functions: MGE+FPS, SGE, FPSSUM, FPSVDW, and FPSES. Top panels show the top 200 scoring molecules for each ranking metric. Middle panels show the best scoring molecule for each ranking metric. Bottom panel radar plots show reference footprints (bold lines) for comparison with candidate footprints (dashed lines) from the best scoring molecules. The hydrophobic pocket of gp41 is shown as a gray surface with the key lysine residue highlighted in purple. Footprint similarity scores (Euclidean distance) for van der Waals (VDW) overlap and electrostatic (ES) overlap are shown in black and red font, respectively.
Figure 15
Fraction of designed molecules targeting the gp41 hydrophobic pocket (y-axis) for which an analog was found in ZINC at or above a certain Tanimoto cutoff (x-axis).
Figure 16
Comparison of 15 de novo compounds (a–o) designed in the hydrophobic pocket of gp41 with the most similar compound available for purchase in ZINC. The designed compounds (top) have prefix 1AIK and an associated energy score (MGE+FPS function, kcal/mol) while the purchasable compounds (bottom) have prefix ZINC and an accompanying Tanimoto coefficient.
Figure 17
(a) Comparison of experimentally verified HIV gp41 inhibitor NB-2 (gray box) to 12 members of a congeneric series of de novo designed compounds. The de novo molecule identifier, ZINC ID of the purchasable NB-2 analog, energy score, and pose classifier are listed for each compound. (b) This congeneric series adopted 3 related poses (labeled 1–3). HIV gp41 receptor shown as gray surface, key chelating Lys 29 residue shown as purple surface and sticks, de novo designed molecules shown as green sticks.
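
Two quantities recur in the captions above: footprint similarity scores reported as a Euclidean distance, and Tanimoto coefficients used to compare designed molecules with references or ZINC compounds. The Python below is a minimal sketch of both, assuming a footprint is simply a list of per-residue interaction energies and a fingerprint is a set of "on" bit indices. The function names, the example values, and the convention for two empty fingerprints are illustrative assumptions, not DOCK's actual code or output.

```python
import math
from typing import Sequence, Set


def footprint_distance(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Euclidean distance between two per-residue energy footprints.

    Smaller values mean the candidate's interaction pattern is closer
    to the reference's. Both footprints must cover the same residues.
    """
    if len(reference) != len(candidate):
        raise ValueError("footprints must have the same number of residues")
    return math.sqrt(sum((r - c) ** 2 for r, c in zip(reference, candidate)))


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| on fingerprint bit sets."""
    union = a | b
    if not union:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


# Made-up per-residue energies (kcal/mol) for three residues.
ref_fp = [-1.0, -2.0, 0.0]
cand_fp = [-1.0, 1.0, 4.0]
print(footprint_distance(ref_fp, cand_fp))  # 5.0

# Made-up fingerprints sharing two of four distinct bits.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

A Tanimoto value of 1 under this definition means the two fingerprints are identical, which is how the "Tanimoto not equal to 1" note in the Figure 6 caption can be read.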
