Review
J Mol Model. 2021 Feb 4;27(3):71.
doi: 10.1007/s00894-021-04674-8.

Generative chemistry: drug discovery with deep learning generative models

Yuemin Bian et al. J Mol Model.

Abstract

The de novo design of molecular structures using deep learning generative models offers an encouraging solution for drug discovery in the face of the continuously increasing cost of new drug development. From the generation of original texts, images, and videos to the design of novel molecular structures from scratch, the creativity of deep learning generative models shows the heights that machine intelligence can reach. The purpose of this paper is to review the latest advances in generative chemistry, which relies on generative modeling to expedite the drug discovery process. This review starts with a brief history of artificial intelligence in drug discovery to outline this emerging paradigm. Commonly used chemical databases, molecular representations, and tools in cheminformatics and machine learning are covered as the infrastructure for generative chemistry. The review then focuses on detailed discussions of cutting-edge generative architectures for compound generation, including the recurrent neural network, variational autoencoder, adversarial autoencoder, and generative adversarial network. Challenges and future perspectives follow.

Keywords: Adversarial autoencoder; Deep learning; Drug discovery; Generative adversarial network; Generative model; Recurrent neural network; Variational autoencoder.

Conflict of interest statement

Competing interest: The authors declare no competing interest.

Figures

Fig. 1
From artificial intelligence to deep learning. a The programming paradigm for symbolic AI. b The programming paradigm for ML. c The relationship among artificial intelligence, machine learning, and deep learning
Fig. 2
The RNN, the LSTM, and their application in generative chemistry. a Schematic illustration of the RNN, a neural network with an internal loop. b Schematic illustration of data processing with the LSTM. c A typical framework for building RNN-based generative models for molecule generation
Fig. 3
The autoencoder and the variational autoencoder. a An autoencoder encodes input molecules into compressed representations and decodes them back. b A variational autoencoder maps the molecules into the parameters of a statistical distribution as the latent space is a continuous numerical representation
Fig. 4
The illustrated architecture of an adversarial autoencoder. A discriminator network is appended to calculate the adversarial cost for discriminating p(z) from qφ(z). As a result, the latent space produced by the encoder is driven to follow the prior distribution
Fig. 5
Sample architecture of the convolutional neural network and the framework of a generative adversarial network. a The careful selection and arrangement of convolutional layers, pooling layers, and dense layers, etc. constitute a convolutional neural network. b The generative adversarial network comprises two modules, the generator and the discriminator. Both the generative loss and discriminative loss are monitored during the training process
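
The captions for Fig. 3b and Fig. 5b describe two ideas that a few lines of code may make more concrete: a variational autoencoder that maps an input to the parameters of a distribution, and a generative adversarial network whose generator and discriminator losses are tracked during training. The following is a minimal sketch in plain Python, not the architecture used in any of the reviewed works. It assumes a diagonal Gaussian latent distribution with a standard normal prior and the common non-saturating generator loss; all function names and numeric values are hypothetical and chosen only for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples labeled 1, generated samples labeled 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Non-saturating generator loss: lower when the discriminator is fooled."""
    return -math.log(d_fake)


# Hypothetical encoder output for one molecule with a 2-dimensional latent space.
mu = [0.5, -1.0]
log_var = [0.0, -2.0]
rng = random.Random(0)  # fixed seed so the sketch is repeatable

z = reparameterize(mu, log_var, rng)      # a point in the continuous latent space
kl = kl_to_standard_normal(mu, log_var)   # regularizer pulling q(z|x) toward the prior

# Hypothetical discriminator outputs (probability that a sample is real).
d_loss = discriminator_loss(d_real=0.9, d_fake=0.2)
g_loss = generator_loss(d_fake=0.2)
```

In a real model the means, log-variances, and discriminator outputs would come from trained neural networks. The adversarial autoencoder in Fig. 4 applies the same kind of discriminator loss to latent vectors rather than to generated molecules.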
