A PyTorch implementation of “GxVAEs: Two Joint VAEs Generate Hit Molecules from Gene Expression Profiles”. The paper has been accepted by AAAI 2024 (main track paper and oral presentation).
This implementation was developed by Chen Li (li.chen.z2@a.mail.nagoya-u.ac.jp) and Yoshihiro Yamanishi (yamanishi@i.nagoya-u.ac.jp), affiliated with the Department of Complex Systems Science at the Graduate School of Informatics, Nagoya University, Japan, at the time of release.
GxVAEs aim to
- generate hit-like molecules from gene expression profiles.
- generate therapeutic molecules from patients’ disease profiles.
Execute the following commands to create and activate the conda environment:
$ conda env create -n gxvaes_env -f gxvaes_env.yml
$ source activate gxvaes_env
- The datasets folder
- LINCS/mcf7.csv: The training and validation dataset, consisting of gene expression profiles of the MCF7 cell line treated with 13,755 molecules.
- The tools folder
- main.py: Defines the main function for training the ProfileVAE and MolVAE models.
- ProfileVAE.py: Defines the ProfileVAE model for extracting gene expression profile features.
- train_gene_vae.py: Code for training the ProfileVAE model.
- MolVAE.py: Defines the MolVAE model to generate SMILES strings with extracted gene features.
- train_smiles_vae.py: Code for training the MolVAE model.
- utils.py: Defines other functions used in GxVAEs.
- STEP 1: Pretrain ProfileVAE:
$ python main.py --train_gene_vae
- STEP 2: Test the trained ProfileVAE:
$ python main.py --test_gene_vae
- STEP 3: Train MolVAE:
$ python main.py --train_smiles_vae
- STEP 4: Test the trained MolVAE:
$ python main.py --test_smiles_vae
- STEP 5: Generate molecules for the 10 ligands using GxVAEs:
$ python main.py --generation
- STEP 6: Calculate Tanimoto similarity between a source ligand and generated SMILES strings:
$ python main.py --calculate_tanimoto --protein_name ***
Note that '***' indicates a protein name, such as 'AKT1'.
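For readers unfamiliar with the metric used in STEP 6, the following is a minimal sketch of Tanimoto similarity, not the code in this repository. It assumes each molecule has already been converted to a set of "on" fingerprint bit indices (in practice a cheminformatics toolkit would compute these from the SMILES strings). The function names `tanimoto` and `rank_by_similarity`, the molecule labels, and the bit sets are hypothetical and chosen only for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity |A & B| / |A | B| between two sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        # Convention assumed in this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_similarity(source_bits, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(source_bits, bits)) for name, bits in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy fingerprints (made-up bit indices, not real molecules).
source_bits = {1, 4, 7, 9}
generated = {
    "mol_1": {1, 4, 7, 9},  # identical to the source
    "mol_2": {1, 4, 8},     # shares 2 of 5 distinct bits
    "mol_3": {2, 3},        # shares no bits
}

ranked = rank_by_similarity(source_bits, generated)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A score of 1.0 means the two fingerprints are identical and 0.0 means they share no bits, so higher values indicate generated molecules that are structurally closer to the source ligand.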
If you have any questions, please feel free to contact Chen Li at li.chen.z2@a.mail.nagoya-u.ac.jp.
C. Li and Y. Yamanishi (2024). GxVAEs: Two Joint VAEs Generate Hit Molecules from Gene Expression Profiles.
BibTeX format:
@inproceedings{li2024gxvaes,
title={GxVAEs: Two Joint VAEs Generate Hit Molecules from Gene Expression Profiles},
author={Li, Chen and Yamanishi, Yoshihiro},
booktitle={Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024)},
year={2024}
}