Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures (NeurIPS 2022)
conda env create -f env.yaml -n diffab
conda activate diffab
The default cudatoolkit
version is 11.3. You may change it in env.yaml
.
Protein structures in the SAbDab
dataset can be downloaded here. Extract all_structures.zip
into the data
folder.
The data
folder contains a snapshot of the dataset index (sabdab_summary_all.tsv
). You may replace the index with the latest version here.
Trained model weights are available here (Hugging Face) or here (Google Drive).
HDOCK is required to design CDRs for antigens without bound antibody frameworks. Please download HDOCK here and put the hdock
and createpl
programs into the bin
folder.
PyRosetta is required to relax the generated structures and compute binding energy. Please follow the instruction here to install.
Ray is required to relax and evaluate the generated antibodies. Please install Ray using the following command:
pip install -U ray
5 design modes are available. Each mode corresponds to a config file in the configs/test
folder:
Config File | Description |
---|---|
codesign_single.yml |
Sample both the sequence and structure of one CDR. |
codesign_multicdrs.yml |
Sample both the sequence and structure of all the CDRs simultaneously. |
abopt_singlecdr.yml |
Optimize the sequence and structure of one CDR. |
fixbb.yml |
Sample only the sequence of one CDR (fix-backbone sequence design). |
strpred.yml |
Sample only the structure of one CDR (structure prediction). |
Below is the usage of design_pdb.py
. It samples CDRs for antibody-antigen complexes. The full list of options can be found in diffab/tools/runner/design_for_pdb.py
.
python design_pdb.py \
<path-to-pdb> \
--heavy <heavy-chain-id> \
--light <light-chain-id> \
--config <path-to-config-file>
The --heavy
and --light
options can be omitted as the script can automatically identify them with AbNumber and ANARCI.
The below example designs the six CDRs separately for the 7DK2_AB_C
antibody-antigen complex.
python design_pdb.py ./data/examples/7DK2_AB_C.pdb \
--config ./config/test/codesign_single.yml
HDOCK is required to design antibodies for antigens without bound antibody structures (see above for instructions on installing HDOCK). Below is the usage of design_dock.py
.
python design_dock.py \
--antigen <path-to-antigen-pdb> \
--antibody <path-to-antibody-template-pdb> \
--config <path-to-config-file>
The --antibody
option is optional and the default antibody template is 3QHF_Fv.pdb
. The full list of options can be found in the script.
Below is an example that designs antibodies for SARS-CoV-2 Omicron RBD.
python design_dock.py \
--antigen ./data/examples/Omicron_RBD.pdb \
--config ./config/test/codesign_multicdrs.yml
python train.py ./configs/train/<config-file-name>
@inproceedings{luo2022antigenspecific,
title={Antigen-Specific Antibody Design and Optimization with Diffusion-Based Generative Models for Protein Structures},
author={Shitong Luo and Yufeng Su and Xingang Peng and Sheng Wang and Jian Peng and Jianzhu Ma},
booktitle={Advances in Neural Information Processing Systems},
editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
year={2022},
url={https://openreview.net/forum?id=jSorGn2Tjg}
}