Features:
- Built on a microservices architecture
- Launch all services with a single docker-compose up
- Each service runs in an isolated Docker container
- Submit tasks through a RESTful API (FastAPI)
- Separate task queues and concurrency control
- Flower for monitoring the Celery tasks
AIRFold is an open-source platform for protein structure prediction.
Please follow these steps:
- Install Docker.
  - Install the NVIDIA Container Toolkit for GPU support.
  - Set up Docker to run as a non-root user.
- Clone this repository, cd into it, and start the services:
    git clone https://github.com/health-air/AIRFold
    cd ./AIRFold
    docker-compose up
- Check the pages:
  - Submit page: http://127.0.0.1
  - FastAPI page: http://127.0.0.1:8081/docs
  - Tasks monitor page (powered by Flower): http://127.0.0.1:5555
Note: change the IP address and ports to match your deployment. They are specified in docker-compose.yml.
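Once the containers are running, a quick way to confirm that the three pages respond is a small script like the one below. It is only a sketch: the function names and the idea of a reachability check are illustrative and not part of AIRFold, and the default host and ports are simply the ones listed above, so adjust them if you changed docker-compose.yml.

```python
from http.client import HTTPException
from urllib.request import urlopen


def service_urls(host="127.0.0.1", submit_port=80, api_port=8081, flower_port=5555):
    """Build the three page URLs listed above for a given host and ports."""
    # Port 80 is the HTTP default, so it is omitted from the submit URL.
    submit = f"http://{host}" if submit_port == 80 else f"http://{host}:{submit_port}"
    return {
        "submit": submit,
        "fastapi_docs": f"http://{host}:{api_port}/docs",
        "flower": f"http://{host}:{flower_port}",
    }


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP GET to `url` completes with a status below 400."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (OSError, HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def report(**kwargs):
    """Print one line per service: name, URL and whether it answered."""
    for name, url in service_urls(**kwargs).items():
        state = "up" if is_reachable(url) else "unreachable"
        print(f"{name:12s} {url:32s} {state}")
```

Calling `report()` after `docker-compose up` prints one status line per page. Pass `host=` or the port arguments if your deployment differs from the defaults.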
Genomics and metagenomics sequence databases
- BFD
- MGnify
- UniRef90
- NR database for BLAST
- Genomics and metagenomics sequence databases for DeepMSA2
- ColabFold dataset for MMseqs2
- Small BFD sequence database
- UniProt sequence database
Structure databases
Data structure
├── model_params (models and parameters for AlphaFold2, RoseTTAFold2, etc.)
├── bfd
├── blast_dbs
├── JGIclust
├── metaclust
├── mgnify
├── pdb70
├── pdb_mmcif
├── small_bfd
├── uniclust30
├── uniref30
└── uniref90
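Before launching the services, it can help to check that the data directory matches the layout above. The following is a minimal sketch, not part of AIRFold: the helper name `missing_subdirs` and the example path are hypothetical, and the sub-directory names are copied from the tree.

```python
from pathlib import Path

# Sub-directory names copied from the tree above.
EXPECTED_SUBDIRS = (
    "model_params", "bfd", "blast_dbs", "JGIclust", "metaclust", "mgnify",
    "pdb70", "pdb_mmcif", "small_bfd", "uniclust30", "uniref30", "uniref90",
)


def missing_subdirs(data_root, expected=EXPECTED_SUBDIRS):
    """Return the expected sub-directories that are absent under `data_root`."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, `missing_subdirs("/path/to/data")` returns an empty list when every directory is present, and otherwise lists the names that still need to be downloaded or linked. The path here is a placeholder for wherever your databases are stored.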
MSA-based structure prediction
Single sequence-based structure prediction
Multiple sequence alignment generation
Multiple sequence alignment selection
Protein model quality assessment
# Input: Protein sequences in fasta format.
# Output: Multiple sequence alignment results in a3m format.
python run_mode.py --input_path example.fasta --mode msa
# Input: Protein sequences in fasta format.
# Output: Generated sequence embeddings in pickle format.
python run_mode.py --input_path example.fasta --mode feature
# Input: Protein sequences in fasta format.
# Output: Generated contact map in pickle format.
python run_mode.py --input_path example.fasta --mode disgram
# Input: Protein sequences in fasta format.
# Output: Protein structure in pdb format.
python run_mode.py --input_path example.fasta --mode pipline
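The commands above can also be driven from Python, for instance to batch several inputs. This is a sketch under stated assumptions: the wrapper functions are hypothetical, only the `--input_path` and `--mode` flags shown above are used, and the mode names are copied verbatim from those commands. Whether the modes are independent of each other is not stated here, so treat running them back to back as an illustration only.

```python
import subprocess
import sys

# Mode names are copied verbatim from the commands above.
MODES = ("msa", "feature", "disgram", "pipline")


def build_command(input_path, mode, script="run_mode.py"):
    """Return the argument list for one run_mode.py invocation."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    # sys.executable reuses the interpreter that is running this script.
    return [sys.executable, script, "--input_path", input_path, "--mode", mode]


def run_modes(input_path, modes=MODES):
    """Run each requested mode in turn, stopping at the first failure."""
    for mode in modes:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_command(input_path, mode), check=True)
```

`run_modes("example.fasta", modes=("msa",))` is equivalent to the first command above. It must be called from the directory that contains run_mode.py.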
If you find our open-source code and models helpful to your research, please consider starring 🌟 and citing 📑 this repo. Thank you for your support!
@misc{AIRFold_code,
author={Hongliang, Li and Xin, Hong and Yanyan, Lan},
title={Code of AIRFold},
year={2024},
howpublished = {\url{https://github.com/health-air/AIRFold}}
}
We also recommend that you cite the third-party tools (listed above) that you use.
Copyright 2024 health-air.
Extended from AlphaFold, OpenComplex is licensed under the permissive Apache License, Version 2.0.
If you encounter problems using AIRFold, feel free to create an issue! We also welcome pull requests from the community.
For help or issues using the repository, please submit a GitHub issue.
For other communications, please contact Yanyan Lan (lanyanyan@air.tsinghua.edu.cn).