AIRFold

Features:

  • Microservices-based architecture
  • Launch everything with a single docker-compose up
  • Each service runs in an isolated Docker container
  • Submit tasks through a RESTful API (FastAPI)
  • Separate task queues with concurrency control
  • Flower for monitoring Celery tasks

Introduction

AIRFold Framework

AIRFold is an open-source platform for protein structure prediction.

Quick Start

Installation and running your first prediction

Please follow these steps:

  1. Install Docker.

  2. Clone this repository, cd into it, and start the services.

    git clone https://github.com/health-air/AIRFold
    cd ./AIRFold
    docker-compose up
  3. Check the page:

Note: change the IP address and ports as needed; they are specified in docker-compose.yml.
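Once the containers are up, it can be handy to confirm that a service port is actually accepting connections before opening the page. The following is a minimal sketch, not part of AIRFold: `is_port_open` is a hypothetical helper, and the host and port in the usage note are placeholders to be replaced with the values from your docker-compose.yml.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_port_open("127.0.0.1", 8000)` checks a placeholder address; substitute the IP address and port you configured.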

Databases for AIRFold

Genomics and metagenomics sequence databases

Structure databases

Data structure

├── model_params (models and parameters for AlphaFold2, RoseTTAFold2, etc.)
├── bfd
├── blast_dbs
├── JGIclust
├── metaclust
├── mgnify
├── pdb70
├── pdb_mmcif
├── small_bfd
├── uniclust30
├── uniref30
└── uniref90
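Before launching the services, the layout above can be sanity-checked with a short script. This is a minimal sketch under stated assumptions: `missing_databases` is a hypothetical helper and the root path in the usage note is a placeholder; only the directory names come from the tree above.

```python
from pathlib import Path

# Directory names taken from the tree above.
EXPECTED_DIRS = [
    "model_params", "bfd", "blast_dbs", "JGIclust", "metaclust", "mgnify",
    "pdb70", "pdb_mmcif", "small_bfd", "uniclust30", "uniref30", "uniref90",
]


def missing_databases(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_databases("/path/to/databases")` (a placeholder path) returns an empty list when every directory is present.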

Third-party tools

MSA-based structure prediction

Single sequence-based structure prediction

Multiple sequence alignment generation

Multiple sequence alignment selection

Protein model quality assessment

Commands for different functions

Multiple sequence alignment generation

# Input: Protein sequences in fasta format.
# Output: Multiple sequence alignment results in a3m format.
python run_mode.py --input_path example.fasta --mode msa
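Every mode takes a FASTA file as input, so it can help to inspect that file before submitting it. The sketch below is an illustration only and is not taken from the AIRFold code base: `read_fasta` is a hypothetical helper that reads a FASTA file into (header, sequence) pairs.

```python
def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence lines may be wrapped; join them later.
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For instance, `read_fasta("example.fasta")` lets you confirm the number of sequences and their lengths before running a job.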

Pretrained embedding generation

# Input: Protein sequences in fasta format.
# Output: Generated sequence embeddings in pickle format.
python run_mode.py --input_path example.fasta --mode feature

Protein contact map prediction

# Input: Protein sequences in fasta format.
# Output: Generated contact map in pickle format.
python run_mode.py --input_path example.fasta --mode disgram
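The feature and contact map modes both write pickle files. Neither the output location nor the structure of the stored object is specified here, so the sketch below only loads a file and reports its top-level type. `load_result` and `describe` are hypothetical helpers, and the file path is whatever your run produced. As with any pickle, only load files you trust.

```python
import pickle


def load_result(path):
    """Load a pickled result and return the Python object it contains."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Summarise the top-level structure of a loaded result."""
    if isinstance(obj, dict):
        return "dict with keys: " + ", ".join(sorted(map(str, obj)))
    return type(obj).__name__
```

Printing `describe(load_result(path))` is a quick way to see what a result file contains before writing downstream code against it.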

Protein structure prediction

# Input: Protein sequences in fasta format.
# Output: Protein structure in pdb format.
python run_mode.py --input_path example.fasta --mode pipline
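The four commands above differ only in the `--mode` value, so they are easy to drive from a script. This is a minimal sketch assuming it is run from the repository root, where `run_mode.py` lives; `build_command` and `run` are hypothetical wrappers, and the mode names are copied exactly as they appear in the commands above.

```python
import subprocess
import sys

# Mode names exactly as they appear in the commands above.
MODES = ("msa", "feature", "disgram", "pipline")


def build_command(input_path, mode):
    """Build the argument list for one run_mode.py invocation."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return [sys.executable, "run_mode.py",
            "--input_path", input_path, "--mode", mode]


def run(input_path, mode):
    """Run run_mode.py; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(input_path, mode), check=True)
```

For example, `run("example.fasta", "msa")` is equivalent to the first command in this section, and an unsupported mode is rejected before any process is started.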

Citation

If you find our open-source code and models helpful for your research, please consider starring 🌟 and citing 📑 this repo. Thank you for your support!

@misc{AIRFold_code,
  author={Hongliang, Li and Xin, Hong and Yanyan, Lan},
  title={Code of AIRFold},
  year={2024},
  howpublished = {\url{https://github.com/health-air/AIRFold}}
}

We also recommend that you cite the third-party tools (listed above) that you use.

License and Disclaimer

Copyright 2024 health-air.

Extended from AlphaFold, OpenComplex is licensed under the permissive Apache License, Version 2.0.

Contributing

If you encounter problems using AIRFold, feel free to create an issue! We also welcome pull requests from the community.

Contact Information

For help or issues using this repository, please submit a GitHub issue.

For other communications, please contact Yanyan Lan (lanyanyan@air.tsinghua.edu.cn).