This server is a protein structure prediction tool that processes prediction requests from users and returns various scores for protein sequences.
To install the environment, follow these steps:
git clone https://github.com/Oaklight/protein-score-server.git
cd protein-score-server
conda env create -f env/environment.yaml
conda activate esm
pip install -r env/requirements.txt
Configuration File:
- Copy server.yaml.sample to server.yaml:
  cp server.yaml.sample server.yaml
- Edit server.yaml with your settings.
The server uses the server.yaml file for configuration. Currently configurable items include:
- api_key: API key for Hugging Face Hub login.
- history_path: History result storage path.
- intermediate_pdb_path: Intermediate PDB file storage path.
- model: Model configuration.
  - name: Model name, esmfold or protenix (ByteDance's AlphaFold3 implementation).
  - replica: Mapping from GPU device to number of replicas, in <device>: <num_replica> format. For esmfold, use _: <num_replica> instead.
- task_queue_size: Task queue size, defaults to 50.
- timeout: Timeout for async prediction result retrieval, defaults to 15 seconds.
- backbone_pdb:
  - reversed_index: Path for the reverse index from PDB ID to PDB file path.
  - parquet_prefix: Path prefix for parquet files.
  - pdb_prefix: Path prefix for PDB files.
For an example, see server.yaml
After the configuration is set, run these commands inside the project folder:
conda activate esm
uvicorn main:app --host 0.0.0.0 --port 8000
Users can send POST requests to http://your-host:8000/predict/ to get predictions. The request body comprises these fields: seq, name, type, seq2.
- seq: String, representing the protein sequence.
- name: String, representing the name of the reference protein.
- type: String, representing the task type. Currently supported: "plddt", "tmscore", "sc-tmscore", "pdb".
- seq2: String, representing the sequence of the reference protein. Used only for the sc-tmscore task, where you may provide either seq2 or name.
Example request bodies for each task type:
- pLDDT
{
"seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
"type": "plddt"
}
- TMscore
{
"seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
"name": "1a0a.A", # must provide for tasks that require a reference structure
"type": "tmscore"
}
- sc-TMscore
{
"seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
"seq2": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST", # choose to provide either seq2 or name
"type": "sc-tmscore"
}
or
{
"seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
"name": "1a0a.A", # choose to provide either seq2 or name
"type": "sc-tmscore"
}
- pdb
{
"seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
"type": "pdb"
}
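The request bodies above can also be assembled programmatically. The following is a minimal Python sketch, not part of the server: build_payload and SUPPORTED_TYPES are hypothetical names, and the checks only mirror the field rules described in this README. The server's own validation remains authoritative.

```python
# Task types listed in this README.
SUPPORTED_TYPES = {"plddt", "tmscore", "sc-tmscore", "pdb"}


def build_payload(seq, task_type, name=None, seq2=None):
    """Assemble a /predict/ request body (hypothetical client-side helper)."""
    if not seq:
        raise ValueError("seq must be a non-empty protein sequence")
    if task_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    # tmscore compares against a reference structure, so name is required.
    if task_type == "tmscore" and name is None:
        raise ValueError("tmscore needs a reference structure name")
    # sc-tmscore accepts the reference either as a sequence or as a name.
    if task_type == "sc-tmscore" and name is None and seq2 is None:
        raise ValueError("sc-tmscore needs either seq2 or name")

    payload = {"seq": seq, "type": task_type}
    if name is not None:
        payload["name"] = name
    if seq2 is not None:
        payload["seq2"] = seq2
    return payload
```

For instance, build_payload(seq, "tmscore", name="1a0a.A") yields the same structure as the TMscore example, and omitting name for that task raises ValueError before any request is sent.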
The server will return a JSON response containing two fields: job_id and prediction.
- job_id: String, representing the task ID.
- prediction: String, currently only indicating that the prediction is being processed.
{
"job_id": "0a98a981748c4b7eacfd5e0957905ced", # this is a uuid4 hex string
"prediction": ... # not very useful at this moment
}
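Submitting a job could look like the sketch below, which uses only the Python standard library. make_predict_request and submit are hypothetical helper names, http://your-host:8000 is the placeholder host used in this README, and sending Content-Type: application/json on the POST is an assumption.

```python
import json
import urllib.request


def make_predict_request(base_url, payload):
    """Build (but do not send) a POST request for the /predict/ endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(base_url, payload, timeout=30):
    """Send the request and return the job_id from the JSON response."""
    request = make_predict_request(base_url, payload)
    # urlopen raises urllib.error.HTTPError for 4xx responses such as 400 or 429.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["job_id"]
```

Keeping request construction separate from sending makes it possible to inspect the URL and body without a running server.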
Users can send GET requests to http://your-host:8000/result/{job_id} to get prediction results. The header of the request should contain Content-Type: application/json.
The server will return a JSON response containing two fields: job_id and prediction.
{
"job_id": "0a98a981748c4b7eacfd5e0957905ced", # this is a uuid4 hex string
"prediction": 0.983124
}
- 202: Job is being processed. Please wait.
- 400: Task input information error. Check detailed messages.
- 404: Task ID does not exist in server records.
- 429: Server job queue is currently full. Please wait.
- An exponential backoff strategy with a base of 3 is recommended when querying for results.
- An example of querying is available in test.py.
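One way to read the backoff recommendation is sketched below in Python. It is an illustration under stated assumptions, not the code in test.py: "base of 3" is interpreted as waiting 3**attempt seconds between polls, a finished job is assumed to come back with status 200, and fetch is a hypothetical callable returning (status_code, body_dict) for a job ID.

```python
import time


def backoff_delays(base=3, max_attempts=5):
    """Yield wait times base**0, base**1, ... in seconds (assumed reading)."""
    for attempt in range(max_attempts):
        yield base ** attempt


def poll_result(fetch, job_id, base=3, max_attempts=5, sleep=time.sleep):
    """Poll until the job finishes, backing off while the server answers 202."""
    for delay in backoff_delays(base, max_attempts):
        status, body = fetch(job_id)
        if status == 200:  # assumed success code
            return body["prediction"]
        if status == 202:  # job is still being processed
            sleep(delay)
            continue
        if status == 404:
            raise KeyError(f"job {job_id} does not exist in server records")
        raise RuntimeError(f"unexpected status {status}: {body}")
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")
```

Passing fetch and sleep as parameters keeps the polling logic independent of any particular HTTP client and lets it be exercised without real waiting.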
To stop the server, use Ctrl+C in the terminal where the server is running.
This server is licensed under the Apache License 2.0.