BASE: a web service for providing compound-protein binding affinity prediction datasets with reduced similarity bias

Overview

Deep learning-based drug-target affinity (DTA) prediction models have shown high performance but suffer from dataset bias. Our study investigates this bias using comprehensive databases and demonstrates that compound-protein binding affinity can often be predicted from compound features alone, because the target proteins are highly similar to one another. We developed bias-reduced datasets by decreasing protein similarity between the training and test sets, which improved model performance and balanced the importance of compound and protein features.
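As a rough illustration of the idea of lowering protein similarity between training and test sets, the sketch below groups proteins into clusters by a pairwise similarity threshold and assigns whole clusters to one side of the split. It is not the procedure used to build the BASE datasets: the function names, the `(compound_id, protein_id, affinity)` record layout, the similarity dictionary, and the threshold value are all assumptions made for this example.

```python
from collections import defaultdict


def cluster_proteins(proteins, similarity, threshold):
    """Group proteins so any pair with similarity >= threshold shares a cluster.

    `similarity` maps (protein_a, protein_b) pairs to a score (hypothetical input).
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for p in proteins:
        clusters[find(p)].append(p)
    return list(clusters.values())


def split_by_protein_cluster(records, similarity, threshold, test_fraction=0.2):
    """Assign whole protein clusters to train or test.

    `records` is a list of (compound_id, protein_id, affinity) tuples.
    """
    proteins = sorted({protein for _, protein, _ in records})
    clusters = cluster_proteins(proteins, similarity, threshold)
    # Deterministic order: smallest clusters first (an arbitrary choice).
    clusters.sort(key=lambda c: (len(c), c))

    counts = defaultdict(int)
    for _, protein, _ in records:
        counts[protein] += 1

    target = test_fraction * len(records)
    test_proteins, n_test = set(), 0
    for cluster in clusters:
        if n_test >= target:
            break
        test_proteins.update(cluster)
        n_test += sum(counts[p] for p in cluster)

    train = [r for r in records if r[1] not in test_proteins]
    test = [r for r in records if r[1] in test_proteins]
    return train, test


if __name__ == "__main__":
    # Toy data, invented for this example only.
    records = [
        ("c1", "P1", 7.2), ("c2", "P1", 6.1), ("c1", "P2", 5.0),
        ("c3", "P3", 8.3), ("c2", "P4", 4.9),
    ]
    similarity = {
        ("P1", "P2"): 0.90, ("P1", "P3"): 0.20, ("P2", "P3"): 0.25,
        ("P1", "P4"): 0.15, ("P2", "P4"): 0.10, ("P3", "P4"): 0.10,
    }
    train, test = split_by_protein_cluster(records, similarity, threshold=0.4)
    print("train:", train)
    print("test:", test)
```

Because clusters are never divided, no test protein has a listed similarity at or above the threshold to any training protein. In practice the similarity scores would come from a sequence comparison tool, which this sketch leaves out.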

We introduce the Binding Affinity Similarity Explorer (BASE) web service, which offers bias-reduced datasets and prediction results to aid in the development of generalized and robust DTA models. BASE is freely available at https://synbi2024.kaist.ac.kr/base.

Installation Instructions

To run the project locally, clone the repository:

git clone https://github.com/SYNBI-KAIST/HJ-DTA-DataBias.git
cd HJ-DTA-DataBias

Journal & Contact Info

BASE: a web service for providing compound-protein binding affinity prediction datasets with reduced similarity bias.

Hyojin Son*, Sechan Lee, Jaeuk Kim, Haangik Park, Myeong-Ha Hwang and Gwan-Su Yi†

Acknowledgement

This work was supported by the BK-21 program through the National Research Foundation of Korea (NRF) under the Ministry of Education.

License

The code in this repository is licensed under the MIT License. See the LICENSE file for more details.

The data in the data folder is licensed under the CC BY 4.0 International license. See the LICENSE file for more details.
