Bioinformatics. 2024 Jul 1;40(7):btae405.
doi: 10.1093/bioinformatics/btae405.

Geometric epitope and paratope prediction


Marco Pegoraro et al. Bioinformatics. 2024.

Abstract

Motivation: Identifying the binding sites of antibodies is essential for developing vaccines and synthetic antibodies. In this article, we investigate the optimal representation for predicting the binding sites in the two molecules and emphasize the importance of geometric information.

Results: We compare different geometric deep learning methods applied to proteins' inner (I-GEP) and outer (O-GEP) structures. To fully leverage the geometric information, we incorporate 3D coordinates and spectral geometric descriptors as input features. Our results suggest that different geometric representations are useful for different tasks: surface-based models are more efficient at predicting the epitope, while graph models are better at paratope prediction, and both achieve significant performance improvements. Moreover, we analyze the impact of structural changes in antibodies and antigens that result from conformational rearrangements or reconstruction errors. Through this investigation, we show that geometric deep learning methods and spectral geometric descriptors are robust to such perturbations.
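To make "spectral geometric descriptors" concrete, the sketch below computes a heat kernel signature (the "hks" label in Figure 4 is read here as this descriptor, which is an assumption). It is a minimal illustration, not the authors' pipeline: the radius graph, the unnormalized graph Laplacian, the radius value, the time samples, and the random toy coordinates are all assumptions introduced for the example, and the helper names `graph_laplacian` and `heat_kernel_signature` are hypothetical.

```python
import numpy as np


def graph_laplacian(coords, radius):
    """Unnormalized Laplacian L = D - A of a radius graph (assumed construction)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dist < radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return np.diag(adj.sum(axis=1)) - adj


def heat_kernel_signature(coords, times, radius=4.0):
    """Return an (n_points, n_times) array with
    HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2."""
    evals, evecs = np.linalg.eigh(graph_laplacian(coords, radius))
    return (evecs ** 2) @ np.exp(-np.outer(evals, times))


# Toy point cloud standing in for residue or surface coordinates (hypothetical data).
rng = np.random.default_rng(0)
coords = rng.normal(scale=3.0, size=(30, 3))
times = np.array([0.0, 0.5, 2.0])
hks = heat_kernel_signature(coords, times)

# Apply a rigid motion: rotation about the z axis plus a translation.
angle = 0.7
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle), np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])
hks_moved = heat_kernel_signature(coords @ rot.T + np.array([5.0, -2.0, 1.0]), times)
print(np.allclose(hks, hks_moved))
```

Because the descriptor in this sketch depends only on pairwise distances, it does not change when the whole structure is rotated or translated, unlike raw 3D coordinates. How the descriptor behaves under the conformational changes studied in the paper is a separate question that this toy example does not address.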

Availability and implementation: The Python code for the models, together with the data and the processing pipeline, is open-source and available at https://github.com/Marco-Peg/GEP.


Conflict of interest statement

None declared.

Figures

Figure 1.
The GEP model processes an unbound antibody–antigen pair as input, predicting the probability of each residue binding with the counterpart molecule. Predicted binding residues are visually emphasized by colored circles (blue for antibody CDR and red for the antigen), with the filled circles indicating the predicted residues. The corresponding bound pair is illustrated on the right.
Figure 2.
Model architectures: The layers or modules are depicted as color-coded blocks, with the text inside indicating the respective layer type. The dimensions of each layer are given in parentheses: graph convolutional layer (GCN), E(n) invariant layer (EGNN), graph attention layer (GAT), and fully connected layer (FC). The arrows indicate the data flow from one module to the next. Additional details about the transformations applied to the input are annotated. (a) GCN I-GEP. (b) EGNN I-GEP.
Figure 3.
O-GEP model architecture: color-coded blocks represent layers or modules, with the text inside each block specifying the layer type, and arrows indicate the data flow between modules. The model takes antibody-antigen pairs with surface point-level features as input and produces a binding probability for each input point. (a) Overall structure of the O-GEP model. (b) Geometric module: The protein representation is first passed through an MLP layer before entering the diffusion block as defined in Sharp et al. (2022). The local and global features are computed by applying the diffusion block once and n times, respectively. (c) Segmentation module: The outputs of the geometric module are concatenated into two vectors, one for the antigen and one for the antibody. These representations are then passed through the segmentation module to output the binding prediction on the antigen and antibody, respectively. The segmentation module is shared across the two representations and consists of convolutional layers.
Figure 4.
Qualitative comparison between the experimental and the AlphaFold 2 predicted complex ‘7e9b’. The continuous binding predictions are represented as a color gradient in blue and red for the antigen and antibody, respectively. (a) Secondary structure, (b) E(n)-EPMP, (c) PiNet (xyz+hks), (d) DiffNetpc (xyz), (e) DiffNetmesh (hks), (f) secondary structure, (g) E(n)-EPMP, (h) PiNet (xyz+hks), (i) DiffNetpc (xyz), and (j) DiffNetmesh (hks).
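The GCN and FC blocks named in the Figure 2 caption can be illustrated with a small sketch of one graph convolution followed by a fully connected output layer that yields a per-residue binding probability. This is only an illustration under stated assumptions: the standard symmetric-normalized graph convolution is used, and the toy adjacency matrix, feature size, hidden size, random weights, and the helper names `gcn_layer` and `residue_binding_probability` are hypothetical and not taken from the paper.

```python
import numpy as np


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm_adj = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)


def residue_binding_probability(adj, feats, w_gcn, w_fc):
    """GCN layer, then a fully connected layer with a sigmoid, one value per residue."""
    hidden = gcn_layer(adj, feats, w_gcn)
    logits = hidden @ w_fc
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))


# Toy residue graph with 4 nodes (hypothetical contacts and features).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(1)
feats = rng.normal(size=(4, 5))   # 5 input features per residue (assumed)
w_gcn = rng.normal(size=(5, 8))   # hidden size 8 (assumed)
w_fc = rng.normal(size=(8, 1))
probs = residue_binding_probability(adj, feats, w_gcn, w_fc)

# Relabeling the residues should only reorder the predictions.
perm = np.array([2, 0, 3, 1])
probs_perm = residue_binding_probability(adj[perm][:, perm], feats[perm], w_gcn, w_fc)
print(probs, probs_perm)
```

The permutation check at the end reflects a general property of graph layers: the prediction for a residue depends on its features and neighbors, not on the order in which residues are listed.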


