EMBO J. 2022 Dec 15;41(24):e111071.
doi: 10.15252/embj.2022111071. Epub 2022 Oct 31.

Subcellular location of source proteins improves prediction of neoantigens for immunotherapy

Andrea Castro et al. EMBO J.

Abstract

Antigen presentation via the major histocompatibility complex (MHC) is essential for anti-tumor immunity. However, the rules that determine which tumor-derived peptides will be immunogenic are still incompletely understood. Here, we investigated whether constraints on peptide accessibility to the MHC due to protein subcellular location are associated with peptide immunogenicity potential. Analyzing over 380,000 peptides from studies of MHC presentation and peptide immunogenicity, we find clear spatial biases in both eluted and immunogenic peptides. We find that including parent protein location improves the prediction of peptide immunogenicity in multiple datasets. In human immunotherapy cohorts, the location was associated with a neoantigen vaccination response, and immune checkpoint blockade responders generally had a higher burden of neopeptides from accessible locations. We conclude that protein subcellular location adds important information for optimizing cancer immunotherapies.

Keywords: Immunotherapy; immunogenicity; major histocompatibility complex; neoantigen; subcellular location.

Figures

Figure 1. Overview of location features from eluted peptides and T‐cell assayed neoepitopes from IEDB
  1. A

    Scatterplot of clustered UMAP location features (Materials and Methods) for source proteins of IEDB T‐cell assayed neoepitopes with COSMIC cancer genes highlighted.

  2. B

    Table annotating location clusters from the scatterplot with highlighted COSMIC cancer genes. PANTHER GO Slim CC used here for simplified terms. #Many terms so only the top five are shown.

  3. C, D

    (C) Area under the receiver operating characteristic curve (AUROC) and (D) area under the precision–recall curve (AUPRC) for 10‐fold cross‐validation using a Random Forest model to predict protein elution in 721.221 B cells incorporating location, matched expression, or both (Materials and Methods). The faded lines indicate the respective area under the curve for each split.

  4. E, F

    (E) AUROC and (F) AUPRC for 10‐fold cross‐validation using a Random Forest model to predict immunogenicity in IEDB assayed neopeptides incorporating peptide affinity, stability, and foreignness (Materials and Methods) with and without parent protein location features.

  5. G

    Barplot of model feature importance.
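The two metrics reported in panels C to F can be computed from held-out predictions as in the minimal sketch below. It is not the authors' pipeline: the Random Forest fitting is omitted, the fold contents are hypothetical toy values, and the function names (`auroc`, `average_precision`, `per_fold_metrics`) are illustrative. AUPRC is approximated here by average precision, which is one common convention.

```python
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Tied scores count as half a win, which matches the usual ROC area.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive.

    Simplification: tied scores are ranked in arbitrary (sort) order.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def per_fold_metrics(
    folds: Sequence[Tuple[Sequence[int], Sequence[float]]]
) -> List[Tuple[float, float]]:
    """Return (AUROC, average precision) for each held-out fold."""
    return [(auroc(y, s), average_precision(y, s)) for y, s in folds]


# Hypothetical held-out predictions for two folds (1 = positive class).
folds = [
    ([1, 0, 1, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]),
    ([1, 1, 0, 0], [0.8, 0.6, 0.5, 0.1]),
]
results = per_fold_metrics(folds)
mean_auroc = sum(r[0] for r in results) / len(results)
```

In a 10-fold setting, `folds` would hold ten such pairs. The per-fold values correspond to the faded lines described in the legend, and their mean gives the summary curve area.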

Figure 2. Predicting immunogenicity on the Wells validation dataset
  1. A, B

    (A) Area under the receiver operating characteristic curve (AUROC) and (B) area under the precision–recall curve (AUPRC) for the unseen validation dataset with and without parent protein location features.

  2. C

    Scatterplot of the predicted probabilities for unseen test neopeptides to be immunogenic with and without location as a feature. Dashed lines indicate the Youden index for each model, used for optimal threshold predictions. False positives are reduced in the model with location.
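The Youden index mentioned in panel C picks the probability cutoff that maximizes J = sensitivity + specificity - 1. The sketch below shows that selection on hypothetical toy scores. The function name `youden_threshold` and the rule that a peptide is called positive when its score is at or above the cutoff are assumptions for illustration, not details taken from the paper.

```python
from typing import Optional, Sequence, Tuple


def youden_threshold(
    labels: Sequence[int], scores: Sequence[float]
) -> Tuple[Optional[float], float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when score >= threshold. On ties in J,
    the lowest threshold is kept.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    best_t: Optional[float] = None
    best_j = -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical predicted probabilities (1 = immunogenic, 0 = not).
labels = [0, 0, 0, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.5, 0.7, 0.9]
threshold, j_stat = youden_threshold(labels, scores)

# False positives at the chosen cutoff: negatives scored at or above it.
false_positives = sum(
    1 for y, s in zip(labels, scores) if y == 0 and s >= threshold
)
```

Running the same selection separately for each model gives one cutoff per model, and the false-positive counts at those cutoffs can then be compared.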

Figure 3. ICB responders carry a higher burden of mutations in proteins from immunogenic locations
  1. A

    Predicted neoantigen burden versus response category in immunotherapy cohorts when retaining only mutations in proteins from subcellular locations previously observed to source immunogenic peptides.

  2. B

    Predicted neoantigen burden versus response category in immunotherapy cohorts where neoantigen status is predicted using a model trained on three sources of immunogenic peptide and features including peptide–MHC affinity, stability, and location.

  3. C, D

    Barplots of effect sizes between responders and non‐responders (C) where TMB is filtered to include only mutations from subcellular locations previously observed to source immunogenic peptides and (D) where neoantigen status is predicted using a model trained on three sources of immunogenic peptide and MHC affinity and stability, with and without location.

Data information: In panels A and B, the center line marks the median, boxes denote the interquartile range (IQR), whiskers denote the rest of the data distribution, and points denote outliers determined by ± 1.5 × IQR. Sample size indicates the number of patients. The Mann–Whitney U statistical test was performed.
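The boxplot conventions and the Mann–Whitney U statistic described above can be illustrated with the standard library alone. This is a sketch under stated assumptions: the burden values are hypothetical, the outlier fences follow the common Tukey convention (below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR), the quartile method is one of several in use, and no p-value is computed.

```python
import statistics
from typing import Dict, List, Sequence


def box_summary(values: Sequence[float]) -> Dict[str, object]:
    """Median, quartiles, and outliers beyond 1.5 * IQR from the box edges."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers: List[float] = [v for v in values if v < low_fence or v > high_fence]
    return {"median": median, "q1": q1, "q3": q3, "outliers": outliers}


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for x: number of (x, y) pairs with x > y, ties count 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical predicted neoantigen burdens per patient.
responders = [12.0, 15.0, 18.0, 22.0, 30.0, 95.0]
non_responders = [5.0, 8.0, 9.0, 14.0, 16.0, 20.0]

summary = box_summary(responders)
u_resp = mann_whitney_u(responders, non_responders)
u_non = mann_whitney_u(non_responders, responders)
```

The two U values always sum to the number of patient pairs (here 6 × 6 = 36). A U for responders close to that total means responders tend to carry the higher burden.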
Figure 4. Focusing on immunogenic locations improves response prediction in a gene panel profiled cohort
  1. A, B

    Tumor mutation burden focusing on (A) all genes in the gene panel and (B) the 40 genes whose proteins localize to previously observed immunogenic subcellular locations.

  2. C, D

    Kaplan–Meier curves showing the effect of the best presented mutation on progression‐free survival (C) using all genes in the panel and (D) using only the 40 genes of interest.

Data information: In panels A and B, the center line marks the median, boxes denote the interquartile range (IQR), whiskers denote the rest of the data distribution, and points denote outliers determined by ± 1.5 × IQR. Sample size indicates the number of patients. The Mann–Whitney U statistical test was performed.
