PLoS Comput Biol. 2022 Sep 30;18(9):e1010505.
doi: 10.1371/journal.pcbi.1010505. eCollection 2022 Sep.

Computational multiplex panel reduction to maximize information retention in breast cancer tissue microarrays

Luke Ternes et al. PLoS Comput Biol.

Abstract

Recent state-of-the-art multiplex imaging techniques have expanded the depth of information that can be captured within a single tissue sample by allowing for panels with dozens of markers. Despite this increase in capacity, space on the panel is still limited due to technical artifacts, tissue loss, and long imaging acquisition time. Selecting which markers to include on a panel is therefore important: removing important markers results in a loss of biologically relevant information, whereas identifying redundant markers frees room for other markers. To address this, we propose computational approaches to determine the amount of shared information between markers and select an optimally reduced panel that captures the maximum amount of information with the fewest markers. Here we examine several panel selection approaches and evaluate them on their ability to reconstruct the full panel images and information within breast cancer tissue microarray datasets, using cyclic immunofluorescence as a proof of concept. We show that all methods perform adequately and can re-capture cell types using only 18 of 25 markers (72% of the original panel size). The correlation-based selection methods achieved the best single-cell marker mean intensity predictions, with a Spearman correlation of 0.90 with the reduced panel. Using the proposed methods, researchers can design more efficient multiplex imaging panels that maximize the amount of information retained with a limited number of markers, with respect to certain evaluation metrics and architecture biases.

Conflict of interest statement

I have read the journal’s policy and the authors of this manuscript have the following competing interests: L.T., J.R.L., Y.C. and Y.H.C. have no competing interests. J.W.G. has licensed technologies to Abbott Diagnostics; has ownership positions in Convergent Genomics, Health Technology Innovations, Zorro Bio, and PDX Pharmaceuticals; serves as a paid consultant to New Leaf Ventures; and has received research support from Thermo Fisher Scientific (formerly FEI), Zeiss, Miltenyi Biotech, Quantitative Imaging, Health Technology Innovations, and Micron Technologies.

Figures

Fig 1
Fig 1. An illustrative schema for panel reduction and prediction: In order to select an optimally reduced panel from a designed full panel, four different selection methods were tested: intensity correlation-based, sparse subspace-based, gradient-based, and random selection.
Using the reduced panels selected by each method, an ME-VAE was used to impute the full panel set. The full set of imputations was then evaluated by comparing it to the original images using features important to downstream analyses, such as mean intensity correlations, the structural similarity index measure, and cluster overlap measured with Normalized Mutual Information.
Fig 2
Fig 2. A proof-of-concept full panel imputation: 12 stains were randomly selected to create a reduced panel, which was then used to train an ME-VAE to reconstruct the full panel of stains.
Here we show three representative stains from each of the included and withheld marker sets across 8 cells. Real and predicted staining and their compilation images are shown side by side to qualitatively demonstrate that a reduced panel can reconstruct relevant unseen information.
Fig 3
Fig 3. All panel selection methods were evaluated across a range of panel sizes to determine how well their reduced panels can be used to reconstruct the full panel.
Spearman correlation was measured for each stain independently and then averaged across the whole dataset. Variance in mean correlation accuracy was also calculated across markers for each panel size and method (n = 25). The data were split into withheld markers (left) and all markers (right) to illustrate each model’s generalizability and performance in both domains. 1-to-1 substitutions of marker intensity were used as the baseline, where markers withheld from the reduced panel set were simply assigned the intensity of their closest match, as described in the Methods section.
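The evaluation described in the Fig 3 caption can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function names are hypothetical, and using Spearman correlation to pick each withheld marker's "closest match" is an assumption, since the matching rule is only described in the Methods section.

```python
import numpy as np

def rank(values):
    """Average ranks (tied values share their mean rank)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(rank(a), rank(b))[0, 1])

def substitution_baseline(intensities, kept):
    """Assign each withheld marker the intensity of its best-correlated kept marker.

    intensities: (n_cells, n_markers) single-cell mean intensities.
    kept: list of marker indices in the reduced panel.
    """
    predicted = intensities.copy()
    for m in range(intensities.shape[1]):
        if m in kept:
            continue
        scores = [spearman(intensities[:, m], intensities[:, k]) for k in kept]
        predicted[:, m] = intensities[:, kept[int(np.argmax(scores))]]
    return predicted

def mean_spearman(truth, predicted, markers):
    """Per-marker Spearman correlation, averaged over the given markers."""
    return float(np.mean([spearman(truth[:, m], predicted[:, m]) for m in markers]))

# Synthetic stand-in data: markers 2 and 3 are redundant with markers 0 and 1.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
truth = np.column_stack([
    base[:, 0],
    base[:, 1],
    base[:, 0] + 0.1 * rng.normal(size=200),  # noisy copy of marker 0
    base[:, 1] ** 3,                          # monotone transform of marker 1
])
kept = [0, 1]
predicted = substitution_baseline(truth, kept)
withheld_score = mean_spearman(truth, predicted, [2, 3])
```

Because Spearman correlation depends only on ranks, the monotone transform in marker 3 is matched to marker 1 without loss, which is one reason a rank-based metric suits intensity data with nonlinear staining responses.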
Fig 4
Fig 4. The full panels of six different breast cancer subtypes and normal were predicted using the highest performing reduced panel (correlation-based selection with 18 markers).
Spearman correlations were calculated between the full panel expressions and the expressions of the predicted markers for the included markers and excluded markers separately. The expression correlation plots for the best and worst predicted markers are shown for each subtype.
Fig 5
Fig 5. A few of the lowest-scoring and highest-scoring markers were selected to directly compare the Spearman correlations across all the breast cancer subtypes.
True and predicted expressions were compared for each marker and subtype individually.
Fig 6
Fig 6. Clustering was performed on the full panel intensities to generate ground truth cell type clusters using k-means (k = 10, chosen with the elbow method on the silhouette score) and PhenoGraph [22] (with nearest neighbors set to 500 and minimum cluster size set to 2000).
The reconstructed intensities from the random, correlation-based, gradient-based, and subspace-based selection methods were also clustered with k-means and PhenoGraph using the same parameters. Clustering similarity to the ground truth was measured using normalized mutual information (NMI). A baseline NMI for comparison was generated using randomly shuffled cluster labels. The clusters were projected into a UMAP [23] embedding and plotted to show the cluster results visually. For k-means clustering, cluster colors were matched across all selection methods by pairing each cluster’s cell composition with the full panel. This was not done for PhenoGraph because each method produced a different number of clusters, preventing 1-to-1 pairing. Outliers from PhenoGraph clustering are shown in grey.
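As a rough illustration of the NMI comparison and the shuffled-label baseline described in the Fig 6 caption, the sketch below computes NMI directly with NumPy. The normalization (arithmetic mean of the two label entropies) is an assumption, because the caption does not say which variant was used, and the labels are synthetic stand-ins for real cluster assignments.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in nats) of a label vector."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def mutual_information(a, b):
    """Mutual information (in nats) between two label vectors of equal length."""
    a_ids = np.unique(np.asarray(a), return_inverse=True)[1]
    b_ids = np.unique(np.asarray(b), return_inverse=True)[1]
    joint = np.zeros((a_ids.max() + 1, b_ids.max() + 1))
    np.add.at(joint, (a_ids, b_ids), 1)          # contingency table of counts
    joint /= joint.sum()
    outer = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nonzero = joint > 0
    return float((joint[nonzero] * np.log(joint[nonzero] / outer[nonzero])).sum())

def nmi(a, b):
    """MI normalized by the arithmetic mean of the entropies (assumed variant)."""
    denom = 0.5 * (entropy(a) + entropy(b))
    return mutual_information(a, b) / denom if denom > 0 else 1.0

# Synthetic stand-ins for cluster labels over 5000 cells and 10 clusters.
rng = np.random.default_rng(0)
truth_labels = rng.integers(0, 10, size=5000)
relabeled = (truth_labels + 3) % 10              # same partition, different cluster ids
shuffled = rng.permutation(truth_labels)         # baseline: labels shuffled across cells

nmi_same_partition = nmi(truth_labels, relabeled)
nmi_baseline = nmi(truth_labels, shuffled)
```

NMI is invariant to how clusters are numbered, so it can compare a reduced-panel clustering with the ground truth even when cluster ids differ or, as with PhenoGraph, the number of clusters differs.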
Fig 7
Fig 7. Heatmap of mean marker intensity correlations in the full TMA panel set, computed across single cell images.
The heatmap is ordered by hierarchical clustering of rows and columns. Clusters of highly correlated markers show where markers can potentially predict one another and thus can be reduced. Markers with no good correlates will likely need to be included in a reduced panel, as no other marker is predictive of their expression (using intensity information alone). Baseline 1-to-1 substitution uses these correlations to determine marker pairs for intensity substitution. Correlation-based selection combinatorially creates and tests all possible panels of size n to determine which reduced panel produces the maximum correlation to all withheld markers.
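The exhaustive search described in the Fig 7 caption might look like the sketch below. The scoring rule is an assumption: each withheld marker is scored by its strongest absolute correlation with any kept marker, and a panel is scored by the mean over withheld markers. The caption does not specify how "max correlation to all withheld markers" is aggregated, and `select_panel` is a hypothetical name.

```python
from itertools import combinations
import numpy as np

def select_panel(corr, n):
    """Test every panel of size n and return the best one with its score.

    corr: (n_markers, n_markers) marker-by-marker correlation matrix.
    """
    n_markers = corr.shape[0]
    if not 0 < n < n_markers:
        raise ValueError("n must leave at least one marker withheld")
    strength = np.abs(corr)
    best_score, best_panel = -np.inf, None
    for panel in combinations(range(n_markers), n):
        withheld = [m for m in range(n_markers) if m not in panel]
        # Best available correlate in the panel for each withheld marker.
        score = strength[np.ix_(withheld, panel)].max(axis=1).mean()
        if score > best_score:
            best_score, best_panel = score, panel
    return best_panel, float(best_score)

# Synthetic stand-in: 6 markers forming 3 highly correlated pairs.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
intensities = np.column_stack([
    base[:, i // 2] + (0.1 * rng.normal(size=500) if i % 2 else 0.0)
    for i in range(6)
])
corr = np.corrcoef(intensities, rowvar=False)
panel, score = select_panel(corr, 3)
```

For the panel sizes in the abstract, choosing 18 of 25 markers means scoring 480,700 candidate panels, which is feasible to enumerate; the count grows quickly for larger panels, so a greedy search would be a natural substitute there.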
Fig 8
Fig 8. Diagram demonstrating the trained coefficient matrix and the resultant interaction map used to select a reduced panel.
A model is trained to optimize the Coefficient Matrix (C) with a forced zero diagonal, such that it is sparse and, when multiplied by the intensity vector of each single cell (I), reconstructs I as closely as possible. The resultant interaction map is the trained weights of C, showing the interactions of each marker necessary to adequately reconstruct each other marker in an image. Some markers can be reconstructed from only one other marker, others require a more complex combination, and some are not well predicted by any.
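One way to obtain a sparse, zero-diagonal coefficient matrix like the one in the Fig 8 caption is L1-regularized least squares solved by proximal gradient descent (ISTA). The sketch below is an assumption about the objective and optimizer, not the authors' training procedure; `fit_sparse_coefficients`, `lam`, and `steps` are hypothetical names, and the data are synthetic.

```python
import numpy as np

def fit_sparse_coefficients(I, lam=0.1, steps=500):
    """Minimize 0.5/n * ||I - I C||_F^2 + lam * ||C||_1 subject to diag(C) = 0.

    I: (n_cells, n_markers) standardized single-cell intensities.
    Returns C, where C[i, j] is the weight of marker i in reconstructing marker j.
    """
    n_cells, n_markers = I.shape
    gram = I.T @ I / n_cells
    step = 1.0 / np.linalg.norm(gram, 2)       # 1 / Lipschitz constant of the smooth term
    C = np.zeros((n_markers, n_markers))
    for _ in range(steps):
        grad = gram @ C - gram                 # gradient of the squared-error term
        C = C - step * grad
        C = np.sign(C) * np.maximum(np.abs(C) - step * lam, 0.0)  # soft threshold (L1)
        np.fill_diagonal(C, 0.0)               # a marker may not reconstruct itself
    return C

# Synthetic stand-in: markers 0 and 1 are redundant, marker 2 is independent.
rng = np.random.default_rng(0)
b = rng.normal(size=(2000, 2))
raw = np.column_stack([b[:, 0], b[:, 0] + 0.1 * rng.normal(size=2000), b[:, 1]])
I = (raw - raw.mean(axis=0)) / raw.std(axis=0)

C = fit_sparse_coefficients(I)
# Row-wise L1 norm: how much each marker contributes to reconstructing the others.
contribution = np.abs(C).sum(axis=1)
```

In this toy case the interaction map links markers 0 and 1 to each other, while marker 2 neither predicts nor is predicted by the others, mirroring the caption's point that some markers have no usable correlate. Ranking markers by `contribution` is only one plausible way to turn C into a panel.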
Fig 9
Fig 9. A multi encoder variational autoencoder architecture is implemented with each channel being used as the input to parallel encoders.
The encodings of each channel are concatenated and decoded into a full panel image. The gradients of the model are then backpropagated to the encoding layer. If the magnitude of the gradient is interpreted as importance, the channel gradients can be averaged across the dataset to determine which markers are most important for reconstructing image features within the model.
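The gradient-averaging idea in the Fig 9 caption can be illustrated with a deliberately tiny stand-in for the ME-VAE. Everything here is an assumption made for illustration: each channel has a single scalar encoding, the decoder is a fixed `tanh` layer followed by a linear map, the quantity differentiated is the summed decoder output, and the gradient is derived by hand rather than by backpropagation through a trained network.

```python
import numpy as np

def decode(z, D):
    """Toy decoder: concatenated channel encodings -> reconstructed features."""
    return np.tanh(z) @ D

def encoding_gradients(z, D):
    """Gradient of the summed decoder output w.r.t. each channel encoding.

    z: (n_cells, n_channels), one scalar encoding per channel.
    D: (n_channels, n_features) decoder weights.
    d/dz_c of sum_f decode(z)[f] = (1 - tanh(z_c)^2) * sum_f D[c, f]
    """
    return (1.0 - np.tanh(z) ** 2) * D.sum(axis=1)

def channel_importance(z, D):
    """Gradient magnitude per channel, averaged across the dataset."""
    return np.abs(encoding_gradients(z, D)).mean(axis=0)

# Synthetic stand-in: channel 0 drives the output strongly, channel 2 not at all.
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 3))
D = np.array([
    [2.0, 1.5, 1.0, 2.5],
    [0.3, 0.2, 0.1, 0.4],
    [0.0, 0.0, 0.0, 0.0],
])
importance = channel_importance(z, D)
ranking = np.argsort(importance)[::-1]          # most important channel first
```

With a real model, the same averaging would be applied to gradients obtained from an automatic differentiation framework, and the highest-ranked channels would be the candidates to keep in the reduced panel.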

References

    1. Tsujikawa T., Kumar S., Borkar R. N., Azimi V., Thibault G., Chang Y. H., et al. (2017). Quantitative multiplex immunohistochemistry reveals myeloid-inflamed tumor-immune complexity associated with poor prognosis. Cell reports, 19(1), 203–217. doi: 10.1016/j.celrep.2017.03.037
    2. Johnson B. E., Creason A. L., Stommel J. M., Keck J. M., Parmar S., Betts C. B., et al. (2022). An omic and multidimensional spatial atlas from serial biopsies of an evolving metastatic breast cancer. Cell Reports Medicine, 3(2), 100525. doi: 10.1016/j.xcrm.2022.100525
    3. Burlingame E. A., Eng J., Thibault G., Chin K., Gray J. W., & Chang Y. H. (2021). Toward reproducible, scalable, and robust data analysis across multiplex tissue imaging platforms. Cell reports methods, 1(4), 100053. doi: 10.1016/j.crmeth.2021.100053
    4. Labrie M., Li A., Creason A., Betts C., Keck J., Johnson B., et al. (2021). Multiomics analysis of serial PARP inhibitor treated metastatic TNBC inform on rational combination therapies. NPJ precision oncology, 5(1), 1–11.
    5. Angelo M., Bendall S. C., Finck R., Hale M. B., Hitzman C., Borowsky A. D., et al. (2014). Multiplexed ion beam imaging of human breast tumors. Nature medicine, 20(4), 436–442. doi: 10.1038/nm.3488
