Sci Rep. 2024 Aug 6;14(1):18184.
doi: 10.1038/s41598-024-68918-2.

Convolutional neural network transformer (CNNT) for fluorescence microscopy image denoising with improved generalization and fast adaptation


Azaan Rehman et al. Sci Rep.

Abstract

Deep neural networks can improve the quality of fluorescence microscopy images. Previous methods, based on Convolutional Neural Networks (CNNs), require time-consuming training of individual models for each experiment, impairing their applicability and generalization. In this study, we propose a novel imaging-transformer-based model, the Convolutional Neural Network Transformer (CNNT), that outperforms CNN-based networks for image denoising. We train a general CNNT backbone model from pairwise high-low Signal-to-Noise Ratio (SNR) image volumes gathered from a single type of fluorescence microscope, an instant Structured Illumination Microscope. Fast adaptation to new microscopes is achieved by fine-tuning the backbone on only 5-10 image volume pairs per new experiment. Results show that the CNNT backbone and fine-tuning scheme significantly reduce training time and improve image quality, outperforming models trained using only CNNs, such as 3D-RCAN and Noise2Fast. We show three examples of the efficacy of this approach in wide-field, two-photon, and confocal fluorescence microscopy.


Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Backbone and fine-tuning to train the light microscopy image enhancement model. (A) Previous methods generally train a separate model for every sample or microscopy type. Such train-from-scratch methods are effective but need many samples and extended training time to reach optimal performance. Furthermore, since every training set is independent, the model cannot draw on other samples or microscopy types to help the current imaging experiment. (B) Here we first train a backbone model from large, diverse, previously curated data. The trained backbone model is then fine-tuned for every new experiment, using a much smaller amount of new data. Given an effective backbone architecture, this approach trains much faster and allows reusing information acquired in previous experiments. Inspired by the success of transformer models in language pre-training, we propose a novel imaging transformer architecture, CNNT, to serve as this effective backbone.
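The pre-train-then-fine-tune workflow above can be sketched with a toy linear denoiser. This is purely illustrative: the model, shapes, noise statistics, and training loop are assumptions for the sketch, not the paper's CNNT implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(W, pairs, lr, epochs):
    """Gradient descent on mean-squared error for a linear denoiser y ~= x @ W."""
    for _ in range(epochs):
        for x, y in pairs:
            grad = x.T @ (x @ W - y) / len(x)
            W -= lr * grad
    return W

def mse(W, pairs):
    return float(np.mean([np.mean((x @ W - y) ** 2) for x, y in pairs]))

d = 8
W_true = np.eye(d) * 0.9

def make_pairs(n_pairs, noise_scale):
    """Simulated low/high-SNR volume pairs: noisy input, clean target."""
    pairs = []
    for _ in range(n_pairs):
        clean = rng.normal(size=(32, d))
        noisy = clean @ np.linalg.inv(W_true) + rng.normal(scale=noise_scale, size=(32, d))
        pairs.append((noisy, clean))
    return pairs

# Stage 1: pre-train a backbone on a large, diverse set of pairs.
backbone = train(np.zeros((d, d)), make_pairs(200, 0.1), lr=0.05, epochs=3)

# Stage 2: fine-tune on only ~5 pairs from a new "microscope" with
# shifted noise statistics -- far less data than training from scratch.
small_set = make_pairs(5, 0.2)
finetuned = train(backbone.copy(), small_set, lr=0.05, epochs=10)

print(mse(finetuned, small_set))
```

The point of the sketch is the data economy: the fine-tuning stage reuses what the backbone learned, so a handful of new pairs suffices.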
Figure 2
The CNNT U-net architecture. (A) The whole model consists of pre- and post-convolution layers and the backbone. The input tensor has the size [B, Z, C, H, W] for batch, depth, channel, height, and width. The C input channels are first expanded to 32 channels entering the backbone; the post-conv layer converts the backbone output back to C channels. A long-term skip connection runs over the backbone. (B) The backbone has a U-net structure, consisting of two downsample blocks and two upsample blocks. Every downsample CNNT block doubles the number of channels and reduces the spatial size by a factor of two; every upsample block reduces the number of channels and expands the spatial size by a factor of two. (C) The CNNT block includes only CNNT cells. Every cell contains CNN attention, instance norm, and a CNN mixer. This design mimics the standard transformer cell but replaces the linear attention and mixers with CNN attention and CNN mixers, reducing computational cost for high-resolution images. (D) The CNN attention is the key part of the imaging transformer cell. Unlike the linear layers in the standard transformer, the key, value, and query tensors are computed with convolution layers, which reduces the computational cost of processing high-resolution images while maintaining a good inductive bias. The attention coefficients are computed between query and key and applied to the value tensor to compute the attention outputs.
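The CNN attention in panel (D) replaces the transformer's linear Q/K/V projections with convolutions. A minimal numpy sketch, with the simplifying assumptions that the convolutions are 1x1 kernels (pure channel mixing), attention runs over the Z (frame) axis, and multi-head structure and normalization are omitted:

```python
import numpy as np

rng = np.random.default_rng(1)

def conv1x1(x, w):
    # x: [Z, C, H, W] image stack; w: [C_out, C_in] 1x1 convolution kernel.
    # A 1x1 conv is a per-pixel channel mixing, i.e. an einsum over channels.
    return np.einsum("oc,zchw->zohw", w, x)

def cnn_attention(x, wq, wk, wv):
    """Attention over the Z axis with conv-computed query, key, and value.

    Each frame attends to every frame; the attention weight between two
    frames is the scaled inner product of their conv features.
    """
    Z, C, H, W = x.shape
    q = conv1x1(x, wq).reshape(Z, -1)        # [Z, C*H*W]
    k = conv1x1(x, wk).reshape(Z, -1)
    v = conv1x1(x, wv).reshape(Z, -1)
    scores = q @ k.T / np.sqrt(q.shape[1])   # [Z, Z] frame-to-frame affinity
    scores -= scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over frames
    out = attn @ v                           # weighted sum of value frames
    return out.reshape(Z, C, H, W)

x = rng.normal(size=(4, 8, 16, 16))          # [Z, C, H, W]
wq, wk, wv = (rng.normal(scale=0.1, size=(8, 8)) for _ in range(3))
y = cnn_attention(x, wq, wk, wv)
print(y.shape)
```

Swapping the linear projections for convolutions keeps the attention mechanism but scales to high-resolution frames, which is the design choice the caption describes.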
Figure 3
Widefield microscopy experiment, imaging MEF cells. The pre-trained CNNT backbone was separately fine-tuned on 5 and on 10 widefield image samples. The resulting models were compared to 3D-RCAN and Noise2Fast for image quality and computing time. (A) The low-quality noisy image used as input to the models. (B, C) The CNNT results after fine-tuning for 30 epochs on 5 and 10 samples; the quality improvement is noticeable. (D) The 3D-RCAN model trained from scratch for 300 epochs gave good improvement. (E) The Noise2Fast result is subpar. (F) The high-quality ground truth, used for SSIM3D and PSNR computation and for reference. (G-L) Zoomed-in versions of the highlighted parts in (A-F), respectively. The CNNT fine-tuning is much faster than 3D-RCAN and Noise2Fast training and offers better quality measurements.
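The PSNR used for these comparisons measures how closely a restored image matches the high-SNR ground truth. A minimal numpy version, assuming the images share a known peak value:

```python
import numpy as np

def psnr(pred, target, peak=None):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if peak is None:
        peak = target.max()  # assume the target spans the full dynamic range
    mse = np.mean((pred - target) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

clean = np.linspace(0.0, 1.0, 64).reshape(8, 8)
noisy = clean + 0.05           # constant 0.05 offset -> MSE of 0.0025
print(round(psnr(noisy, clean), 2))  # 26.02
```

SSIM3D, the other metric quoted, additionally compares local structure across the volume; library implementations (e.g. in image-processing toolkits) are the practical choice there.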
Figure 4
Two-photon microscopy experiment, imaging the pancreas of a zebrafish. (A) The low-quality image does not provide enough SNR and contrast to delineate the structural features of the pancreas. (B, C, D) The CNNT greatly improved the image quality when using 5, 10, and 20 training samples. The model is robust even with 5 samples, leading to a very fast ~3.5 min fine-tuning time. (E, F) The 3D-RCAN and Noise2Fast training times are much longer, with suboptimal quality recovery. (G) The ground truth in this experiment still has relatively low SNR. (H-N) Zoomed-in versions of the highlighted parts in (A-G), respectively. The CNNT models achieved better quality than the ground-truth images, which could be a result of pre-training.
Figure 5
Multi-average tests for the zebrafish imaging with repeated acquisition. (A) The imaging was repeated for N = 64 repetitions to image the zebrafish liver and pancreas. Averaging the first n images creates the image Avg n, giving a series of images with gradually increasing SNR from Avg 1 to Avg 64. CNNT models were tested for robustness against different levels of input quality with an increasing number of averages. Results for the zebrafish liver are shown here. The predicted result for Avg 1 (the lowest-quality input) shows residual noise, indicating the model "breaks" at this input SNR. Starting from Avg 2, the model gives consistently good-quality outputs. (B) The pancreas results are shown for Avg 1, 4, 8, 16, 32, and 64. In this case, the model was robust against the lower input SNR and recovered finer features.
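The Avg n series is a running average of the first n repetitions; with independent zero-mean noise, residual noise power falls roughly as 1/n, so SNR grows about as sqrt(n). A small sketch with simulated acquisitions (the scene and noise model are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate N = 64 repeated acquisitions of the same scene: a fixed signal
# plus independent zero-mean noise in each repetition.
N, H, W = 64, 32, 32
signal = rng.normal(size=(H, W))
stack = signal + rng.normal(scale=1.0, size=(N, H, W))

# Avg n = mean of the first n frames, computed for all n at once via cumsum.
cum = np.cumsum(stack, axis=0)
avg = cum / np.arange(1, N + 1)[:, None, None]   # avg[n-1] is "Avg n"

# Residual noise power shrinks roughly as 1/n.
err = ((avg - signal) ** 2).mean(axis=(1, 2))
print(err[0], err[63])  # Avg 64 is far cleaner than Avg 1
```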
Figure 6
Confocal microscopy, imaging of mouse lung tissue. (A) The low-quality image was acquired with very low photon counts. (B, C, D) CNNT fine-tuning with 5, 10, and 20 samples shows recovered tissue structures and removal of random background noise. (E) The 3D-RCAN model also gave good improvement in quality. (F) The Noise2Fast result had more signal fluctuation compared to the supervised models. (G) The high-quality ground-truth acquisition reveals the tissue's anatomical structure. (H-N) Zoomed-in versions of the highlighted parts in (A-G), respectively. Again, the time saving of CNNT fine-tuning is prominent, with superior or similar image quality.
