Histochem Cell Biol. 2023 Sep;160(3):223-251.
doi: 10.1007/s00418-023-02209-1. Epub 2023 Jul 10.

OME-Zarr: a cloud-optimized bioimaging file format with international community support

Josh Moore, Daniela Basurto-Lozada, Sébastien Besson, John Bogovic, Jordão Bragantini, Eva M Brown, Jean-Marie Burel, Xavier Casas Moreno, Gustavo de Medeiros, Erin E Diel, David Gault, Satrajit S Ghosh, Ilan Gold, Yaroslav O Halchenko, Matthew Hartley, Dave Horsfall, Mark S Keller, Mark Kittisopikul, Gabor Kovacs, Aybüke Küpcü Yoldaş, Koji Kyoda, Albane le Tournoulx de la Villegeorges, Tong Li, Prisca Liberali, Dominik Lindner, Melissa Linkert, Joel Lüthi, Jeremy Maitin-Shepard, Trevor Manz, Luca Marconato, Matthew McCormick, Merlin Lange, Khaled Mohamed, William Moore, Nils Norlin, Wei Ouyang, Bugra Özdemir, Giovanni Palla, Constantin Pape, Lucas Pelkmans, Tobias Pietzsch, Stephan Preibisch, Martin Prete, Norman Rzepka, Sameeul Samee, Nicholas Schaub, Hythem Sidky, Ahmet Can Solak, David R Stirling, Jonathan Striebel, Christian Tischer, Daniel Toloudis, Isaac Virshup, Petr Walczysko, Alan M Watson, Erin Weisbart, Frances Wong, Kevin A Yamauchi, Omer Bayraktar, Beth A Cimini, Nils Gehlenborg, Muzlifah Haniffa, Nathan Hotaling, Shuichi Onami, Loic A Royer, Stephan Saalfeld, Oliver Stegle, Fabian J Theis, Jason R Swedlow

Abstract

A growing community is constructing a next-generation file format (NGFF) for bioimaging to overcome problems of scalability and heterogeneity. Organized by the Open Microscopy Environment (OME), individuals and institutes across diverse modalities facing these problems have designed a format specification process (OME-NGFF) to address these needs. This paper brings together a wide range of those community members to describe the cloud-optimized format itself, OME-Zarr, along with tools and data resources available today to increase FAIR access and remove barriers in the scientific process. The current momentum offers an opportunity to unify a key component of the bioimaging domain: the file format that underlies so many personal, institutional, and global data management and analysis tasks.

Keywords: Bioimaging; Cloud; Community; Data; FAIR; Format.

Conflict of interest statement

S.B., E.D., M.L., D.R.S. and J.R.S. are affiliated with Glencoe Software, a commercial company that builds, delivers, supports and integrates image data management systems across academic, biotech and pharmaceutical industries; J.M. and W.M. also hold equity in Glencoe Software. M.M. is affiliated with Kitware, Inc., a commercial company built around open-source platforms that provides advanced technical computing, state-of-the-art AI, and tailored software solutions to academic, government, and industrial customers. A.V., J.S. and N.R. are affiliated with Scalable Minds, a commercial company that builds, delivers, supports and integrates image analysis solutions. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd, Cellarity, and Omniscope Ltd, and has ownership interest in Dermagnostix GmbH and Cellarity. N.A.H. and N.J.S. are contractors who work for Axle Research and Technology. S.B.S. and H.S. are affiliated with Axle Research and Technology and are contracted to the National Center for Advancing Translational Science, NIH. The remaining authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported.

Figures

Fig. 1
A common format enables a diverse set of use cases via a consistent API. A wide range of modalities can be converted into a representation that can be accessed equally by a variety of tools. This format can be used to download entire datasets for local processing, to stream pyramidal sub-resolutions for interactive viewing, or to process entire resolutions in parallel. OME-Zarr data shown includes (a) idr0076 (Ali et al. 2020), (b) idr0101 (Payne et al. 2021), (c) idr0077 (Valuchova et al. 2020), (d) S-BIAD548 (Lim et al. 2023), (e) S-BIAD217 (de Boer et al. 2020), and (f) S-BIAD501 (Igarashi et al. 2015)
Fig. 2
By making use of an annotated hierarchy of arrays, OME-Zarr can represent complex relationships between images, capture the multiple resolutions of an image pyramid, and provide tunable chunk size and compression, all within a single abstraction layer that can be saved as a directory of files on disk or shared remotely. (a) Each level of nested directories provides a different level of abstraction: the top-level directory can represent an entire 1-terabyte plate with more than 100,000 pixels in the X and Y dimensions, while the lowest-level directory represents individual chunks of N-dimensional data as small as 1 megabyte. (b) In the example shown, a concatenation of low-resolution images produces a 2560 × 1822 pixel representation of the entire plate, followed by similar examples of how many pixels must be loaded by a client at each zoom level
Fig. 3
The original OME-Zarr viewers published in Moore et al. (from left to right: BigDataViewer, napari, and Vizarr), here loading a view of the same EM volume of a 6-day-old Platynereis larva from Vergara et al. (2020), available at https://s3.embl.de/i2k-2020/platy-raw.ome.zarr. These three applications provided broad coverage of the most common bioimaging platforms, such as Fiji and napari, and critically also included a web viewer that could stream data on the fly
Fig. 4
Advanced GPU Accelerated Volume Explorer (AGAVE) displaying a downsampled level from a multi-terabyte mouse brain OME-Zarr dataset. The number of pixels actually loaded is displayed at lower right. The full-resolution data is 47,310 × 20,344 × 18,471, which consumes about 33 TB. The ability to quickly access multiresolution data makes low-latency interactive visualization possible
Fig. 5
ITKWidgets 3D rendering of an OME-Zarr for IDR 0062A in Jupyter. Interactive features shown include volume rendering, slicing planes, and interactive widgets to adjust rendering parameters and slice plane indices
Fig. 6
Neuroglancer rendering the same OME-Zarr from IDR 0062A as Fig. 5
Fig. 7
OME-NGFF Validator validating the same image from IDR 0062A (left) and summarizing the size of the data alongside a quick visualization of a single plane (right)
Fig. 8
webknossos loading an EM volume of a 6-day-old Platynereis larva from Vergara et al. (2020) with a manually added segmentation. A web-accessible version is available at https://wklink.org/6422
Fig. 9
Allen Cell Volume Viewer displaying a multichannel fluorescence image of gene-edited hiPSC cells via a downsampled level of an OME-Zarr converted from the dataset found at https://cfe.allencell.org/?dataset=variance_v1
Fig. 10
Example application of BigStitcher’s interest-point-based registration on one of the first “exaSPIM” lightsheet microscope datasets acquired at the Allen Institute for Neural Dynamics (sample 609,281, available at the link in Table 2). The overlapping regions of two tiles are shown in BDV at their nominal (left) and aligned (right) locations. This large-scale NGFF dataset consists of 54 tiles, each with dimensions of 24,576 × 10,656 × 2048 voxels (about 1 TB raw size)
Fig. 11
Visualization of a MERFISH mouse brain dataset [Allen Institute prototype MERFISH pipeline (Long et al. 2023)] via the napari-spatialdata plugin, featuring single-molecule transcripts (points) and their rasterized representation (image), polygonal ROIs, and annotated cells approximated as circles with variable radii. The dataset has been converted to OME-Zarr with the SpatialData APIs
Fig. 12
NGFF-Converter GUI showing a sample of input formats being converted to OME-Zarr
Fig. 13
An OME-Zarr dataset on DANDI (ID: DANDI:000108) contains multiscale 5D datasets (time, channel, z, y, x), with metadata, scale, and transformations. (a) DANDI:000108 includes multi-slab, multi-stain (NeuropeptideY-NPY, Calretinin-CR, YOYO1) data that are aligned, with the coordinate transforms stored in the Zarr files, allowing (b) on-the-fly visualization and stitching of the slabs at multiple scales (two shown) using Neuroglancer. (c) An HTML dashboard allows the data submitter and any user on the Web to see the live status: samples and stains uploaded, their quality issues, and multiscale viewing of this 300+ TB dataset
Fig. 14
(a) The ‘Imaging’ tab of the zebrahub.org website, showing the list of imaging datasets made available. (b) By clicking on a particular dataset, the user is directed to a Neuroglancer instance that allows interactive exploration of the dataset
Fig. 15
Multiscale OME-Zarr representation of a whole-brain dataset (https://download.brainimagelibrary.org/2b/da/2bdaf9e66a246844/mouseID_405429-182725/; ~6 TB compressed) archived at BIL and visualized over the internet in napari via the napari-ome-zarr plugin. Data is stored at BIL in an alternative format and is dynamically converted to OME-Zarr chunk by chunk and delivered to clients upon request
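
The pyramid and chunking described in Fig. 2, and the sizes quoted in Figs. 4 and 10, can be sanity-checked with a short calculation. The following Python sketch is illustrative only and is not taken from the paper: it assumes 16-bit (2-byte) voxels, a downsampling factor of 2 along every axis per pyramid level, and a hypothetical chunk shape of 64 × 64 × 128 voxels (1 MiB at 2 bytes per voxel). None of these values are stated in the captions, and the function names are made up for this example.

```python
from math import ceil, prod


def pyramid_shapes(shape, levels, factor=2):
    """Shape of each resolution level, shrinking every axis by `factor` per level.

    The factor of 2 is an assumption for illustration, not a format requirement.
    """
    shapes = [tuple(shape)]
    for _ in range(levels - 1):
        shapes.append(tuple(max(1, ceil(n / factor)) for n in shapes[-1]))
    return shapes


def chunk_count(shape, chunks):
    """Number of chunks needed to cover an array of `shape` (edge chunks included)."""
    return prod(ceil(n / c) for n, c in zip(shape, chunks))


def raw_bytes(shape, bytes_per_voxel=2):
    """Uncompressed size in bytes; 2 bytes per voxel (16-bit) is an assumption."""
    return prod(shape) * bytes_per_voxel


if __name__ == "__main__":
    full_resolution = (47310, 20344, 18471)  # dimensions quoted in Fig. 4
    chunks = (64, 64, 128)                   # hypothetical chunk shape (1 MiB at 16-bit)

    for level, shape in enumerate(pyramid_shapes(full_resolution, levels=6)):
        size_tib = raw_bytes(shape) / 2**40
        print(f"level {level}: shape={shape}, "
              f"chunks={chunk_count(shape, chunks):,}, raw={size_tib:.3f} TiB")
```

Under these assumptions the full-resolution level comes to a little over 32 TiB, in line with the "about 33 TB" quoted in Fig. 4, and `raw_bytes((24576, 10656, 2048))` gives roughly 1.07 × 10^12 bytes, in line with the "about 1 TB raw size" per tile in Fig. 10. Each coarser level holds about one eighth of the voxels of the level above it, which is why a viewer can load a low-resolution overview quickly.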

Update of

  • OME-Zarr: a cloud-optimized bioimaging file format with international community support.
    Moore J, et al. bioRxiv [Preprint]. 2023 May 7:2023.02.17.528834. doi: 10.1101/2023.02.17.528834. PMID: 36865282.

References

    1. Ali HR, Jackson HW, Zanotelli VRT, et al. Imaging mass cytometry and multiplatform genomics define the phenogenomic landscape of breast cancer. Nat Cancer. 2020;1:163–175. doi: 10.1038/s43018-020-0026-6.
    2. Allan C, Burel J-M, Moore J, et al. OMERO: flexible, model-driven data management for experimental biology. Nat Methods. 2012;9:245–253. doi: 10.1038/nmeth.1896.
    3. Alted F. Why modern CPUs are starving and what can be done about it. Comput Sci Eng. 2010;12:68–71. doi: 10.1109/MCSE.2010.51.
    4. Bahry E, Breimann L, Zouinkhi M, et al. RS-FISH: precise, interactive, fast, and scalable FISH spot detection. Nat Methods. 2022;19:1563–1567. doi: 10.1038/s41592-022-01669-y.
    5. Berman HM, Kleywegt GJ, Nakamura H, Markley JL. The Protein Data Bank at 40: reflecting on the past to prepare for the future. Structure. 2012;20:391–396. doi: 10.1016/j.str.2012.01.010.
