Nat Methods. 2021 Dec;18(12):1496-1498.
doi: 10.1038/s41592-021-01326-w. Epub 2021 Nov 29.

OME-NGFF: a next-generation file format for expanding bioimaging data-access strategies

Josh Moore et al. Nat Methods. 2021 Dec.

Abstract

The rapid pace of innovation in biological imaging and the diversity of its applications have prevented the establishment of a community-agreed standardized data format. We propose that complementing established open formats such as OME-TIFF and HDF5 with a next-generation file format such as Zarr will satisfy the majority of use cases in bioimaging. Critically, a common metadata format used in all these vessels can deliver truly findable, accessible, interoperable and reusable bioimaging data.

Conflict of interest statement

C.A., E.D., K.K., M.L. and J.R.S. are affiliated with Glencoe Software, a commercial company that builds, delivers, supports and integrates image data management systems across academic, biotech and pharmaceutical industries. The remaining authors declare no competing interests.

Figures

Fig. 1. Chunk retrieval time is less sensitive to data location with next-generation file formats.
a,b, Random sampling of 100 chunks from synthetically generated, five-dimensional images measures access times for three different formats on the same file system (green), over HTTP using the nginx web server (orange) and using Amazon’s proprietary S3 object storage protocol (blue) under two scenarios: a whole-slide CycIF imaging dataset with many large planes of data (x = 64,000, y = 64,000, c = 8) and chunks of 256 × 256 pixels (128 KB) (a); and a time-lapse LSM dataset with isotropic dimensions (x = 1,024, y = 1,024, z = 1,024, t = 100) and chunks of 32 × 32 × 32 pixels (64 KB) (b) (Methods).
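The sampling procedure described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' benchmark code: the axis order (c, y, x), the function names, the fixed seed and the stand-in reader `fake_read` are all assumptions made for the example, and sampling is done with replacement. A real measurement would swap `fake_read` for a reader backed by TIFF, HDF5 or Zarr on the storage being tested.

```python
import math
import random
import time


def chunk_grid(shape, chunks):
    """Number of chunks along each axis (ceiling division)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


def sample_chunk_indices(shape, chunks, n=100, seed=0):
    """Draw n random chunk coordinates (with replacement) from the chunk grid."""
    rng = random.Random(seed)
    grid = chunk_grid(shape, chunks)
    return [tuple(rng.randrange(g) for g in grid) for _ in range(n)]


def time_chunk_reads(read_chunk, indices):
    """Call read_chunk(index) for each index and return seconds per call."""
    timings = []
    for idx in indices:
        start = time.perf_counter()
        read_chunk(idx)
        timings.append(time.perf_counter() - start)
    return timings


# Whole-slide CycIF-like case from the caption; axis order (c, y, x) is assumed.
cycif_shape = (8, 64_000, 64_000)
cycif_chunks = (1, 256, 256)
indices = sample_chunk_indices(cycif_shape, cycif_chunks, n=100)


def fake_read(idx):
    """Hypothetical stand-in for a storage-backed reader: returns 128 KB of zeros."""
    return bytes(256 * 256 * 2)


timings = time_chunk_reads(fake_read, indices)
print(f"median read time: {sorted(timings)[len(timings) // 2]:.6f} s")
```

Passing the reader in as a callable keeps the sampling and timing logic identical across formats and storage back ends, which is what makes the per-format numbers comparable.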
Extended Data Fig. 1. Maximizing re-use by allowing popular tools to access bioimaging data in the cloud.
An example of using NGFFs to promote the distribution of public image datasets. A selection of current tools streams different portions of the same SARS-CoV-2 virus image at various resolutions directly from S3 storage at the European Bioinformatics Institute (EBI). Original data from Lamers et al. is available in IDR, while the converted data is available on Zenodo (refs. 21–23).
Extended Data Fig. 2. Effect of Chunk Size on Chunk Number.
For each modality, the chunk size of the benchmark dataset was chosen as a compromise between the size of individual chunks and the total number of chunks in the Zarr dataset. The plots above show typical power-of-2 chunk sizes: between 32 and 1024 for the 2D data and between 16 and 128 for the 3D data. We chose a 2D chunk size of 256×256 for the CycIF-like dataset and a 3D chunk size of 32×32×32 for the LSM-like dataset. Note that due to the planar limitation of TIFF, the LSM dataset was stored as 2D TIFF tiles of size 32×32, but the benchmark looped over 32 tiles to measure the access time of the same chunk size.
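The trade-off between chunk size and chunk count can be reproduced with simple arithmetic. The sketch below assumes the image dimensions given in the Fig. 1 caption, an axis order of (c, y, x) and (t, z, y, x), and 2 bytes per pixel; the pixel size is an inference from the stated 128 KB and 64 KB chunk sizes, not something the captions state directly.

```python
import math

BYTES_PER_PIXEL = 2  # assumption: inferred from 256 x 256 pixels = 128 KB


def n_chunks(shape, chunks):
    """Total number of chunks needed to tile an array of the given shape."""
    return math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))


cycif_shape = (8, 64_000, 64_000)      # (c, y, x), assumed axis order
lsm_shape = (100, 1024, 1024, 1024)    # (t, z, y, x), assumed axis order

# Chunk counts for the power-of-2 chunk sizes named in the caption.
counts_2d = {s: n_chunks(cycif_shape, (1, s, s)) for s in (32, 64, 128, 256, 512, 1024)}
counts_3d = {s: n_chunks(lsm_shape, (1, s, s, s)) for s in (16, 32, 64, 128)}

for size, count in counts_2d.items():
    kib = size * size * BYTES_PER_PIXEL / 1024
    print(f"2D chunk {size:>4} px: {count:>10,} chunks of {kib:,.0f} KiB")
for size, count in counts_3d.items():
    kib = size ** 3 * BYTES_PER_PIXEL / 1024
    print(f"3D chunk {size:>4} px: {count:>10,} chunks of {kib:,.0f} KiB")
```

Under these assumptions the chosen sizes give 500,000 chunks for the CycIF-like dataset and 3,276,800 chunks for the LSM-like dataset; halving the chunk edge multiplies the chunk count by 4 in 2D and by 8 in 3D.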
Extended Data Fig. 3. Conversion tools provide an alternative to continual, on-the-fly translation of PFFs.
The figure shows workflows for file format access. (a) The classical approach to access images produced by an acquisition system is to use a library like Bio-Formats to translate the proprietary file format (PFF) and produce an in-memory copy of the imaging data on-the-fly. This translation needs to be repeated on every use. (b) With the existence of open, community-supported formats, converting PFFs becomes the most cost-efficient method for long-term storage and sharing of microscopy data. bioformats2raw and raw2ometiff (Supplementary Note) parallelize the creation of an open format, OME-TIFF, by using an intermediate NGFF format consisting of many individual files, each with one chunk of the original image data.
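The two-step conversion in panel (b) might be scripted as in the sketch below. It assumes both tools are installed and on the PATH, and that each takes an input path followed by an output path; the file names are hypothetical placeholders and the helper functions are introduced only for this illustration. The script only builds and prints the commands; `run_conversion` is defined but not called.

```python
import shlex
import subprocess


def build_conversion_commands(source, zarr_dir, ome_tiff):
    """Return the two command lines: PFF -> intermediate NGFF -> OME-TIFF."""
    to_ngff = ["bioformats2raw", source, zarr_dir]    # step 1: write chunked files
    to_tiff = ["raw2ometiff", zarr_dir, ome_tiff]     # step 2: assemble OME-TIFF
    return [to_ngff, to_tiff]


def run_conversion(source, zarr_dir, ome_tiff):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_conversion_commands(source, zarr_dir, ome_tiff):
        subprocess.run(cmd, check=True)


# Hypothetical file names, for illustration only.
commands = build_conversion_commands("input.pff", "intermediate.zarr", "output.ome.tiff")
for cmd in commands:
    print(shlex.join(cmd))
```

The conversion is paid once, after which every reader opens the open-format output directly instead of repeating the translation on each access.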
Extended Data Fig. 4. Unification of metadata specifications will allow interoperability between TIFF, HDF5, and Zarr.
Each proposed container (TIFF, Zarr, HDF5) can be used interchangeably to store pixel data, but trade-offs described in this manuscript can be used to determine what is the best target. TIFF is ideal for interoperability in digital pathology and other 2-dimensional domains since the format is widely accessible by established open source and proprietary software. In higher-dimensional domains, HDF5 and Zarr are better suited. HDF5 will likely be preferred for local access. If data is intended for sharing in the cloud, Zarr will likely be preferred. High throughput image analysis will benefit from the lower-latency access to data in HDF5 and Zarr. If original image data is paired with derived representations like pixel or object classification, a shared structure in HDF5 or Zarr is likely the best choice.
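The guidance in this caption can be condensed into a small decision helper. This is a deliberately simplified reading of the text, not a rule stated by the authors: the function name, its parameters and the precedence between the criteria are assumptions, and the caption itself hedges each recommendation with "likely".

```python
def suggest_container(planar, cloud_sharing=False, has_derived_data=False):
    """Suggest a container following one simplified reading of the caption.

    planar           -- True for 2-dimensional domains such as digital pathology
    cloud_sharing    -- True if the data is intended for sharing in the cloud
    has_derived_data -- True if pixel or object classifications are stored
                        alongside the original image data
    """
    # TIFF is described as ideal for 2D domains, unless derived
    # representations call for a shared structure in HDF5 or Zarr.
    if planar and not has_derived_data:
        return "TIFF"
    # Otherwise: Zarr for cloud sharing, HDF5 for local access.
    return "Zarr" if cloud_sharing else "HDF5"


print(suggest_container(planar=True))                          # TIFF
print(suggest_container(planar=False))                         # HDF5
print(suggest_container(planar=False, cloud_sharing=True))     # Zarr
print(suggest_container(planar=True, has_derived_data=True))   # HDF5
```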

Similar articles

  • OME-Zarr: a cloud-optimized bioimaging file format with international community support.
    Moore J, Basurto-Lozada D, Besson S, Bogovic J, Bragantini J, Brown EM, Burel JM, Casas Moreno X, de Medeiros G, Diel EE, Gault D, Ghosh SS, Gold I, Halchenko YO, Hartley M, Horsfall D, Keller MS, Kittisopikul M, Kovacs G, Küpcü Yoldaş A, Kyoda K, le Tournoulx de la Villegeorges A, Li T, Liberali P, Lindner D, Linkert M, Lüthi J, Maitin-Shepard J, Manz T, Marconato L, McCormick M, Lange M, Mohamed K, Moore W, Norlin N, Ouyang W, Özdemir B, Palla G, Pape C, Pelkmans L, Pietzsch T, Preibisch S, Prete M, Rzepka N, Samee S, Schaub N, Sidky H, Solak AC, Stirling DR, Striebel J, Tischer C, Toloudis D, Virshup I, Walczysko P, Watson AM, Weisbart E, Wong F, Yamauchi KA, Bayraktar O, Cimini BA, Gehlenborg N, Haniffa M, Hotaling N, Onami S, Royer LA, Saalfeld S, Stegle O, Theis FJ, Swedlow JR. Histochem Cell Biol. 2023 Sep;160(3):223-251. doi: 10.1007/s00418-023-02209-1. Epub 2023 Jul 10. PMID: 37428210. Free PMC article.
  • Toward scalable reuse of vEM data: OME-Zarr to the rescue.
    Rzepka N, Bogovic JA, Moore JA. Methods Cell Biol. 2023;177:359-387. doi: 10.1016/bs.mcb.2023.01.016. Epub 2023 Mar 9. PMID: 37451774.
  • Metadata matters: access to image data in the real world.
    Linkert M, Rueden CT, Allan C, Burel JM, Moore W, Patterson A, Loranger B, Moore J, Neves C, Macdonald D, Tarkowska A, Sticco C, Hill E, Rossner M, Eliceiri KW, Swedlow JR. J Cell Biol. 2010 May 31;189(5):777-82. doi: 10.1083/jcb.201004104. PMID: 20513764. Free PMC article.
  • Common file formats.
    Leonard SA, Littlejohn TG, Baxevanis AD. Curr Protoc Bioinformatics. 2007 Jan;Appendix 1:Appendix 1B. doi: 10.1002/0471250953.bia01bs16. PMID: 18428774. Review.
  • Interoperability with Moby 1.0--it's better than sharing your toothbrush!
    BioMoby Consortium; Wilkinson MD, Senger M, Kawas E, Bruskiewich R, Gouzy J, Noirot C, Bardou P, Ng A, Haase D, Saiz Ede A, Wang D, Gibbons F, Gordon PM, Sensen CW, Carrasco JM, Fernández JM, Shen L, Links M, Ng M, Opushneva N, Neerincx PB, Leunissen JA, Ernst R, Twigger S, Usadel B, Good B, Wong Y, Stein L, Crosby W, Karlsson J, Royo R, Párraga I, Ramírez S, Gelpi JL, Trelles O, Pisano DG, Jimenez N, Kerhornou A, Rosset R, Zamacola L, Tarraga J, Huerta-Cepas J, Carazo JM, Dopazo J, Guigo R, Navarro A, Orozco M, Valencia A, Claros MG, Pérez AJ, Aldana J, Rojano M, Fernandez-Santa Cruz R, Navas I, Schiltz G, Farmer A, Gessler D, Schoof H, Groscurth A. Brief Bioinform. 2008 May;9(3):220-31. doi: 10.1093/bib/bbn003. Epub 2008 Jan 31. PMID: 18238804. Review.

Cited by

  • SciJava Ops: an improved algorithms framework for Fiji and beyond.
    Selzer GJ, Rueden CT, Hiner MC, Evans EL, Kolb D, Wiedenmann M, Birkhold C, Buchholz TO, Helfrich S, Northan B, Walter A, Schindelin J, Pietzsch T, Saalfeld S, Berthold MR, Eliceiri KW. Front Bioinform. 2024 Sep 27;4:1435733. doi: 10.3389/fbinf.2024.1435733. eCollection 2024. PMID: 39399098. Free PMC article.
  • Arkitekt: streaming analysis and real-time workflows for microscopy.
    Roos J, Bancelin S, Delaire T, Wilhelmi A, Levet F, Engelhardt M, Viasnoff V, Galland R, Nägerl UV, Sibarita JB. Nat Methods. 2024 Oct;21(10):1884-1894. doi: 10.1038/s41592-024-02404-5. Epub 2024 Sep 18. PMID: 39294366.
  • Ultrack: pushing the limits of cell tracking across biological scales.
    Bragantini J, Theodoro I, Zhao X, Huijben TAPM, Hirata-Miyasaki E, VijayKumar S, Balasubramanian A, Lao T, Agrawal R, Xiao S, Lammerding J, Mehta S, Falcão AX, Jacobo A, Lange M, Royer LA. bioRxiv [Preprint]. 2024 Sep 3:2024.09.02.610652. doi: 10.1101/2024.09.02.610652. PMID: 39282368. Free PMC article. Preprint.
  • Navigate: an open-source platform for smart light-sheet microscopy.
    Marin Z, Wang X, Collison DW, McFadden C, Lin J, Borges HM, Chen B, Mehra D, Shen Q, Gałecki S, Daetwyler S, Sheppard SJ, Thien P, Porter BA, Conzen SD, Shepherd DP, Fiolka R, Dean KM. Nat Methods. 2024 Sep 11. doi: 10.1038/s41592-024-02413-4. Online ahead of print. PMID: 39261640. No abstract available.
  • BIOMERO: A scalable and extensible image analysis framework.
    Luik TT, Rosas-Bertolini R, Reits EAJ, Hoebe RA, Krawczyk PM. Patterns (N Y). 2024 Jul 18;5(8):101024. doi: 10.1016/j.patter.2024.101024. eCollection 2024 Aug 9. PMID: 39233696. Free PMC article.

References

    1. Wilkinson MD, et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data. 2016;3:160018. doi: 10.1038/sdata.2016.18.
    2. Ellenberg J, et al. A call for public archives for biological image data. Nat. Methods. 2018;15:849–854. doi: 10.1038/s41592-018-0195-8.
    3. Linkert M, et al. Metadata matters: access to image data in the real world. J. Cell Biol. 2010;189:777–782. doi: 10.1083/jcb.201004104.
    4. The HDF5 Library and File Format (The HDF Group, accessed 18 October 2021); https://www.hdfgroup.org/solutions/hdf5/
    5. Miles A, et al. zarr-developers/zarr-python: v.2.5.0. doi: 10.5281/zenodo.4069231 (2020).
