Support for multi-channel labels #19

Open
tischi opened this issue Nov 28, 2020 · 17 comments

Comments

@tischi

tischi commented Nov 28, 2020

@will-moore @joshmoore @constantinpape

What is the meaning of the channel dimension for the label images?

I could imagine:

  1. It must be a singleton dimension, where only channel 0 exists
  2. If the intensity image has multiple channels, each channel could have its own segmentation (label-image), and the channel dimension of the label image corresponds to the channel dimension of the intensity image

Is there already a spec for this?

@tischi
Author

tischi commented Nov 28, 2020

I think option (2) may not make much sense, because (i) segmentations could be obtained using all channels (e.g. in a machine learning setting) and (ii) if, e.g., only channels 0 and 5 have a segmentation, we would also have to store label images for all the channels in between.

@will-moore
Member

Not sure I know what you mean by "channel dimension for the label images" but it sounds like coming from OMERO, where each Shape can have C index (optional) if you wish to indicate which channel in the origin image it is associated with (e.g. segmented from). But I don't think it appears in the OME-Zarr spec (unless I've missed it)?

@tischi
Author

tischi commented Nov 28, 2020

@will-moore
According to the current spec, the label images are 5D (t,c,z,y,x), and thus have a channel dimension.
I think I found the spec here: https://ngff.openmicroscopy.org/latest/#citing

    └── labels
        │
        ├── .zgroup           # The labels group is a container which holds a list of labels to make the objects easily discoverable
        │
        ├── .zattrs           # All labels will be listed in `.zattrs` e.g. `{ "labels": [ "original/0" ] }`
        │                     # Each dimension of the label `(t, c, z, y, x)` should be either the same as the
        │                     # corresponding dimension of the image, or `1` if that dimension of the label
        │                     # is irrelevant.
        │

I think one could interpret this as suggesting that the channel dimension should be a singleton, but I think it could be clearer.
What do you think?
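
To make the quoted rule concrete, here is a minimal sketch of a shape check, assuming 5D `(t, c, z, y, x)` shapes. The function name `check_label_shape` and the example shapes are hypothetical and not part of the spec or any library. Note that both readings discussed above (singleton channel, or one label channel per image channel) pass this check, which is why the wording is ambiguous.

```python
from typing import List, Sequence

AXES = ("t", "c", "z", "y", "x")


def check_label_shape(image_shape: Sequence[int], label_shape: Sequence[int]) -> List[str]:
    """Return the axes on which the label shape breaks the quoted rule.

    Rule (as quoted from the spec): each label dimension is either equal to
    the corresponding image dimension, or 1.
    """
    if len(image_shape) != 5 or len(label_shape) != 5:
        raise ValueError("expected 5D (t, c, z, y, x) shapes")
    return [
        axis
        for axis, img, lab in zip(AXES, image_shape, label_shape)
        if lab != img and lab != 1
    ]


# Hypothetical example shapes, for illustration only.
image_shape = (10, 3, 50, 512, 512)
singleton_c = check_label_shape(image_shape, (10, 1, 50, 512, 512))
per_channel = check_label_shape(image_shape, (10, 3, 50, 512, 512))
mismatched_c = check_label_shape(image_shape, (10, 2, 50, 512, 512))
print(singleton_c, per_channel, mismatched_c)
```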

@will-moore
Member

Ah, yes sorry. I guess we 'lose' the channel dimension when we open in napari since each image channel is split into a separate 4D layer, and then the labels are another 4D layer.
I don't think we have any examples where we have labels with multi-C dimension. In napari, I don't think we'd have any way of 'linking' a labels layer (one channel of a label) with the corresponding channel of the image (another layer), except maybe by naming them in the same way.

@tischi
Author

tischi commented Nov 29, 2020

> In napari, I don't think we'd have any way of 'linking' a labels layer (one channel of a label) with the corresponding channel of the image (another layer), except maybe by naming them in the same way.

In BDV it is the same.

@thewtex
Contributor

thewtex commented May 4, 2021

Can overlapping labels be specified through multiple "channels"?

CC @lassoan

@joshmoore
Member

joshmoore commented May 5, 2021

I think this was largely an "implementation restriction" since napari was the only viewer currently handling OME-Zarr labels and it couldn't use the channel information. If everyone's on board, I think it makes sense to add support (or specify that labels are single channel only)

cc: @jni @tlambert03 @sofroniewn @manzt

Edit: I should clarify: that was the case before @tischi started implementing, which led to this issue.

@jni

jni commented May 17, 2021

Sorry for slow response. For napari it'll be some time before we handle overlapping labels, but it's been requested a couple of times before so I don't want us to be the blocking implementation here! It would make sense for ome-zarr to allow channels support, and the napari plugin can simply return a list of 4D labels layers. We currently scale poorly with many layers but it would "work", and we are always working on those scalability issues.
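
As a rough illustration of the "list of 4D labels layers" idea, the following sketch splits a 5D `(t, c, z, y, x)` label array along the channel axis using NumPy. The function name `split_label_channels` and the toy data are assumptions for illustration; this is not the napari plugin's actual code.

```python
import numpy as np


def split_label_channels(labels):
    """Split a (t, c, z, y, x) label array into one (t, z, y, x) array per channel.

    The returned arrays are views into the input, so no data is copied.
    """
    if labels.ndim != 5:
        raise ValueError("expected a 5D (t, c, z, y, x) array")
    return [labels[:, c] for c in range(labels.shape[1])]


# Toy data: two label channels whose objects overlap in y/x.
labels = np.zeros((2, 3, 4, 8, 8), dtype=np.uint32)
labels[:, 0, :, 2:5, 2:5] = 1
labels[:, 1, :, 4:7, 4:7] = 1

layers = split_label_channels(labels)
overlap = (layers[0] > 0) & (layers[1] > 0)
print(len(layers), layers[0].shape, bool(overlap.any()))
```

Each 4D array could then be handed to a viewer as its own labels layer; overlapping objects simply live in different layers.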

@lassoan

lassoan commented May 17, 2021

In 3D Slicer, each non-overlapping group of segments is stored in a 3D volume (we call this a "layer", I think it is referred to as "channel" above). If all segments are non-overlapping then the segmentation is a 3D volume, otherwise it is a 4D volume. We rarely encounter the need for a 5th dimension, but sometimes it comes up. I don't remember anyone asking for a 6th dimension in the past 10 years. So, specifying segmentation as up to 5D (t,c,z,y,x), sounds good.

Currently, we store the following metadata per segment:

  • channel index (index of the 3D volume within the 4D array, if it is a 4D array)
  • label value (label value within the 3D volume)
  • id (machine-readable identifier unique within the segmentation)
  • name (human-readable name) + auto-generated flag (if name is set automatically from a preset or a custom name entered by the user)
  • rgb color + auto-generated flag (if color is auto-generated from a preset or the user has specifically set it)
  • extent (xmin, xmax, ymin, ymax, zmin, zmax in voxel coordinates; to be able to quickly extract small segments from a large volume)
  • tags (key/value pairs): it is used for example to describe the content of the segment using standard terminologies (using 3 strings: coding scheme, code value, and code meaning; which allows lossless storage of segments imported from DICOM)

It would be great if we could standardize as many of the above fields as possible, but at least agree that we allow storing non-overlapping segments in one channel and allow storing multiple channels (and define metadata fields for specifying channel index and label value for each segment).
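
The per-segment metadata listed above could be sketched roughly as follows. This is a hypothetical illustration only: the `Segment` class, its field names, and the helper functions are assumptions, not 3D Slicer's data model and not part of the OME-Zarr spec. It shows how a channel index plus a label value is enough to recover one segment from a `(c, z, y, x)` array, even when segments overlap across channels.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

import numpy as np


@dataclass
class Segment:
    """Hypothetical per-segment record mirroring the fields listed above."""
    id: str
    name: str
    channel_index: int
    label_value: int
    color: Tuple[float, float, float] = (1.0, 0.0, 0.0)
    tags: Dict[str, str] = field(default_factory=dict)


def segment_mask(labels, segment):
    """Boolean (z, y, x) mask of one segment in a (c, z, y, x) label array."""
    return labels[segment.channel_index] == segment.label_value


def segment_extent(labels, segment):
    """(xmin, xmax, ymin, ymax, zmin, zmax) in voxel coordinates, or None if empty."""
    zs, ys, xs = np.nonzero(segment_mask(labels, segment))
    if zs.size == 0:
        return None
    return tuple(int(v) for v in (xs.min(), xs.max(), ys.min(), ys.max(), zs.min(), zs.max()))


# Toy data: segments "a" and "b" share channel 0 (no overlap);
# segment "c" overlaps "a", so it is stored in channel 1.
labels = np.zeros((2, 4, 8, 8), dtype=np.uint16)
labels[0, :, 1:5, 1:5] = 1
labels[0, :, 5:8, 5:8] = 2
labels[1, :, 3:6, 3:6] = 1

seg_a = Segment(id="seg_a", name="a", channel_index=0, label_value=1)
seg_c = Segment(id="seg_c", name="c", channel_index=1, label_value=1)
extent_c = segment_extent(labels, seg_c)
overlap_voxels = int((segment_mask(labels, seg_a) & segment_mask(labels, seg_c)).sum())
print(extent_c, overlap_voxels)
```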

joshmoore changed the title from "Labels question" to "Support for multi-channel labels" on May 19, 2021

@0x00b1

0x00b1 commented May 19, 2021

I started work on napari/napari#269.

Labels should, in my opinion, use the representation that is both ubiquitous in computer vision research and machine learning libraries like PyTorch and TensorFlow: (n, r, c) of bool or uint.
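
For readers less familiar with the two layouts, here is a minimal NumPy sketch of converting an integer label map into an `(n, r, c)` boolean stack (one mask per object). The function name `labelmap_to_masks` and the convention that `0` means background are assumptions made for this example.

```python
import numpy as np


def labelmap_to_masks(labelmap):
    """Convert a 2D label map (0 = background) into an (n, r, c) boolean stack.

    Returns the stack and the label value that each mask corresponds to.
    """
    values = np.unique(labelmap)
    values = values[values != 0]
    # Broadcast (1, r, c) against (n, 1, 1) to get one mask per label value.
    masks = labelmap[None, :, :] == values[:, None, None]
    return masks, values


labelmap = np.array(
    [[0, 1, 1],
     [2, 2, 0],
     [0, 3, 3]],
    dtype=np.uint16,
)
masks, values = labelmap_to_masks(labelmap)
print(masks.shape, values.tolist())
```

The boolean stack can express overlapping objects (several masks may be true at the same pixel), which a single label map cannot, but its size grows linearly with the number of objects.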

@lassoan

lassoan commented May 19, 2021

I cannot comment on what is common in computer vision, but in medical imaging the labelmap volume is the standard (a 3D volume with a char or short voxel value specifying what structure is there). Overlapping label support is not that common, but the typical solution is a 4D labelmap volume. Since you often have atlases with hundreds of labels, bool voxels are not generally usable.

We obviously will not be able to find a single organization of label data that works for everybody, so if we want this file format to see wide adoption then it should allow specification of the meaning of each axis of the label array.

@0x00b1

0x00b1 commented May 19, 2021

@lassoan For sure. This was the common structure in computer vision too. But this changed, like everything else in the past decade, when learning-based methods became standard. Think about overlapping objects from a y_pred rather than a y_true perspective. Your ground truth, y_true, may have exactly one value per unit (pixel, voxel, or whatever) but your prediction certainly won't. Your data structure, in my opinion, should reflect the probabilistic nature of contemporary methods.

@0x00b1

0x00b1 commented May 19, 2021

@lassoan Your comment is really interesting! I should confess that I know absolutely nothing about microscopy!

> I don't remember anyone asking for a 6th dimension in the past 10 years. So, specifying segmentation as up to 5D (t,c,z,y,x), sounds good.

I too have not personally run into this issue in biological contexts, but it has become increasingly common in non-biological contexts (e.g. robotics). Hell, my new iPhone 12 Pro Max, for whatever reason, has a LiDAR sensor. 🤷‍♂️

You can also imagine a situation where embeddings are packed alongside the pixel information, e.g.

(frames, planes, features, rows, columns, channels)

I believe Carolina Wählby experimented with this.

@lassoan

lassoan commented May 19, 2021

> @lassoan For sure. This was the common structure in computer vision too. But this changed, like everything else in the past decade, when learning-based methods became standard. Think about overlapping objects from a y_pred rather than a y_true perspective. Your ground truth, y_true, may have exactly one value per unit (pixel, voxel, or whatever) but your prediction certainly won't. Your data structure, in my opinion, should reflect the probabilistic nature of contemporary methods.

In 3D Slicer, we implemented all the mentioned representations and some more (3D labelmap, 4D labelmap, 4D fractional labelmap; and - primarily for 3D display - closed surface, planar contours, and ribbons; see overview here) along with automatic conversion algorithms between them and visualization and editing in both 2D and 3D.

We thought that fractional labelmaps (4D volume, each voxel describes some kind of probability) would be very useful and worked a lot on implementing first-class support for them (interactive editing and visualization, GPU-accelerated supersampling conversion, etc.). Surprisingly, it is barely used. Even though most ML prediction results are kind of probabilistic, it seems that by the time it gets to be displayed to end users, the results are usually already converted to labelmap or binary image. Trends can change quickly though, so I agree that the file format should be able to handle fractional labelmaps well.
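
The conversion described here, from a fractional (probability-like) labelmap to a regular labelmap, might look roughly like the sketch below. It uses a plain argmax with a confidence threshold; the function name `fractional_to_labelmap`, the `(c, z, y, x)` layout, and the `0.5` default are assumptions for illustration and do not reflect how 3D Slicer performs the conversion.

```python
import numpy as np


def fractional_to_labelmap(fractions, threshold=0.5):
    """Collapse a (c, z, y, x) fractional labelmap into a (z, y, x) labelmap.

    Each voxel gets label c + 1 for the channel with the highest value, or 0
    (background) when no channel reaches the threshold.
    """
    best = fractions.argmax(axis=0)
    confident = fractions.max(axis=0) >= threshold
    return np.where(confident, best + 1, 0).astype(np.uint16)


# Toy data: two segments, one z-plane of 2 x 2 voxels.
fractions = np.array(
    [
        [[[0.9, 0.2], [0.40, 0.1]]],  # segment 1
        [[[0.1, 0.7], [0.45, 0.2]]],  # segment 2
    ]
)
labelmap = fractional_to_labelmap(fractions)
print(labelmap.tolist())
```

The overlap information is lost in this step: every voxel ends up with at most one label, which matches the observation that results are usually already collapsed by the time they are displayed.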

@constantinpape
Contributor

constantinpape commented May 20, 2021

I think parts of the discussion here moved away slightly from the original question about multi-channel support for labels.

> Labels should, in my opinion, use the representation that is both ubiquitous in computer vision research and machine learning libraries like PyTorch and TensorFlow: (n, r, c) of bool or uint.

I think this is related to the general question of how to specify axes / dimensions in the NGFF format.
I don't think that it would be a good idea to introduce a separate nomenclature for labels here.
There is currently PR #46 in progress to introduce axes labels. Note that this is still fairly limited (only allowing x, y, z, c, t) but this can certainly be extended further, see discussion in #35 and also related #28 (all extensions should be non-breaking with #46 though).

> Think about overlapping objects from a y_pred rather than a y_true perspective. Your ground truth, y_true, may have exactly one value per unit (pixel, voxel, or whatever) but your prediction certainly won't. Your data structure, in my opinion, should reflect the probabilistic nature of contemporary methods.

I agree that being able to represent probabilistic predictions is important. But I would put this in a different category from the labels discussed here; for many downstream analysis tasks, having a "regular" label map will be a prerequisite. For now, probability maps can be stored following the "normal" NGFF data definition. We could think about some additional metadata for them, and maybe also allow "linking" them to the primary data.

> (3D labelmap, 4D labelmap, 4D fractional labelmap; and - primarily for 3D display - closed surface, planar contours, and ribbons; see overview here)

That's a very nice overview! I think 3d labelmaps are already covered by the current spec and 4d could be achieved using the "c" dimension (which is the initial topic of this issue). I assume that "fractional" labelmaps would correspond to the probabilistic prediction case (see above).
For surfaces and contours, the most relevant discussion is #33.

@0x00b1

0x00b1 commented May 22, 2021

@constantinpape I have not followed this (or any other ngff) discussion until yesterday! I apologize for missing some important context. 😄

> I agree that being able to represent probabilistic predictions is important. But I would put this in a different category from the labels discussed here; for many downstream analysis tasks, having a "regular" label map will be a prerequisite. For now, probability maps can be stored following the "normal" NGFF data definition. We could think about some additional metadata for them, and maybe also allow "linking" them to the primary data.

My probabilistic example was just one example of overlapping labels. Overlapping visible and occluded regions is another.

@0x00b1

0x00b1 commented May 22, 2021

> Trends can change quickly though, so I agree that the file format should be able to handle fractional labelmaps well.

@lassoan Exactly. argmax predictions are, and I assume will remain, extremely common! Hell, they are preferred in countless situations. As far as trends are concerned, every method on the Cityscapes and Common Objects in Context leaderboards outputs (objects, y, x) masks! Nevertheless, I realize that I may not be the target audience for ngff! 🤷
