
Extend the axes fields in multiscales metadata #57

Merged (18 commits) on Jan 27, 2022

Conversation

@constantinpape (Contributor) commented Sep 7, 2021

Proposal summary

This proposal adds "axes" metadata to describe the axes / dimensions of the physical coordinate space, as well as "transformation" metadata.
Both are then used in "multiscales" to describe the physical coordinate system of the data and the mapping from data coordinates to physical coordinates (for each scale level).

Note that we only introduce simple "scale" and "translation" transformations here, but intend to expand the set of supported transformations in the future. See #67 for initial discussions.

The main motivation of the proposed changes is to enable expressing "simple" spatial coordinates for a multiscale image in a form that can be interpreted by all viewers, including viewers such as standard Fiji that do not support more complex transformations.
As a second design goal, the transformation spec outlined here should be extensible to more complex transformations (see again discussions in #67) to be able to later express these in the same spec.

Some example data for ome.zarr files using this proposal are available here: https://github.com/ome/ome-ngff-prototypes.
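
For orientation, a minimal sketch of what multiscales metadata could look like under this proposal, written as a Python dict (field names follow the draft under discussion, e.g. "transformations", and may differ in detail from the final merged text):

```python
# Minimal sketch of multiscales metadata under this proposal (illustrative only;
# key names follow the draft, e.g. "transformations" before the later rename).
import json

multiscales = {
    "multiscales": [
        {
            "name": "example",
            # axes describe the physical (output) coordinate space
            "axes": [
                {"name": "z", "type": "space", "unit": "micrometer"},
                {"name": "y", "type": "space", "unit": "micrometer"},
                {"name": "x", "type": "space", "unit": "micrometer"},
            ],
            # one entry per scale level; each maps data coordinates to physical
            # coordinates via a simple scale (and optionally translation) transform
            "datasets": [
                {"path": "0", "transformations": [{"type": "scale", "scale": [1.0, 0.5, 0.5]}]},
                {"path": "1", "transformations": [{"type": "scale", "scale": [1.0, 1.0, 1.0]}]},
            ],
        }
    ]
}

print(json.dumps(multiscales, indent=2))
```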

@constantinpape marked this pull request as draft September 7, 2021 13:53
@constantinpape (Contributor, Author) commented Oct 1, 2021

First draft for separate axes metadata

Adapted from https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF#proposal-a.
Find the design choice and my reasoning behind it below; comments / critique are welcome.
Some open questions are listed as well, on which I'd be very happy to get input.

Design choice

Axes metadata describes the physical coordinate space (i.e. the output coordinate space after transformation). It is defined separately from multiscales to make it easier to refer to it and to reuse it in other spec definitions if that becomes necessary.
For similar reasons the transformation spec will also be defined separately, see #63 (WIP).
This design is extensible (new axis types or transformations can be added) and easily allows lifting the up-to-5d restriction (these were 2 of the most important points of discussion in the issue / last NGFF meeting).

Open Questions

Units: Do we restrict the units or is this free text? If we choose to restrict, what set of units is allowed? Some options that were brought up in #35: SI-Units, https://people.csail.mit.edu/jaffer/MIXF/CMIXF-12.

Level of requirement: E.g., must there be units, or is it a SHOULD? Also, do we allow for types that are not in "space", "time", "channel"? (Right now it's SHOULD, so we do allow other values.)

Axes order: Axes names are not fixed any more, so we can't enforce the order tczyx. But we may consider adding a MUST or SHOULD that the spatial axes come last.

Misc

  • example jsons are not updated yet; I will update them once there is a consensus to adopt this proposal.
  • the new axes metadata breaks compatibility with v0.3, which had a single list of names in axes (see the sketch below); I think having axes as a separate object is worth that breaking change.
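
For illustration, the two forms side by side, written as Python literals (the axis keys shown here are illustrative and may differ in detail from the merged text):

```python
# Illustrative comparison (not verbatim from the spec): v0.3 expressed axes as a
# plain list of names, while this proposal uses one object per axis.
axes_v0_3 = ["t", "c", "z", "y", "x"]

axes_proposed = [
    {"name": "t", "type": "time"},
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
```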

@will-moore (Member)

Are we still restricting the number of dimensions to 5?
Do we still care about the order of dimensions? Currently we have "All image arrays must be up to 5-dimensional with dimension order (t, c, z, y, x)".
Certainly a fair bit of code (vizarr, ome-zarr-py etc.) would struggle to support x and y dimensions not being last.

NB: there are a few other places in the example JSON sections that need to be updated.

@constantinpape (Contributor, Author)

Are we still restricting the number of dimensions to 5?

Yes, I also made this more explicit in b3523c9.

Certainly a fair bit of code (vizarr, ome-zarr-py etc) would struggle to support x and y dimensions being not last.

Indeed, we should clarify this more. We can't rely on fixed axis names "xyz" any more (the demand for freely named axes was very clear in the last meeting), but we can add a MUST or SHOULD that the spatial axes come last.

NB: there's a few other places in the example JSON sections that need to be updated.

Yes, I will hold off on updating them until there is consensus to adopt this proposal and we have settled some of the open questions.
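
For illustration, the kind of check a reader or validator could apply if the spec adds a SHOULD/MUST that spatial axes come last (a hypothetical helper, not part of the spec):

```python
def spatial_axes_last(axes):
    """Return True if all axes of type 'space' come after all non-space axes.

    `axes` is a list of axis objects such as
    [{"name": "t", "type": "time"}, {"name": "y", "type": "space"}, ...].
    """
    types = [a.get("type") for a in axes]
    first_space = next((i for i, t in enumerate(types) if t == "space"), len(types))
    return all(t == "space" for t in types[first_space:])

# Example: tczyx passes, an interleaved order does not.
assert spatial_axes_last([{"type": "time"}, {"type": "channel"},
                          {"type": "space"}, {"type": "space"}, {"type": "space"}])
assert not spatial_axes_last([{"type": "space"}, {"type": "time"}, {"type": "space"}])
```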

@satra commented Oct 1, 2021

I think units are a must, especially once transforms, coordinate systems, and other image spaces come into the picture.

By the way, there is a bit of discussion around metadata of transforms in this document (https://docs.google.com/document/d/11gCzXOPUbYyuQx8fErtMO9tnOKC3kTWiL9axWkkILNE/edit), which is part of extending the bids-specification.

Regarding the actual transformations themselves, ITK has a set of good transform representations (I'm sure @thewtex can say a whole lot more :) )

@unidesigner

Thanks @constantinpape for following-up on this. Would it be useful to extract a list of questions from the neuroglancer discussion that may be relevant to discuss here?

Regarding the question about restricting the number of dimensions to at most 5, and your reply making this more explicit in the docs: is there really a consensus to do this? The same goes for explicitly requiring the last two dimensions to be spatial. I don't think that is generally useful, and it would possibly limit what can be done in the future with ome-zarr files. Tools that have specific requirements on the contained data (e.g. requiring at least two or three spatial dimensions, or not supporting more than 5 dimensions, etc.) could just fail with an error message in those cases.

@constantinpape (Contributor, Author)

Would it be useful to extract a list of questions from the neuroglancer discussion that may be relevant to discuss here?

Yes, that would be helpful.

Regarding the question about restricting the number of dimension to be maximally 5, and your reply making this more explicit in the docs: Is there really a consensus to do this?

I think there is still quite some discussion going on about this restriction, with probably more support for lifting it. However, I would like to avoid opening up this topic for the next release in order to focus on the axis metadata format. I think lifting the requirement would be very easy in the current proposal (but let me know if you see any potential issues). The same goes for the axis order discussion.

@unidesigner

Okay. Is there a timeline for when the next release is planned?

Expressing the axes metadata as lists is a good start for supporting more or fewer dimensions. As already discussed a bit, one has to be careful not to imply too much from a label name, i.e. not to assume too much implicitly, e.g. when writing x. As this also relates very closely to the transformation spec, one has to consider what to do with the non-spatial dimensions and transformations (e.g. imagine a temporal dimension being time-shifted), and how to specify that, without constraining them to a fixed number of dimensions (e.g. @jbms's suggestion in the issue of using a permuted block-diagonal matrix to specify the affine).

I think it will be crucial to be able to define identifiers for different possible spaces, and use those identifiers in the transformation specification (what is the source space? what is the target space?), so that it becomes possible to make a valid interpretation of what e.g. an x axis stands for in any given (transformed) space. This may also already solve some of the axes (re)ordering discussion: imagine specifying an "identity" permutation matrix transformation (e.g. in the format used to specify affines), which basically just reorders the axes.

@jbms commented Oct 1, 2021

While in common cases it seems to make sense to label full coordinate spaces, e.g. where you have a common xyz coordinate space to which multiple volumes are aligned, in general I think it actually makes more sense to label individual dimensions rather than the full coordinate space, and that could be accomplished just by using longer dimension names, and possibly having some way to abbreviate them in viewers. For example:

fafb.v14.x
fafb.v14.y
fafb.v14.z

For example, you might have a combined EM/functional dataset consisting of:

EM alignment v2 with dimensions:
dataset.v2.x
dataset.v2.y
dataset.v2.z

Multi-channel functional recording experiment 1:

dataset.v2.x
dataset.v2.y
dataset.v2.z
dataset.v2.exp1.t
dataset.v2.c

Multi-channel functional recording experiment 2:

dataset.v2.x
dataset.v2.y
dataset.v2.z
dataset.v2.exp2.t
dataset.v2.c

By labeling dimensions we can fully represent the relationship between the three volumes. If we labeled entire coordinate spaces, we would have to assign a separate coordinate space label to each volume.

@bogovicj (Contributor) commented Oct 1, 2021

@unidesigner ( Hi Stephan!)

one has to be careful not to imply too much from a label name

Yes, agreed. Label names are possibly arbitrary, but possibly not. They should be allowed to carry information (I quite like @jbms's example), but not required to.

I think it will be crucial to be able to define identifiers for different possible spaces, and use those identifiers in the transformation specification (what is the source space? what is the target space?), so that it becomes possible to make a valid interpretation of what e.g. an x axis stands for in any given (transformed) space.

This is essentially what we describe in our transformation "proposal - D": giving names to input and output axes, and specifying very explicitly which input axes a given transform applies to and what output axes it generates. It's quite cumbersome though, and likely "going overboard".

@constantinpape (Contributor, Author) commented Oct 1, 2021

@unidesigner

Okay. Is there a timeline for when the next release is planned?

There are two steps for this: getting this PR and #63 merged to incorporate transformations (of course including discussions on what still needs to change with the proposals), then producing example data and making sure it runs in the reference implementations (currently Fiji via MoBIE, vizarr, and napari via the napari-ome-zarr plugin). I can't give you a clear timetable for the first point, because this very much depends on how much discussion and change is needed. Updating the reference implementations should be relatively fast and feasible in ~1 week (depending of course on the availability of developers / contributors of the individual tools).

As already discussed a bit, one has to be careful not to imply too much from a label name, i.e. not to assume too much implicitly, e.g. when writing x

In the current proposal I introduced 3 lists in the axes object: labels (which specify the axis names and are currently arbitrary), types (which specify the axis type and should be one of space, time, channel; note the should, so custom types are allowed but discouraged), and units (see the open questions above).

As this also very closely relates to the transformation specs

For the transformations, please refer to #63 and have a look at the current proposal and how it relates to what you have written. (Note that it is based on https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF and inspired by Proposal C, but I tried to simplify it a bit; the major differences are listed at the top of that PR.)

@jbms the current draft allows arbitrary labels and has an additional field types to specify the type of each axis.
So introducing more complex naming patterns is certainly possible but not required (and I am not sure that aiming for mandatory naming patterns is a good idea at this point; one could certainly discuss naming conventions, but that would merit a separate discussion in another issue).

@bogovicj (Contributor) commented Oct 1, 2021

My opinion on open questions

Units: Do we restrict the units or is this free text?

units SHOULD be

  • SI or SI-derived (e.g. "radian")
  • or some indicator for arbitrary units ("null", "N/A", "none", "pixel", or similar)

but any string is technically allowed.

Also, do we allow for types that are not in "space", "time", "channel" (right now it's SHOULD, so we do allow other values).

Agree with SHOULD. I'll probably only ever use these three, but others should be allowed (e.g. "polarization angle").

E.g., must there be units, or is it a SHOULD?

SHOULD. If units are empty, "arbitrary" / "pixel" units are implied.
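
As an illustration, a SHOULD-level rule could warn rather than fail on unrecommended units (the recommended list below is a placeholder, not the spec's):

```python
# Hypothetical SHOULD-level check: warn, don't fail, when a unit is not in a
# recommended list. The list below is illustrative only.
import warnings

RECOMMENDED_UNITS = {"meter", "micrometer", "nanometer",
                     "second", "millisecond", "radian",
                     None, "", "pixel"}

def check_unit(unit):
    if unit not in RECOMMENDED_UNITS:
        warnings.warn(f"unit {unit!r} is allowed but not recommended")
    return unit

check_unit("nanometer")   # fine
check_unit("4nm")         # allowed, but triggers a warning
```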

@jbms commented Oct 1, 2021

I noticed the units specified in the proposal all have a multiplier of 1, i.e. "nm" or "ms". But what about other multipliers, like "4nm"? Is the intention that you always have to specify units with a multiplier of 1, and rely on an affine transformation to transform your data to those units?

In many cases I expect there will be no affine transform, and you just want to indicate the actual voxel size. Having to extract that from an affine transform seems undesirable.

@bogovicj (Contributor) commented Oct 1, 2021

@jbms
You could specify voxel size in the way you want with the scale transform in the transform spec:
https://github.com/constantinpape/ngff/blob/trafo-spec/latest/index.bs#L231

This is equivalent to giving only the diagonal elements of the affine. Does that work for you?
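
A small numpy sketch of that equivalence (illustrative voxel sizes):

```python
# A scale transform is equivalent to an affine with only diagonal entries.
# Illustrative sketch; axis order here is x, y, z in nanometers.
import numpy as np

scale = np.array([4.0, 4.0, 40.0])   # "scale" transform, e.g. 4 x 4 x 40 nm voxels
affine = np.diag(scale)              # the same mapping written as a 3x3 matrix

voxel_coord = np.array([10, 20, 3])  # a discrete array coordinate
assert np.allclose(scale * voxel_coord, affine @ voxel_coord)
```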

@jbms commented Oct 1, 2021

I think it may still be desirable to be able to specify the "native" resolution of the data in many cases. For example, Neuroglancer chooses the units (including multiplier) of the global coordinate space by default to match the native resolution of the data. The units/multiplier of the global coordinate space affect how coordinates are displayed, the step size for discrete movement commands, and also have implications for numerical precision. If there is just an affine transform, we could attempt to heuristically extract the units and multiplier from that, but that seems to leave more ambiguity than if the units are specified explicitly.

In general, it seems that if you allow SI prefixes, which are just another form of multiplier, albeit one that is nicer to display in a scale bar, then it would be logical to also allow arbitrary multipliers.

@d-v-b (Contributor) commented Oct 1, 2021

The way I think of it (and this is highly influenced by my exposure to xarray / netcdf), unit defines a property of an axis, where an axis is a vector of values, and therefore the unit property is orthogonal to the contents of that axis vector. In this model, unit: nm doesn't imply a native resolution of 1 nm -- it just implies that the elements of the axis are to be interpreted as positions in nanometers, which is a pretty clean abstraction, insofar as it ultimately allows irregularly sampled data.

Since we don't have axes explicitly populated with values in metadata, the transform property is needed to define the sampling grid of the data. So from this perspective it's natural to infer the "native" resolution from the transform and not from axis labels. Plus, "4nm" is not a unit :)
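
A short sketch of this xarray-style picture (illustrative, using xarray directly rather than OME-Zarr metadata):

```python
# The axis is an explicit vector of positions and "unit" is a property of that
# vector, so a unit of "nm" does not imply 1 nm spacing.
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.zeros(5),
    dims=["x"],
    coords={"x": np.arange(5) * 4.0},   # positions 0, 4, 8, 12, 16 ...
)
data.coords["x"].attrs["units"] = "nm"  # ... interpreted as nanometers

print(data.coords["x"].values, data.coords["x"].attrs["units"])
```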

@jbms commented Oct 1, 2021

You have a good point that by using a coordinate array you gain an additional level of indirection and can use it to specify multipliers. But I would imagine that for regularly spaced data a coordinate array would not normally be specified.

According to the udunits2 library, "4nm" is a unit:
https://www.unidata.ucar.edu/software/udunits/udunits-2.2.28/udunits2lib.html#Syntax

In general while aesthetically "nanometer" is a nicer unit to work with than "4 nanometer", I don't see why fundamentally there should be support for "1e-9 meter" as a unit but not for "4e-9 meter" as a unit.

If the affine transform is not a permutation of a diagonal matrix, then we can't infer scale factors from it, so we would be back to just using whatever units are specified explicitly. For example, maybe you have aligned functional data to an EM volume, and there is an affine transform involved, but you want to specify the EM native resolution as the explicit units.

It sounds like the spec is merely suggesting a syntax for units, and so "4nm" could still be used, but I think it would help to specify how these units are expected to be used.

@bogovicj (Contributor) commented Oct 1, 2021

There are two related but separate (in my mind) issues that I feel we're mixing up a bit and will try to disentangle, specifically how we distinguish between:

  1. transformations from the discrete-grid to a physical space and
  2. transformations from physical-to-physical space or image-to-image transformations.

relating to the conversation above

Here's how I see these mapping onto @jbms's comment above - please correct / confirm, Jeremy.

I think it may still be desirable to be able to specify the "native" resolution of the data in many cases.

this is (1), and I agree.

For example, maybe you have aligned functional data to an EM volume, and there is an affine transform involved, ...

this refers to (2)

... but you want to specify the EM native resolution as the explicit units.

and this is (1), and I agree.

is (2) in the spec?

As in Jeremy's example, say I have an EM dataset, and I specify the (1) transform "this image has resolution 4 x 4 x 40 nm" (call it S). It's aligned to the functional dataset, so I specify the (2) transform, an affine A (EM -> functional).

My viewer now needs to know which space I want my EM data to be displayed in before it can do the right thing.
If I want to look at it in native EM "nm" space, I need the transform S only, but if I'm aligning with the functional data I need the composition AS.
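
A small numpy sketch of this composition (the numbers for A are made up):

```python
# S maps EM voxel indices to physical nm; A is a physical-to-physical affine
# aligning EM to the functional dataset. Homogeneous coordinates, made-up values.
import numpy as np

S = np.diag([4.0, 4.0, 40.0, 1.0])        # (1) pixel -> physical, 4 x 4 x 40 nm
A = np.eye(4)
A[:3, :3] = [[1.0, 0.02, 0.0],
             [-0.02, 1.0, 0.0],
             [0.0, 0.0, 1.0]]             # (2) EM physical -> functional
A[:3, 3] = [100.0, -50.0, 0.0]

voxel = np.array([10, 20, 3, 1.0])
em_physical = S @ voxel                   # native EM nm space: S only
in_functional = A @ S @ voxel             # aligned to functional: A after S
print(em_physical[:3], in_functional[:3])
```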

Either the spec covers (1) only, or it needs a much more concrete sense of coordinate spaces / axes, as above, to be unambiguous: something more like our proposal D.
Are we tackling this now?

a worry

I worry a little about having different "ways" of going from pixel to physical space. Someone soon is going to label axes with 4nm x 4nm x 40nm "units" and also have a transform that applies a scaling of 4, 4, 40 to the axes. I get it, they're likely saying the same thing in two different ways; it's just not so nice.

I personally would prefer to almost always use only a scale transform (maybe with an offset) to describe the physical extents of a dataset independent of all others, and to store transforms of type (2) somewhere that also explicitly says that this transform takes image "moving" to image "target".

@jbms commented Oct 1, 2021

In regards to your worry, I would say that if you specify that the units of the output space are "4nm", then your transform had better be designed to output coordinates in units of "4nm", not units of "nm", i.e. all of the coefficients in that row of the affine transform should be divided by 4 compared to if the output space is "nm".

That is the same issue that comes up with having an output unit of "um", and a scale factor of 1e-3, vs having an output unit of "nm" and a scale factor of 1.
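
A tiny arithmetic sketch of this equivalence (illustrative numbers only):

```python
# The same physical position written against three different output units:
# the scale coefficient absorbs the unit multiplier.
import math

voxel_index = 25
position_nm  = voxel_index * 4.0      # output unit "nm",  scale coefficient 4
position_um  = voxel_index * 0.004    # output unit "um",  scale coefficient 4e-3
position_4nm = voxel_index * 1.0      # output unit "4nm", scale coefficient 1

assert math.isclose(position_nm, position_um * 1000)
assert math.isclose(position_nm, position_4nm * 4)
```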

@aeisenbarth

  • Should labels be renamed to names to avoid varying usage of the term (compared to label images)?

@constantinpape mentioned this pull request Oct 4, 2021
@constantinpape (Contributor, Author) commented Oct 4, 2021

Thanks for all the discussions and feedback.

@bogovicj

My opinion on open questions [...]

I very much agree with your assessments, especially SHOULD for the units (relevant for all the following discussions).
This means we need to define the recommended unit names; as far as I can see there are 2 options (cf. the open question on units above).

@bogovicj @d-v-b @jbms

I personally would also prefer "clean" units, i.e. only having "nanometer" or "nm" in the recommended unit names and not "4 nm".
(But if we keep it at SHOULD, specifying "4nm" is still valid, but not encouraged.)

Regarding the distinction 1) use transformations to specify discrete to physical space vs 2) to specify physical to physical space:
I think that the transformations stored in "multiscales" should only be used for 1) (which also makes it feasible to extract the pixel spacing). For 2), we would then use the collection spec #31, where we would reuse the transformation spec from #63.
Note that the collection spec is still WIP, but we have started some discussions on this with the OME team, @tischi, @tpietzsch and @StephanPreibisch. We have a follow-up meeting scheduled and will do a better job of announcing it and making a summary of the meeting available afterwards. (Also note that this separation would only be best practice / recommended, i.e. it would still be possible, but discouraged, to define a complex transformation in "multiscales".)

@aeisenbarth

I agree that names might be better here.

@jbms commented Jan 25, 2022

@axtimwalde Under the current proposal, I guess it would also be necessary to copy the transformation metadata that was written to volume B to the segmentation volume as well.

To me a key question is how often we want to say "when using volume B, always use this transformation" as opposed to having some way of storing "here is a transformation from volume A coordinate space to volume B coordinate space" and then creating some sort of "view" of volume B with the transformation applied.

@axtimwalde

Great points that also apply to Stories 1, 2, and 3. I do not see a clear boundary between these scenarios when it comes to the transformation spec and would therefore keep them as compatible as possible. In my understanding, all transformed volumes are views. The storage location for that displacement field can be an absolute URL and does not have to exist in the same container as volumes A and B, so it can be re-used. It is also likely that volume A is in a different container than volume B. We may want to add a more everyday story than the BigWarp scenario for non-rigid transformations, such as ultrasound scanners or radar?

@jbms commented Jan 25, 2022

Yeah, to be clear I wasn't really drawing a distinction between affine vs non-affine transformations here, but rather just getting at the idea that the transformation could be decoupled from the volume itself.

In my view Story 2 is a little different because there the cutout location is known upon creation and the "transformation" isn't going to change.

For an affine or non-rigid transform you may still wish to work with the untransformed volume for various reasons, and there may be more than one possible transform.

Additionally, the original volume may already be referenced elsewhere before you have computed the affine or non-rigid transformation, and in that case you don't necessarily want to change the meaning of those existing references by modifying its metadata.

@axtimwalde

Yes, transformations can be decoupled from the volume. But I also think that there is no clear difference between a cutout offset or pixel scaling factor and a registration transformation. E.g. imagine an SEM volume stored as a 3D dataset in ome-zarr format. The microscope will most likely export data such as the nominal pixel spacing in x and y as well as a nominal spacing in z. These values can be used to display a view of the array data, but we also know that this information is not physically correct and therefore have to perform alignment and z-thickness compensation; the combination of the nominal values and the alignment transformations can then be used to display another view of the array data. Mapping the result into a canonical atlas space is yet another view of the data.

Where do those transformations go? It depends on the application. Since we can store arbitrary meta-data in Zarr/N5/HDF5 and friends, it is perfectly fine for that information to be associated with a volume, or with a multi-scale group, or somewhere else. Were you suggesting wrapping all transformation meta-data into named view-tags? I find that appealing. It brings us back to groups (sequences) of transformations; @bogovicj had this in our original proposal. Such groups could be used by named references to create compositions and subsets of transformations. In the above FIB-SEM scenario, we would most likely have the groups "microscope", "alignment", "atlas" that can be composed at will. I find this very appealing.

@satra commented Jan 26, 2022

Story 5: a slightly different registration problem.

  1. User collects multiple overlapping/non-overlapping slices and stores each in ome-zarr format with appropriate transformations
  2. User collates all slices into a single ome-zarr volume (registering neighboring slices, joining the data, or introducing zeros/nans as necessary).
  3. [Potential] User may want to simply save the transformations necessary to create the single volume on the fly.

@jbms commented Jan 26, 2022

I agree that the pixel size from the microscope may be unreliable and in that sense is similar to a registration transformation. But the cutout offsets would most naturally be expressed in units of voxels (zarr-developers/zarr-specs#122) and then would be exact and immutable and independent of any other transformations.

As far as other transformations, though, I don't have a concrete proposal for exactly how it might work, but what I'm suggesting is to be able to specify named transformation objects independent of any particular volume, and optionally identify the source and target coordinate spaces with some string identifier. And then similarly define named "view" objects that reference a transformation, or sequence of transformations, and an array or multiscale array. Array or multiscale arrays could also optionally specify a named coordinate space, and then viewers could potentially automatically suggest compatible transformations that could be applied.

I think it could be useful to distinguish between "arrays" that are indexed by discrete coordinates and "views" that are indexed by continuous coordinates. Otherwise if we reference a given "multiscale" or "array" that has transformations applied, we would have to additionally specify if we are referring to the original discrete coordinate space or the transformed continuous coordinate space.
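
To make the suggestion concrete, a purely hypothetical sketch of named transformation and view objects (none of these names or fields are part of this PR or the spec):

```python
# Purely hypothetical sketch: named transformations with source/target coordinate-space
# identifiers, plus "view" objects that combine an array with a transformation sequence.
coordinate_spaces = ["em_voxels", "em_nm", "functional_nm"]

transformations = {
    "em_scale":   {"type": "scale", "scale": [4.0, 4.0, 40.0],
                   "source": "em_voxels", "target": "em_nm"},
    "em_to_func": {"type": "affine", "path": "transforms/em_to_func",
                   "source": "em_nm", "target": "functional_nm"},
}

views = {
    "em_native":  {"array": "em/multiscale", "transformations": ["em_scale"]},
    "em_in_func": {"array": "em/multiscale", "transformations": ["em_scale", "em_to_func"]},
}
```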

@tischi commented Jan 26, 2022

As far as other transformations, though, I don't have a concrete proposal for exactly how it might work, but what I'm suggesting is to be able to specify named transformation objects independent of any particular volume, and optionally identify the source and target coordinate spaces with some string identifier. And then similarly define named "view" objects that reference a transformation, or sequence of transformations, and an array or multiscale array.

This sounds to me very similar to what we currently do in MoBIE. Here is an example of a view of many images where in addition to display settings we also specify the transformations for that particular view. Here are the current specs.

So, in that sense the transformations are decoupled from the volume itself and one can have different views on the same volume with different transformations. However, we currently require that the transformations in the views go from physical space to physical space. That means that currently the "volume" must contain one immutable piece of information about how to get from data space (voxels) to physical space (x, y, z). We thought this is convenient because (i) one can just look at a volume without having to specify a particular view, and (ii) it is easier to just work in physical space in the views without having to worry about the more complex issue of how to get from data space to physical space. But I guess a concept of a "default view" and "views of views" could achieve something similar.

One other question: I am confused about the concept of a voxel size. As soon as the data is rotated a little, the voxel grid no longer aligns with the physical coordinate system, and I am not sure how to define voxel size in such cases.

@aeisenbarth commented Jan 26, 2022

The concept of views could also help for what we are doing:

We have a workflow doing segmentations and registrations, and during this process we incrementally update the Zarr metadata to store the transformations. We also may rerun the workflow with different sets of parameters, leading to versioned results that we want to be able to view and compare.

If we can only have a single transformation strongly coupled with an image, we need to create a copy of the Zarr to avoid overwriting the original. With views we would be able to define a view for each set of parameters and specify which transformations and which segmentations belong to that view.

@bogovicj (Contributor) commented Jan 26, 2022

Happy to see this conversation and the interest in views.

@jbms

I'm suggesting is to be able to specify named transformation objects independent of any particular volume, and optionally identify the source and target coordinate spaces with some string identifier.

Yes, I want this too, and so does @axtimwalde, who wrote:

In the above FIB-SEM scenario, we would most likely have the groups "microscope", "alignment", "atlas" that can be composed at will. I find this very appealing.

I wrote something up a while ago; I'll flesh it out for tomorrow and would love to discuss.

@sbesson (Member) commented Jan 26, 2022

My vote also goes towards a simplified version of the scale transforms as expressed in #57 (comment) (option 1). As this seems to be the remaining point of contention in this PR, it would help move towards a releasable state of the specification. The discussion around affine transformations can continue separately and lead to the reintroduction of the axisIndices concept as needed.

@constantinpape (Contributor, Author) commented Jan 26, 2022

A quick wrap up before the meeting tomorrow (summary from a quick pre-meeting with @sbesson, @will-moore, @joshmoore):

  • strategy for 0.4: merge this PR as is (just with some changes to address detail review comments); propose changes to simplify scale and offset to dicts in a separate PR; once that one is merged we can release 0.4. We will discuss this tomorrow and go ahead with this plan to make the release in the next few days unless there is any major veto.
  • it's great to see the discussions and examples for more complex transforms and their application. We decided a while back not to include this in 0.4 to reduce the complexity of the 0.4 release. But this will be very helpful for extending the transform spec, and I will make sure that we transfer the "user stories" to a separate issue when merging this PR so that we don't lose sight of them. A few comments w.r.t. extending the transform spec:
    • transformations is introduced as a separate object in this PR so that it's possible to refer to it not only in the multiscales spec but also later when collections or views are introduced, along the lines of many ideas brought up in the user stories.
    • once we introduce affine or more complex transforms we will need a mechanism to label the subset of dimensions that a transform is applied to. axisIndices were the initial approach for this; but for the current scale and offset transforms this approach is over-engineered and using a dict is much simpler (see the sketch after this list). So we will instead need to solve this problem once affine transforms are introduced, and can either resort to axisIndices then or find a more elegant solution. (Right now I think that using the axis names instead of numerical indices would be better, but we should not open this discussion here.)
  • we have also scheduled time tomorrow for discussions on the transformation spec and how to follow up on this
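
To make that comparison concrete, an illustrative sketch of the two forms that were discussed (neither is quoted from the spec; the final resolution came later in the thread):

```python
# Illustrative only: two ways discussed for applying a scale to a subset of axes.

# (a) a list of values plus explicit axisIndices selecting which axes they apply to
scale_with_indices = {"type": "scale", "scale": [0.5, 0.5], "axisIndices": [3, 4]}

# (b) a dict keyed by axis name, which carries the same information more directly
scale_as_dict = {"type": "scale", "scale": {"y": 0.5, "x": 0.5}}
```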

@bogovicj (Contributor)

@constantinpape

A quick point to think about too (we'll discuss in the call today, but wanted to write it down here next to everything else).

The spec already allows storing the transform data in binary form at some path in the container. Using a dict to specify the axes doesn't work in that case. So we need to either

  • specify the axes those scale / translation refer to in a different way
  • require that the length of scale / translation match the dimensionality of the array

@constantinpape (Contributor, Author)

Good point @bogovicj. I forgot that we allow binary data here already. Let's discuss during the meeting.

@constantinpape (Contributor, Author)

I will merge this PR now and finalize the axes and initial transformations in a separate PR:

  • changes for new PR
    • remove axisIndices and require that scale and translation have the same length as the axes (we discussed this in the meeting; transformations on a subset of axes will be introduced in the next proposal on advanced transformations by @bogovicj)
    • rename transformations -> coordinateTransformations to clarify the scope (see the sketch after this list)
  • other points:
    • move user stories to separate issue (I am on it)
    • address review comment by @will-moore about order of scale and translation (@bogovicj volunteered to write something up on this; let's do this in the new PR)
    • address comment by @jbms to give mathematical definition for the transformation
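
For reference, a rough sketch of the direction the follow-up PR takes (hedged; see that PR for the authoritative form):

```python
# Rough sketch (not quoted from that PR): transformations renamed to
# coordinateTransformations, with scale / translation lists whose length
# matches the number of axes.
dataset_entry = {
    "path": "0",
    "coordinateTransformations": [
        {"type": "scale", "scale": [1.0, 0.5, 0.5]},
        {"type": "translation", "translation": [0.0, 0.0, 0.0]},
    ],
}
```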

@constantinpape merged commit 90de9f1 into ome:main Jan 27, 2022
@constantinpape deleted the axes-spec branch January 27, 2022 20:25
@constantinpape (Contributor, Author)

Final house-keeping:

A big thank you to everyone who commented here and helped to bring this into shape!

@sbesson added this to the 0.4 milestone Feb 1, 2022
melissalinkert added a commit to melissalinkert/bioformats2raw that referenced this pull request Feb 8, 2022
@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-2022-01-27/60885/11
