
[Draft] Add transformation spec #63

Closed

Conversation

@constantinpape (Contributor) commented Oct 1, 2021

UPDATE

I have ported most of the content from here into #57, except for the more "complex" transformations (affines, deformation fields, position fields). The next version (0.4) will only support the simpler transformations and we will add the more complex transformations in a later version.

For discussions about v0.4, please refer to #57; feel free to keep discussing affines or deformation fields in this draft.

@constantinpape marked this pull request as draft October 1, 2021 08:41
@constantinpape (Contributor Author)

First draft for separate transformation metadata

Adapted from https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF, mixing parts of Proposal A and Proposal C.
The design choices and my reasoning behind them are below; comments / critique are welcome.

Design Choice

Transformations are specified for each scale array in multiscales:datasets. They are not repeated in the "<SCALE_ARRAY>/.zmetadata" (the same is true for axes in #57). Note that there have been discussions about duplicating or moving the transformation and axes metadata there. I would like to postpone this discussion and use #57 and this PR to define the metadata format first.

Differences from the proposal by John and Stephan

I have deviated from the proposal in several places, either for simplicity or because I did not understand some part of the proposal. Of course this reflects my subjective preferences and is very much open to discussion:

  • I have left out sequence; instead the transformations are specified as a list in multiscales:datasets (see the sketch below).
  • For translation, scale and affine I interpret oneOf(path, paramField) as exclusive (i.e. exactly one of the two must be specified).
  • For d_field and p_field I only allow internal paths for storing the fields, to keep the spec simpler and make the container stand-alone.
  • For d_field and p_field I leave out the parameter interleaved, as I don't understand its purpose (it is also not mentioned in the description in the proposal).
  • I have left out inverse_of, because I am not sure about the exact implementation. Would this only be valid for invertible transformations? How do we know if they are invertible? E.g. inverse_of({"type": "d_field", "path": "/my_deformations"}).
  • I have left out outputIndexes from Proposal C ("inputIndexes" correspond to "axisIndices") for the sake of simplicity, i.e. axes reordering is currently not possible. (I am not sure how we would do this in a software-interoperable way.)
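For illustration, a rough sketch of what such a per-dataset entry could look like, written as a Python dict that mirrors the JSON. The "transformations" key name and the values are assumptions based on the tables below, not the final metadata:

```python
# Hypothetical sketch of a multiscales:datasets entry with a transformation list; names may change.
multiscales_dataset = {
    "path": "s0",
    "transformations": [                                      # applied sequentially, in order
        {"type": "scale", "scale": [0.5, 0.36, 0.36]},        # parameters given inline ...
        {"type": "translation", "path": "translations/s0"},   # ... or stored as binary data in the container
    ],
}
```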

Describes axes of a physical coordinate space. It is a dictionary, which MUST contain the fields:
- "labels": list of strings that specify the name per dimension. The values MUST be unique.
- "types": list of strings that specify the type per dimension. The values SHOULD be one of "space", "channel", "time".
- "units": list of strings that specify the unit per dimension.
Contributor
To avoid very troublesome ambiguities that result from missing information about the direction of physical-space-to-physical-space transformations, a SHOULD could be added here with { "from": "<string-identifier>", "to": "<string-identifier>" }, where, for example, <string-identifier> is the name of an image or the name of an atlas space.
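For illustration, a rough sketch of how the suggested "from"/"to" fields could be attached to the axes description above; the identifier values are made up for this example:

```python
# Hypothetical axes description extended with the suggested "from"/"to" identifiers.
axes = {
    "labels": ["x", "y", "z"],
    "types": ["space", "space", "space"],
    "units": ["micrometer", "micrometer", "micrometer"],
    "from": "my-em-volume",     # space the transformation maps from
    "to": "my-atlas-space",     # space it maps to
}
```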

Contributor Author

Very good idea.


Our idea in the original proposal was that all transformations are forward. It makes no sense to have a sequence of transformations where the from space of the second transformation is not the to space of the first, and if that's problematic, I do not see the point.

We also elaborated why the inverse_of transformation is a great idea: Images are usually transformed by iterating over target pixels, running their coordinates through the inverse of the transformation (which, for a sequence, means applying the inverses of its transformations in reverse order) and then interpolating in source space. This works without hassle for all transformations that are invertible (e.g. most affines) but not for transformations that are not invertible (like displacement fields). For invertible displacement fields, the easiest solution is to provide both directions, as in @bogovicj's atlas transformations. If only the inverse is required (because the transformation will only be used to render transformed images), it can be specified as the inverse_of a non-invertible transformation such as a deformation field.
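For illustration, a minimal numpy/scipy sketch of this pull-back rendering for an invertible 2D affine given as a homogeneous 3x3 matrix in x,y order. This is an assumption-laden example, not part of the spec:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def render(source, affine, out_shape):
    """Render `source` into a target grid: map each target pixel through the
    inverse of the forward affine, then interpolate in source space."""
    inv = np.linalg.inv(affine)                        # inverse of the forward transform
    yy, xx = np.mgrid[0:out_shape[0], 0:out_shape[1]]  # target grid coordinates
    ones = np.ones_like(xx, dtype=float)
    tgt = np.stack([xx.ravel(), yy.ravel(), ones.ravel()])  # homogeneous (x, y, 1)
    src = inv @ tgt                                          # corresponding source coordinates
    # map_coordinates expects (row, col) = (y, x) index order
    return map_coordinates(source, [src[1], src[0]], order=1).reshape(out_shape)
```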


Sorry for not explaining the interleaved field. We falsely assumed that it is self-explanatory :). An interleaved d/p_field stores the x,y,z target coordinates of the transformed x,y,z source coordinate contiguously; the non-interleaved d/p_field stores first the x-coordinates, then the y-coordinates, then the z-coordinates. (I am explaining this with an x,y,z example because opinions about dimension indices are unresolved.)
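For illustration, a small numpy sketch of the two layouts for a 2D field; the shapes are assumptions, not part of the spec:

```python
import numpy as np

H, W = 4, 5
dx = np.random.rand(H, W)   # x-components of the per-pixel vectors
dy = np.random.rand(H, W)   # y-components

interleaved = np.stack([dx, dy], axis=-1)   # shape (H, W, 2): (x, y) stored contiguously per pixel
planar      = np.stack([dx, dy], axis=0)    # shape (2, H, W): all x-components first, then all y-components
```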

| `identity` | | identity transformation, is the default transformation and is typically not explicitly defined |
| `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `affine` | one of: `"affine":List[float]`, `"path":str` | affine transformation matrix defined as a list consisting of n sets of n + 1 scalar numbers, stored either as a list of floats (`affine`) or as binary data at a location in this container (`path`). n is the number of dimensions |
@thewtex (Contributor) commented Oct 15, 2021

It would be helpful to have an equation that illustrates the affine transformation and describes whether these lists are row-major or column-major. Also, is it n + 1 or just n?

Contributor Author

Yes, we definitely need more examples and a clear definition here; I was a bit hesitant about the affines, because I copied this from https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF, which (I think) is column major, while row major would be more appropriate for the more pythonic zarr spec. I will think about this, but I am currently leaning towards postponing the introduction of affines (I will elaborate below).

@axtimwalde commented Oct 22, 2021

I wish you great luck doing math with row-major matrices in essentially all existing geometry frameworks ;). Joking aside, I had that same discussion with @d-v-b, who is strongly pro row-major vectors, and we sort of agree that this makes no sense in the context of essentially all APIs that do geometry (unlike the pythonic zarr spec, which doesn't), like OpenCV, OpenGL, ITK, ImgLib2, ... So please let's not do it here, it's a bad idea.

@d-v-b (Contributor) commented Oct 22, 2021

who is strongly pro row major vectors

I am not "pro" any specific kind of vectors :) I just want data like this to be consistent and explicit. Probably requiring a row- or column-major flag for vectors / matrices is the safest approach?


Well, let's please not confuse array indexing with geometry; it's bad enough that this creeps in with integer offsets and size vectors and other half-way-in-between stuff. It's ok to index elements in a matrix in y,x order, but I find it intolerably confusing to also re-sort the matrix elements. I also do not know a single textbook or geometry API that puts the last dimension first. This is certainly possible to deal with but will consume pages of documentation and confuse generations of developers. I strongly suggest sticking with the x,y,z order for the elements of the matrices and vectors of transformations. Very opinionated here :).

Contributor Author

Thanks for the comments here :). I see the argument for sticking with the x,y,z convention re geometry...
I think it may be helpful to gather some actual example data for this and see how it would work if we have x,y,z conventions in the transformations but allow for a z,y,x data layout.
I can try to add something in https://github.com/ome/ome-ngff-prototypes once #57 is more advanced.
Feel free to propose some examples there as well.
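For illustration, a small numpy sketch (not part of the spec) of how a 3D affine stored as a flat list of n sets of n + 1 scalars, with elements in the x,y,z order argued for above, could be applied; the layout and names are assumptions:

```python
import numpy as np

# 3 rows of 4 elements each, coordinates in x,y,z order
flat = [1.0, 0.0, 0.0, 10.0,   # x' = x + 10
        0.0, 2.0, 0.0,  0.0,   # y' = 2*y
        0.0, 0.0, 2.0,  5.0]   # z' = 2*z + 5
A = np.array(flat).reshape(3, 4)

def apply_affine(A, xyz):
    """Apply the affine to a point given in (x, y, z) order."""
    return A[:, :3] @ np.asarray(xyz, dtype=float) + A[:, 3]

# A coordinate read from a z,y,x-ordered array has to be reordered to x,y,z first:
z, y, x = 2.0, 3.0, 4.0
print(apply_affine(A, (x, y, z)))   # -> [14.  6.  9.]
```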

| `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `affine` | one of: `"affine":List[float]`, `"path":str` | affine transformation matrix defined as a list consisting of n sets of n + 1 scalar numbers, stored either as a list of floats (`affine`) or as binary data at a location in this container (`path`). n is the number of dimensions |
| `d_field` | `"path":str` | deformation / displacement field storing one offset for each grid coordinate; `path` points to the array in this container that stores the field |
Contributor

To be more explicit and readable without having to refer to the spec, can we use displacement_field or displacementField?

In general, the spatial domain of a transform does not necessarily match the spatial domain of the image it transforms. Also, a transformation can be applied to point sets, etc. (re: #14). And a displacement field is generally oversampled if it is sampled at the same resolution as the image it transforms, so there is a lot of wasted space when dealing with large images. So, we should associate the d2p transformations discussed in #57 with the displacement field to specify its offset / translation and spacing / scale.


this and the next key should also consider whether path:str allows for multiple references (local file, absolute path, url, relative url) or forces the change of metadata when moving data between systems.

Contributor Author

To be more explicit and readable without having to refer to the spec, can we use displacement_field or displacementField?

In general, the spatial domain of a transform does not necessarily match the spatial domain of the image it transforms. Also, a transformation can be applied to point sets, etc. (re: #14). And a displacement field is generally oversampled if it is sampled at the same resolution as the image it transforms, so there is a lot of wasted space when dealing with large images. So, we should associate the d2p transformations discussed in #57 with the displacement field to specify its offset / translation and spacing / scale.

Very good points. We should absolutely choose a more explicit name and specify how to deal with over-sampling.

this and the next key should also consider whether path:str allows for multiple references (local file, absolute path, url, relative url) or forces the change of metadata when moving data between systems.

for simplicity I would like to keep this in the same container in the first iteration; do you have some use-cases in mind where this would not work / require copying of too much data?

See also general comment on trafo spec below.


for simplicity I would like to keep this in the same container in the first iteration; do you have some use-cases in mind where this would not work / require copying of too much data?

since these are computed transforms, these are more likely to change as algorithms improve, and these transforms take up a lot of space. when you are versioning the data (even as a tree), this would thus lead to copies of the source data in many cases. since changes in software are more rapid than changes in source data, decoupling the transform to a different store/file/url would help. more generally i don't think the zarr world handles changes/versioning of datasets yet. there are some threads of discussion going on there.

Contributor Author

since these are computed transforms, these are more likely to change as algorithms improve, and these transforms take up a lot of space. when you are versioning the data (even as a tree), this would thus lead to copies of the source data in many cases. since changes in software are more rapid than changes in source data, decoupling the transform to a different store/file/url would help. more generally i don't think the zarr world handles changes/versioning of datasets yet. there are some threads of discussion going on there.

I see, and that makes sense. I am still a bit hesitant because external links could complicate the spec a lot. But maybe it would be a good idea to say that the path SHOULD be an internal path but MAY also be an external path to accommodate development use cases. Then we would need some syntax for external paths though.


Can you explain why a reference to an external container complicates the spec? There are two main ways to resolve this such that container relative and path relative references are easy:

  1. Split the reference into two elements (container + path), with a non-existing container defaulting to self; paths without a leading / are relative.
  2. Come up with a URL schema that separates containers and paths (e.g. scp-style container:path, which enables container-relative URLs and relative paths without a leading /).

Contributor Author

Can you explain why a reference to an external container complicates the spec?

Sorry, I was a bit imprecise here. I don't think that external references would make the spec more complex, but keeping everything internal would have the added benefit of having all relevant data in the same container, without any reference to external data (e.g. we also don't allow referencing image data that is outside of the zarr container right now).
If there is a need for external paths I am fine with introducing them.

1. Split the reference into two elements (container + path) with non-existing container defaulting to self, paths without leading `/` are relative.

I would prefer this option.
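For illustration, a rough sketch of what option 1 (splitting the reference into container + path) could look like; the `container` key name and the paths are assumptions, not part of the draft:

```python
# Hypothetical reference forms for a d_field stored internally vs. in an external store.
d_field_internal = {"type": "d_field", "path": "registration/dfield"}  # no "container": defaults to this container
d_field_external = {
    "type": "d_field",
    "container": "https://example.com/transforms.zarr",  # external store
    "path": "registration/dfield",                        # path inside that container
}
```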

| `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `affine` | one of: `"affine":List[float]`, `"path":str` | affine transformation matrix defined as a list consisting of n sets of n + 1 scalar numbers, stored either as a list of floats (`affine`) or as binary data at a location in this container (`path`). n is the number of dimensions |
| `d_field` | `"path":str` | deformation / displacement field storing one offset for each grid coordinate; `path` points to the array in this container that stores the field |
| `p_field` | `"path":str` | position field storing one absolute position for each grid coordinate; `path` points to the array in this container that stores the field |
Contributor

Since this overlaps with functionality of the displacement field, can it be left out to simplify implementations? In the future, we will hopefully support composite / chaining of transformations. The displacement field can be easily chained, whereas a position field is not as amenable to chaining.

Contributor Author

Good point. I copied this from https://github.com/saalfeldlab/n5-ij/wiki/Transformation-spec-proposal-for-NGFF. @bogovicj is there a use-case where we would need a position field instead of a displacement field?


p_fields are faster to apply because they can skip the addition with the source vector and don't even need to know it, which can be helpful if you generate a transformed image via iteration over the target pixels. They are typically more expensive in storage because the value range is larger and requires higher precision.
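For illustration, a minimal numpy sketch of the difference, assuming both fields are sampled on the target grid (not part of the spec):

```python
import numpy as np

grid = np.stack(np.mgrid[0:4, 0:5], axis=-1).astype(float)  # target grid coordinates, shape (4, 5, 2)
d_field = np.random.rand(4, 5, 2)                            # per-pixel offsets (a d_field)

source_from_d = grid + d_field   # d_field: source position = grid coordinate + offset
p_field = source_from_d          # the equivalent p_field stores the absolute source position directly
source_from_p = p_field          # no addition (and no need to know the grid) when applying
```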

Comment on lines +229 to +232
| `identity` | | identity transformation, is the default transformation and is typically not explicitly defined |
| `translation` | one of: `"translation":List[float]`, `"path":str` | translation vector, stored either as a list of floats (`"translation"`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `scale` | one of: `"scale":List[float]`, `"path":str` | scale vector, stored either as a list of floats (`scale`) or as binary data at a location in this container (`path`). The length of the vector defines the number of dimensions. |
| `affine` | one of: `"affine":List[float]`, `"path":str` | affine transformation matrix defined as a list consisting of n sets of n + 1 scalar numbers, stored either as a list of floats (`affine`) or as binary data at a location in this container (`path`). n is the number of dimensions |

i understand the explicitness of this. one thing to consider is whether just a dimensionality + 1 matched affine transform would be sufficient or the flexibility of having translation and scale is required. also does this mean one could include multiple transformations? and would a validator have to check that?

in the case where the data are tczyx, would the expected dimensionality of the affine form be 6x6? in the bids microscopy spec, we had two fields: 1) transformation axis names and 2) transformation matrix. thus for 3d spatial data this could simply be a 3-element vector of axis names and a 4x4 matrix.

Contributor Author

i understand the explicitness of this. one thing to consider is whether just a dimensionality + 1 matched affine transform would be sufficient or the flexibility of having translation and scale is required.

Given recent discussions (#57 (comment)) we will restrict the transformations available in the multiscale metadata (i.e. metadata for a single multiscale image pyramid) to identity, scale, rotation, translation. Following this discussion I would also rather postpone the introduction of affines and more complex transformations (see comment below). So yes, we do need the explicit transformations.

also does this mean one could include multiple transformations?

yes, the transformations are given as a list and are applied in order (this is also stated clearly in the proposal)

and would a validator have to check that?

check what exactly?

in the case where the data are tczyx would the expected dimensionality of the affine form be 6x6 ? in the bids microscopy spec, we had two fields: 1) transformation axis names and 2) transformation matrix. thus for 3d spatial data this could simply be a 3 element vector of axis names and a 4x4 matrix.

The field axisIndices specifies which axes the transformation is applied to (if not given, the transformation is applied to all axes). This should be equivalent to the axis names (but using indices, because we currently don't have named axes in the data space).
The transformation must be valid for the number of axes given in axisIndices (or the full dimensionality if axisIndices is not given).
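For illustration, a hypothetical example of the axisIndices mechanism for a 5D t,c,z,y,x array; the indices and values are made up:

```python
# A 3D scale applied only to the spatial axes (indices 2, 3, 4); the other axes are left unchanged.
spatial_scale = {
    "type": "scale",
    "scale": [2.0, 0.5, 0.5],    # one value per selected axis (z, y, x)
    "axisIndices": [2, 3, 4],    # indices of z, y, x in the t,c,z,y,x array
}
```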

Contributor

As the spec is currently written, I would expect a 6D affine for your tczyx dataset,
or
for the spec to say explicitly

if your dataset is 5D and your affine is 3D, it applies to spatial dimensions

and or

if your dataset is 5D and your affine is 4D, it applies to spatial dimensions and the time dimensions

(but this relates to the versions of the spec where the axis orders are fixed).
We considered splitting transformations of different types in our Proposals C and D, which is nice in that it is explicit, but of course is more verbose.

Contributor

More relevant, though, is @will-moore's comment below on axisIndices / axis_indices.


the check was my misinterpretation that the transforms were independent. perhaps this sentence

The transformations are applied sequentially and in order.

from multiscale could be brought up to this section on transforms. that must apply to everything, right?

thank you for the clarification of the other points.

Contributor Author

for the spec to say explicitly

if your dataset is 5D and your affine is 3D, it applies to spatial dimensions

and or

if your dataset is 5D and your affine is 4D, it applies to spatial dimensions and the time dimensions

(but this relates to the versions of the spec where the axis orders are fixed). We considered splitting transformations of different types in our Proposals C and D, which is nice in that it is explicit, but of course is more verbose.

FYI, this is still the case in the current spec proposal, just via the axisIndices mechanism; see also above.

Contributor Author

the check was my misinterpretation that the transforms were independent. perhaps this sentence

The transformations are applied sequentially and in order.

from multiscale could be brought up to this section on transforms. that must apply to everything, right?

Absolutely!

| `d_field` | `"path":str` | deformation / displacement field storing one offset for each grid coordinate; `path` points to the array in this container that stores the field |
| `p_field` | `"path":str` | position field storing one absolute position for each grid coordinate; `path` points to the array in this container that stores the field |

In addition, the field "axisIndices" MAY be given to specify the subset of axes that the transformation is applied to, leaving other axes unchanged. If not given, the transformation is applied to all axes. The length of "axisIndices" MUST be equal to the dimensionality of the transformation. If "axisIndices" are not given, the dimensionality of the transformation MUST be equal to the number of dimensions of the array.
Member

NB: Depending on the outcome of discussion at #54, this could be axis_indices.

@constantinpape (Contributor Author)

Thanks for the comments @satra and @thewtex.
Given the discussions in #57, I will change plans slightly and introduce a simplified transformation definition including scale, rotation and translation in #57 directly, and only introduce these in 0.4. (These are the transformations that we will immediately need in the multiscales metadata.)

I would then leave the introduction of more complex transformations (affine, displacementField) for one of the upcoming versions (0.5 or 0.6). I think this will help to keep the discussion in #57 more focused and leave us more time to figure out the spec for the more complex transformations.

That being said, feel free to keep discussing specs for affine / displacementField here to prepare for later version changes; I will go ahead and update #57 now.

constantinpape added a commit to constantinpape/ngff that referenced this pull request Oct 18, 2021
@constantinpape changed the title from "Add transformation spec" to "[Draft] Add transformation spec" Oct 18, 2021
@kevinyamauchi mentioned this pull request Oct 20, 2021
@imagesc-bot

This pull request has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/next-call-on-next-gen-bioimaging-data-tools-2022-01-27/60885/1

@constantinpape (Contributor Author)

Closed in favor of #94 and #101
