My Question
Is it possible to tell a transformation what the initial visibility of each bounding box is?
As far as I know, a transformation always assumes that the objects are 100% visible before the transformation. But in real life, that is not always the case.
Additional Context
I work with object detection on very high resolution images. As a preprocessing step, the images of the training dataset have to be sliced before they can be used by the model. During this preprocessing, the visibility of many bounding boxes becomes less than 100%. Of course, I can calculate those values, but is there a way to use them with albumentations?
If this is not possible, a workaround would be for albumentations to return the "perceived" visibility of each box after the transformation. That way I could calculate the "real" visibility as the product of the initial and perceived visibilities.
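In case it helps, here is a rough sketch of what I mean, assuming a crop-style transform that does not rescale coordinates. The names `bbox_ids`, `initial_visibility` and `apply_with_real_visibility` are mine, not albumentations API; the idea is just to set `min_visibility=0`, match surviving boxes back to the originals through an index label field, and filter on the product of initial and perceived visibility myself:

```python
import albumentations as A

# Disable filtering inside albumentations (min_visibility=0), carry the original
# index of each box through a label field, and filter afterwards on
# initial_visibility * perceived_visibility.
transform = A.Compose(
    [A.RandomCrop(width=400, height=400)],  # just an example augmentation
    bbox_params=A.BboxParams(
        format="pascal_voc",
        min_visibility=0.0,          # keep partially visible boxes
        label_fields=["bbox_ids"],   # carries each box's original index
    ),
)

def bbox_area(box):
    x_min, y_min, x_max, y_max = box[:4]
    return max(x_max - x_min, 0.0) * max(y_max - y_min, 0.0)

def apply_with_real_visibility(image, bboxes, initial_visibility, min_real_visibility=0.4):
    out = transform(image=image, bboxes=bboxes, bbox_ids=list(range(len(bboxes))))
    kept_boxes, kept_ids = [], []
    for box, idx in zip(out["bboxes"], out["bbox_ids"]):
        # "Perceived" visibility: fraction of the box area that survived the crop.
        # Valid here because RandomCrop does not rescale the coordinates.
        perceived = bbox_area(box) / bbox_area(bboxes[idx])
        if initial_visibility[idx] * perceived >= min_real_visibility:
            kept_boxes.append(box)
            kept_ids.append(idx)
    return out["image"], kept_boxes, kept_ids
```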
As I understand it, you crop parts from the image, and bounding boxes that are not 100% contained in the crop get truncated, right? And this becomes an issue.
Yes, you are correct. They get truncated. Therefore their visibility is not 100% to start with.
Unfortunately I don't think providing code will make things any clearer, because this is more of a workflow problem. So let me give a hypothetical situation:
1 - Imagine my dataset has 2000x2000 px images.
2 - My model only works with 500x500 images.
3 - Since I need the full resolution to identify the objects, I should not shrink the images. What I do instead is to slice the full-res image (2000x2000) into 16 non-overlapping 500x500 patches (a sketch of this slicing step follows the list).
4 - Now, imagine that there is a 2000x2000 image with two objects. During the slicing process, one object gets cut in half. The other stays fully visible in a single patch.
Everything up to this point happens before training the model. It's dataset pre-processing and has nothing to do with Albumentations.
5 - Now I'll start training a model and use albumentations. Then comes the question: given that I want to work with a minimal visibility of 40%, which value of min_visibility should I give to albumentations?
If I use 40%, the object that was cut in half may end up being only 20% visible in reality (40% of the 50% that survived the slicing).
If I use a higher value (say 80%, which is what it would take to guarantee 40% real visibility for the halved object), that would be too conservative for the object that stayed fully visible.
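To make step 3 concrete, the slicing looks roughly like this (a simplified sketch; `slice_image` and the exact bookkeeping are just illustrative). This is where the initial visibility of each box gets computed, and those are the per-box values I would like albumentations to take into account, since min_visibility is a single global threshold:

```python
def slice_image(image, bboxes, patch_size=500):
    """Cut an image into non-overlapping patches and clip the boxes to each patch.

    image:  H x W (x C) array, e.g. 2000 x 2000.
    bboxes: list of [x_min, y_min, x_max, y_max] in absolute pixels.
    Returns a list of (patch, patch_bboxes, initial_visibilities), where each
    visibility is clipped_area / original_area, to be reused at training time.
    """
    h, w = image.shape[:2]
    patches = []
    for y0 in range(0, h, patch_size):
        for x0 in range(0, w, patch_size):
            patch = image[y0:y0 + patch_size, x0:x0 + patch_size]
            patch_boxes, visibilities = [], []
            for x_min, y_min, x_max, y_max in bboxes:
                # Clip the box to this patch and shift it to patch coordinates.
                cx_min, cy_min = max(x_min, x0) - x0, max(y_min, y0) - y0
                cx_max = min(x_max, x0 + patch_size) - x0
                cy_max = min(y_max, y0 + patch_size) - y0
                if cx_min >= cx_max or cy_min >= cy_max:
                    continue  # box does not intersect this patch
                original_area = (x_max - x_min) * (y_max - y_min)
                clipped_area = (cx_max - cx_min) * (cy_max - cy_min)
                patch_boxes.append([cx_min, cy_min, cx_max, cy_max])
                visibilities.append(clipped_area / original_area)
            patches.append((patch, patch_boxes, visibilities))
    return patches
```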