Fixes in to_tensorflow method #44

AbhinavTuli · 2020-09-21T06:00:02Z

Observed a couple of problems while converting stored datasets to TensorFlow format that need some small fixes.

to_tensorflow fails when the meta information for a tensor includes dtype="object" ("object" dtype has been used for images, area, id, bbox in Coco dataset - https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py#L24)
A fix for this is to keep the dtype="uint8" or something similar while uploading. The Coco example needs to be updated to reflect this.

to_tensorflow also fails when it gets shape=(1,) in meta and the actual object has multiple dimensions, for example, an image.
This can be fixed by commenting out this line https://github.com/activeloopai/Hub/blob/master/hub/collections/dataset/core.py#L633, which will set the output_shapes as None by default.

to_pytorch works fine in both the above cases.
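One way to catch the first problem before converting is to scan the metadata for tensors still declared as "object". The following is a minimal sketch, assuming the metadata can be viewed as a plain dict that maps tensor names to a dtype string. The dict layout and the helper name `object_dtype_tensors` are illustrative assumptions, not Hub's API.

```python
import numpy as np

# Hypothetical per-tensor metadata, shaped like the values discussed in this issue.
meta = {
    "image": {"shape": (1,), "dtype": "object"},
    "bbox": {"shape": (1,), "dtype": "object"},
    "id": {"shape": (1,), "dtype": "uint32"},
}


def object_dtype_tensors(meta):
    """Return the sorted names of tensors whose declared dtype is NumPy's generic object dtype."""
    return sorted(
        name for name, info in meta.items()
        if np.dtype(info["dtype"]).kind == "O"  # "O" is NumPy's kind code for object
    )


print(object_dtype_tensors(meta))  # ['bbox', 'image']
```

Any name this returns is a candidate for a concrete dtype such as "uint8" before the dataset is uploaded.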

ADI10HERO · 2020-10-13T18:59:36Z

to_tensorflow fails when the meta information for a tensor includes dtype="object" ("object" dtype has been used for images, area, id, bbox in Coco dataset - https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py#L24)
A fix for this is to keep the dtype="uint8" or something similar while uploading. The Coco example needs to be updated to reflect this.

Changing the values in the dict returned by meta would work?

The Coco example needs to be updated to reflect this.

And I did not get this part...

to_tensorflow also fails when it gets shape=(1,) in meta and the actual object has multiple dimensions, for example, an image.
This can be fixed by commenting out this line https://github.com/activeloopai/Hub/blob/master/hub/collections/dataset/core.py#L633, which will set the output_shapes as None by default.

After I comment out the line, how do I test it?

AbhinavTuli · 2020-10-14T05:42:22Z

Changing the values in the dict returned by meta would work?

Yeah, changing the dtype in meta should work, but it's better to change it in the call as well, so that it's obvious to any user.

And I did not get this part...

When the user doesn't specify the shape and just mentions shape = (1,), to_tensorflow strictly expects shape as (1,). If we comment out https://github.com/activeloopai/Hub/blob/master/hub/collections/dataset/core.py#L633, it will accept any shape given to it.
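The difference between a strict declared shape and an unspecified one can be sketched in plain Python. This is only an illustration of the semantics described above, not the code in core.py or in TensorFlow; the helper name `shape_is_compatible` and the image shape are assumptions.

```python
def shape_is_compatible(declared, actual):
    """Check an actual shape against a declared one.

    declared is None (nothing known, accept anything) or a tuple whose
    entries are ints (must match exactly) or None (any size allowed).
    """
    if declared is None:
        return True
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))


image_shape = (480, 640, 3)  # hypothetical image: height, width, channels

print(shape_is_compatible((1,), image_shape))            # False: (1,) is enforced strictly
print(shape_is_compatible(None, image_shape))            # True: no shape declared
print(shape_is_compatible((None, None, 3), image_shape)) # True: only channels fixed
```

Leaving the shape unspecified corresponds to the second call: every sample is accepted, at the cost of losing static shape information downstream.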

After I comment out the line, how do I test it?

Once you make both of these changes, try storing the dataset using https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py.
(Tip: if you use ds.store("./path/to/directory"), the dataset is stored locally instead of online, which might save you some time.)
Once you have stored it, try loading it with a file similar to https://github.com/activeloopai/Hub/blob/master/examples/load_tf.py
If everything is fine, you shouldn't face any issues in loading.

Also try loading with PyTorch, to ensure that the changes didn't break that:
https://github.com/activeloopai/Hub/blob/master/examples/load_pytorch.py

Let me know if you face any issues.

ADI10HERO · 2020-10-22T21:29:36Z

And I did not get this part...

Actually @AbhinavTuli, by that I meant I did not get the part below:

The Coco example needs to be updated to reflect this

So by example, do you mean the example py file (https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py) or the example dataset? 😅

AbhinavTuli · 2020-10-23T03:42:38Z

Essentially both. You'll update the py file and then upload the dataset using the modified file, so both will get updated.

ADI10HERO · 2020-10-23T22:07:57Z

Makes sense, got it.

While making changes in the meta file, you mentioned "uint8" or something similar...
The changes I have made so far (on my local system) are:
area --> uint32
bbox --> uint16
id --> uint32
image --> uint32

The reason I chose these values is that, after a quick Google search on COCO, I got the feeling that uint8 won't be sufficient.
But I am not sure what these values mean, so I have two questions:

Are my dtype values correct/okay, if not what should they be?
What is their significance?

AbhinavTuli · 2020-10-24T05:04:51Z

The dtype is essentially a NumPy-style dtype that we keep track of as metadata. It helps us store chunks of data efficiently (in case chunk_size isn't explicitly mentioned). It also helps us convert from hub format to other formats.
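To make the storage point concrete, here is a rough sketch of how the declared dtype changes the raw size of one sample. It uses only NumPy; the image shape is a hypothetical example, and the calculation ignores any compression or chunk layout Hub may apply.

```python
import numpy as np


def sample_nbytes(shape, dtype):
    """Uncompressed size in bytes of one array with the given shape and dtype."""
    return int(np.prod(shape)) * np.dtype(dtype).itemsize


image_shape = (480, 640, 3)  # hypothetical image: height, width, channels

for dtype in ("uint8", "uint16", "uint32"):
    print(dtype, sample_nbytes(image_shape, dtype))
# uint8 921600
# uint16 1843200
# uint32 3686400
```

A wider dtype than the data needs multiplies the raw size accordingly, which is worth keeping in mind when picking a dtype for image data.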

The dtypes you chose should be fine as long as the entire range of values fits in them, which I think they will.
Try saving the dataset locally to see if everything is working. It'll save you some time. Use ds.store("./path") for this.
Once everything is working you can upload to hub.
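Whether "the entire range of values fits" can be checked directly with NumPy before storing. The sketch below assumes integer-valued, non-empty data; the helper name `fits_in` and the sample values are made up for illustration and are not taken from the COCO dataset.

```python
import numpy as np


def fits_in(values, dtype):
    """Return True if every integer in values lies within the range of dtype."""
    info = np.iinfo(np.dtype(dtype))  # min/max representable by the integer dtype
    values = np.asarray(values)
    return bool(values.min() >= info.min and values.max() <= info.max)


print(np.iinfo(np.uint8).max)   # 255
print(np.iinfo(np.uint16).max)  # 65535
print(np.iinfo(np.uint32).max)  # 4294967295

pixel_values = [0, 17, 255]    # illustrative 8-bit pixel intensities
example_ids = [9, 70000]       # illustrative ids, not real COCO ids

print(fits_in(pixel_values, "uint8"))   # True
print(fits_in(example_ids, "uint16"))   # False: 70000 > 65535
print(fits_in(example_ids, "uint32"))   # True
```

Running a check like this over each field of the annotations gives a quick answer to whether a chosen dtype is wide enough.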

AbhinavTuli added the hacktoberfest and good first issue labels Oct 9, 2020

AbhinavTuli changed the title ~~Problems with to_tensorflow~~ Fixes in to_tensorflow method Oct 9, 2020

prithviraj-maurya added a commit to prithviraj-maurya/Hub that referenced this issue Oct 17, 2020

fixing to_tensorflow method to take any shape activeloopai#44

1bcc7d3

prithviraj-maurya mentioned this issue Oct 17, 2020

fixing to_tensorflow method to take any shape #44 #90

Merged

prithviraj-maurya added a commit to prithviraj-maurya/Hub that referenced this issue Oct 17, 2020

adding dtypes to COCO examples activeloopai#44

9de6e20

davidbuniat assigned ADI10HERO Oct 19, 2020

prithviraj-maurya added a commit to prithviraj-maurya/Hub that referenced this issue Oct 24, 2020

fixing to_tensorflow method to take any shape activeloopai#44 activel…

8dd2222

…oopai#90

ADI10HERO mentioned this issue Oct 25, 2020

Fix to_tensorflow bugs #140

Merged

AbhinavTuli mentioned this issue Oct 29, 2020

add upload.py #146

Merged

AbhinavTuli closed this as completed in #90 Oct 30, 2020

AbhinavTuli added a commit that referenced this issue Oct 30, 2020

Merge pull request #90 from prithviraj-maurya/master

f0df480

fixing to_tensorflow method to take any shape #44
