Fixes in to_tensorflow method #44
Comments
Would changing the values in the dict returned by meta work?
And I did not get this part...
After I comment out the line, how do I test it?
Yes, changing the dtype in meta should work, but it's better to change it in the call as well, so that it's obvious to any user.
When the user doesn't specify the shape and just passes shape = (1,), to_tensorflow strictly expects the shape to be (1,). If we comment out https://github.com/activeloopai/Hub/blob/master/hub/collections/dataset/core.py#L633, it will accept any shape given to it.
Once you make both of these changes, try storing the dataset using https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py. Also try loading it with PyTorch, to ensure that the changes didn't break that. Let me know if you face any issues.
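To illustrate why a declared shape of (1,) rejects an image while an unspecified shape accepts it, here is a minimal sketch of a shape compatibility check. It is a simplified model written for this thread, not Hub's or TensorFlow's actual code; the function name `is_compatible` and the sample image shape are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# A declared shape is either None (rank and sizes unknown) or a tuple whose
# entries are fixed sizes or None (that one dimension is unknown).
Shape = Optional[Tuple[Optional[int], ...]]


def is_compatible(declared: Shape, actual: Sequence[int]) -> bool:
    """Return True if the concrete shape `actual` satisfies `declared`."""
    if declared is None:
        # Nothing was declared, so any shape is accepted.
        return True
    if len(declared) != len(actual):
        # The number of dimensions must match.
        return False
    # Every declared dimension must be unknown or equal to the actual size.
    return all(d is None or d == a for d, a in zip(declared, actual))


image_shape = (480, 640, 3)  # hypothetical image: height, width, channels

print(is_compatible((1,), image_shape))             # False
print(is_compatible(None, image_shape))             # True
print(is_compatible((None, None, 3), image_shape))  # True
```

Under this model, leaving the output shapes as None corresponds to the second call: no constraint is declared, so tensors of any shape pass.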
Actually @AbhinavTuli, by that I meant I did not get the part below.
So by example, do you mean the example py file (https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py) or the example dataset? 😅
Essentially both. You'll update the py file and then upload the dataset using the modified file, so both will get updated.
Makes sense, got it. For the changes in the meta file, you mentioned "uint8" or something similar... The reason I chose these values is that, after a quick Google search on COCO, I got the feeling that uint8 won't be sufficient.
The dtype is essentially a numpy-style dtype that we keep track of as metadata. It helps us store chunks of data efficiently (in case chunk_size isn't explicitly mentioned), and it also helps us convert from the Hub format to other formats. The dtypes you chose should be fine as long as the entire range of values fits in them, which I think it will.
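As a rough illustration of "the entire range of values fits", the sketch below uses numpy's `np.iinfo` to pick the smallest unsigned integer dtype that can hold a given maximum value. The helper `smallest_uint_dtype` and the sample maxima (255 for a pixel value, 600000 for an id) are assumptions made for this example and are not taken from the COCO upload script.

```python
import numpy as np


def smallest_uint_dtype(max_value: int) -> np.dtype:
    """Return the smallest unsigned integer dtype that can hold max_value."""
    if max_value < 0:
        raise ValueError("unsigned dtypes cannot hold negative values")
    for candidate in (np.uint8, np.uint16, np.uint32, np.uint64):
        # np.iinfo reports the representable range of an integer type.
        if max_value <= np.iinfo(candidate).max:
            return np.dtype(candidate)
    raise ValueError(f"{max_value} does not fit in any unsigned integer dtype")


# Hypothetical maxima: an 8-bit pixel value and a large integer id.
print(smallest_uint_dtype(255))     # uint8
print(smallest_uint_dtype(600000))  # uint32

# itemsize is the number of bytes per element, which is what drives
# storage cost when sizing chunks.
print(np.dtype(np.uint8).itemsize, np.dtype(np.uint32).itemsize)  # 1 4
```

A check like this shows why uint8 is enough for pixel data but too narrow for values above 255, such as large ids or areas.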
fixing to_tensorflow method to take any shape #44
Observed a couple of problems while converting stored datasets to TensorFlow format that need some small fixes.
to_tensorflow fails when the meta information for a tensor includes dtype="object" (the "object" dtype has been used for images, area, id and bbox in the COCO dataset example - https://github.com/activeloopai/Hub/blob/master/examples/coco/upload_coco2017.py#L24).
A fix for this is to set dtype="uint8" or something similar while uploading. The COCO example needs to be updated to reflect this.
to_tensorflow also fails when it gets shape=(1,) in meta and the actual object has multiple dimensions, for example, an image.
This can be fixed by commenting out this line https://github.com/activeloopai/Hub/blob/master/hub/collections/dataset/core.py#L633, which will set the output_shapes as None by default.
to_pytorch works fine in both the above cases.