merge
davidbuniat committed Dec 1, 2020
2 parents 54d096e + a6a604b commit 0ff39cd
Showing 49 changed files with 261 additions and 229 deletions.
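
Downstream code that still imports `hub.features` could apply the same rename mechanically. The following is a minimal sketch and not part of this commit: the helper name `migrate_features_to_schema` is hypothetical, and it assumes that a bare `features` in user code refers only to the hub module (a local variable of that name would be rewritten too). The submodule `hub.schema.features` keeps its name, as the `automodule` lines in the diff show.

```python
import re


def migrate_features_to_schema(source: str) -> str:
    """Rewrite `hub.features` usages to `hub.schema` (illustrative helper)."""
    # 1. Dotted module paths: `from hub.features import Image`,
    #    `hub.features.image.Image`. Only the first `features` after `hub.`
    #    is replaced, so `hub.features.features` becomes `hub.schema.features`.
    source = re.sub(r"\bhub\.features\b", "hub.schema", source)

    # 2. Top-level imports: `from hub import Dataset, features`.
    source = re.sub(
        r"^[ \t]*from[ \t]+hub[ \t]+import[ \t]+.*$",
        lambda m: re.sub(r"\bfeatures\b", "schema", m.group(0)),
        source,
        flags=re.MULTILINE,
    )

    # 3. Attribute access on the imported module: `features.Tensor(...)`.
    #    The lookbehind skips `hub.schema.features.X` and names such as
    #    `my_features.X`. ASSUMPTION: no unrelated variable is called `features`.
    source = re.sub(r"(?<![\w.])features\.", "schema.", source)
    return source


if __name__ == "__main__":
    old = (
        "from hub import Dataset, features\n"
        'ds_schema = {"image": features.Tensor((512, 512), dtype="float")}\n'
    )
    print(migrate_features_to_schema(old))
```

Review the result before committing it, since the third substitution is purely textual.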
6 changes: 3 additions & 3 deletions README.md
@@ -89,14 +89,14 @@ hub login

2. Then create a dataset and upload
```python
from hub import Dataset, features
from hub import Dataset, schema
import numpy as np

ds = Dataset(
"username/basic",
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)

26 changes: 13 additions & 13 deletions docs/source/api.md
@@ -53,59 +53,59 @@
:special-members:
```

## Features
## Schema
### Serialization
```eval_rst
.. automodule:: hub.features.serialize
.. automodule:: hub.schema.serialize
:members:
```
### Features
### Schema
```eval_rst
.. autoclass:: hub.features.audio.Audio
.. autoclass:: hub.schema.audio.Audio
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.bbox.BBox
.. autoclass:: hub.schema.bbox.BBox
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.class_label.ClassLabel
.. autoclass:: hub.schema.class_label.ClassLabel
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.image.Image
.. autoclass:: hub.schema.image.Image
:members:
:no-undoc-members:
:private-members:
:special-members:
.. automodule:: hub.features.features
.. automodule:: hub.schema.features
:members:
:private-members:
:special-members:
.. autoclass:: hub.features.mask.Mask
.. autoclass:: hub.schema.mask.Mask
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.polygon.Polygon
.. autoclass:: hub.schema.polygon.Polygon
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.segmentation.Segmentation
.. autoclass:: hub.schema.segmentation.Segmentation
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.sequence.Sequence
.. autoclass:: hub.schema.sequence.Sequence
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.video.Video
.. autoclass:: hub.schema.video.Video
:members:
:no-undoc-members:
:private-members:
6 changes: 3 additions & 3 deletions docs/source/concepts/dataset.md
@@ -7,15 +7,15 @@ To create and store dataset you would need to define shape and specify the datas
For example, to create a dataset `basic` with 4 samples containing images and labels with shape (512, 512) of dtype 'float' in account `username`:

```python
from hub import Dataset, features
from hub import Dataset, schema
tag = "username/basic"

ds = Dataset(
tag,
shape=(4,),
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)
```
34 changes: 17 additions & 17 deletions docs/source/concepts/features.md
@@ -2,22 +2,22 @@

## Overview

Hub features:
Hub Schema:

- Define the structure, shapes, dtypes of the final Dataset
- Add additional meta information(image channels, class names, etc.)
- Use special serialization/deserialization methods



## Available Features
## Available Schemas

### Primitive

Wrapper to the numpy primitive data types like int32, float64, etc...

```python
from hub.features import Primitive
from hub.schema import Primitive

schema = { "scalar": Primitive(dtype="float32") }
```
@@ -27,7 +27,7 @@ schema = { "scalar": Primitive(dtype="float32") }
Np-array like structure that contains any type of elements (Primitive and non-Primitive).

```python
from hub.features import Tensor
from hub.schema import Tensor

schema = {"tensor_1": Tensor((100, 200), "int32"),
"tensor_2": Tensor((100, 400), "int64", chunks=(6, 50, 200)) }
@@ -40,7 +40,7 @@ Array representation of image of arbitrary shape and primitive data type.
Default encoding format - `png` (`jpeg` is also supported).

```python
from hub.features import Image
from hub.schema import Image

schema = {"image": Image(shape=(None, None),
dtype="int32",
@@ -53,7 +53,7 @@ schema = {"image": Image(shape=(None, None),
Integer representation of feature labels. Can be constructed from number of labels, label names or a text file with a single label name in each line.

```python
from hub.features import ClassLabel
from hub.schema import ClassLabel

schema = {"class_label_1": ClassLabel(num_classes=10),
"class_label_2": ClassLabel(names=['class1', 'class2', 'class3', ...]),
@@ -66,7 +66,7 @@ schema = {"class_label_1": ClassLabel(num_classes=10),
Array representation of binary mask. The shape of mask should have format: (height, width, 1).

```python
from hub.features import Image
from hub.schema import Image

schema = {"mask": Mask(shape=(244, 244, 1))}
```
@@ -105,51 +105,51 @@ Argument `chunks` describes how to split tensor dimensions into chunks (files) t

## API
```eval_rst
.. autoclass:: hub.features.audio.Audio
.. autoclass:: hub.schema.audio.Audio
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.bbox.BBox
.. autoclass:: hub.schema.bbox.BBox
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.class_label.ClassLabel
.. autoclass:: hub.schema.class_label.ClassLabel
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.image.Image
.. autoclass:: hub.schema.image.Image
:members:
:no-undoc-members:
:private-members:
:special-members:
.. automodule:: hub.features.features
.. automodule:: hub.schema.features
:members:
:private-members:
:special-members:
.. autoclass:: hub.features.mask.Mask
.. autoclass:: hub.schema.mask.Mask
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.polygon.Polygon
.. autoclass:: hub.schema.polygon.Polygon
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.segmentation.Segmentation
.. autoclass:: hub.schema.segmentation.Segmentation
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.sequence.Sequence
.. autoclass:: hub.schema.sequence.Sequence
:members:
:no-undoc-members:
:private-members:
:special-members:
.. autoclass:: hub.features.video.Video
.. autoclass:: hub.schema.video.Video
:members:
:no-undoc-members:
:private-members:
4 changes: 2 additions & 2 deletions docs/source/integrations/pytorch.md
@@ -12,8 +12,8 @@ ds = Dataset(
shape=(640,),
mode="w",
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)

4 changes: 2 additions & 2 deletions docs/source/integrations/tensorflow.md
@@ -10,8 +10,8 @@ ds = Dataset(
"username/tensorflow_example",
shape=(64,),
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)

6 changes: 2 additions & 4 deletions docs/source/simple.md
@@ -9,10 +9,8 @@ Here is some features of new hub:
2. Larger datasets can now be uploaded as we removed some RAM limiting components from the hub.
3. Caching is introduced to improve IO performance.
4. Dynamic shaping enables very large images/data support. You can have large images/data stored in hub.

More features coming:
1. Dynamically sized datasets. Soon you will be able to increase number of samples dynamically.
2. Tensors can be added to dataset on the fly.
5. Dynamically sized datasets. You will be able to increase number of samples dynamically.
6. Tensors can be added to dataset on the fly.

### Getting Started

2 changes: 1 addition & 1 deletion docs/source/tutorials/new_api.md
@@ -36,7 +36,7 @@ hub login
import numpy as np

import hub
from hub.features import ClassLabel, Image
from hub.schema import ClassLabel, Image

schema = {
"image": Image((28, 28)),
6 changes: 3 additions & 3 deletions examples/basic.py
@@ -1,4 +1,4 @@
from hub import Dataset, features
from hub import Dataset, schema
import numpy as np


@@ -11,8 +11,8 @@ def main():
tag,
shape=(4,),
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)

2 changes: 1 addition & 1 deletion examples/big_image.py
@@ -1,7 +1,7 @@
import numpy as np

import hub
from hub.features import Image
from hub.schema import Image
from hub.utils import Timer


2 changes: 1 addition & 1 deletion examples/large_dataset_build.py
@@ -1,6 +1,6 @@
import numpy as np
import hub
from hub.features import Tensor
from hub.schema import Tensor


def create_large_dataset():
6 changes: 3 additions & 3 deletions examples/load_pytorch.py
@@ -1,5 +1,5 @@
import torch
from hub import Dataset, features
from hub import Dataset, schema


def main():
@@ -9,8 +9,8 @@ def main():
shape=(640,),
mode="w",
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)
# ds["image"][:] = 1
6 changes: 3 additions & 3 deletions examples/load_tf.py
@@ -1,4 +1,4 @@
from hub import Dataset, features
from hub import Dataset, schema


def main():
@@ -7,8 +7,8 @@ def main():
"./data/example/pytorch",
shape=(64,),
schema={
"image": features.Tensor((512, 512), dtype="float"),
"label": features.Tensor((512, 512), dtype="float"),
"image": schema.Tensor((512, 512), dtype="float"),
"label": schema.Tensor((512, 512), dtype="float"),
},
)

2 changes: 1 addition & 1 deletion examples/mnist_upload_speed_benchmark.py
@@ -1,7 +1,7 @@
import numpy as np

import hub
from hub.features import Image, ClassLabel
from hub.schema import Image, ClassLabel
from hub.utils import Timer


2 changes: 1 addition & 1 deletion examples/new_api_intro.py
@@ -1,7 +1,7 @@
import numpy as np

from hub import Dataset
from hub.features import ClassLabel, Image
from hub.schema import ClassLabel, Image
from hub.utils import Timer


11 changes: 7 additions & 4 deletions examples/upload_tfds.py
@@ -1,11 +1,14 @@
import hub
from hub.utils import Timer
from hub import dev_mode

dev_mode()

if __name__ == "__main__":
path = "./data/test/tfds_new/coco"
# path = "s3://snark-test/coco_dataset"
path = "./data/test/coco"
with Timer("Eurosat TFDS"):
out_ds = hub.Dataset.from_tfds("coco", num=100)
out_ds = hub.Dataset.from_tfds("coco", num=1000)

res_ds = out_ds.store(path)
ds = hub.load(path)
print(ds)
print(ds["image", 0].compute())