Merge branch 'mindee:main' into main

mindee · Nov 15, 2021 · 23eca0d
2 parents 9099e95 + 2a4e2b4
commit 23eca0d
Showing 10 changed files with 118 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -2,7 +2,7 @@
   <img src="https://github.com/mindee/doctr/releases/download/v0.3.1/Logo_doctr.gif" width="40%">
 </p>
 
-[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) ![Build Status](https://github.com/mindee/doctr/workflows/builds/badge.svg) [![codecov](https://codecov.io/gh/mindee/doctr/branch/main/graph/badge.svg?token=577MO567NM)](https://codecov.io/gh/mindee/doctr) [![CodeFactor](https://www.codefactor.io/repository/github/mindee/doctr/badge?s=bae07db86bb079ce9d6542315b8c6e70fa708a7e)](https://www.codefactor.io/repository/github/mindee/doctr) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/340a76749b634586a498e1c0ab998f08)](https://app.codacy.com/gh/mindee/doctr?utm_source=github.com&utm_medium=referral&utm_content=mindee/doctr&utm_campaign=Badge_Grade) [![Doc Status](https://github.com/mindee/doctr/workflows/doc-status/badge.svg)](https://mindee.github.io/doctr) [![Pypi](https://img.shields.io/badge/pypi-v0.4.0-blue.svg)](https://pypi.org/project/python-doctr/) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/mindee/doctr)
+[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) ![Build Status](https://github.com/mindee/doctr/workflows/builds/badge.svg) [![codecov](https://codecov.io/gh/mindee/doctr/branch/main/graph/badge.svg?token=577MO567NM)](https://codecov.io/gh/mindee/doctr) [![CodeFactor](https://www.codefactor.io/repository/github/mindee/doctr/badge?s=bae07db86bb079ce9d6542315b8c6e70fa708a7e)](https://www.codefactor.io/repository/github/mindee/doctr) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/340a76749b634586a498e1c0ab998f08)](https://app.codacy.com/gh/mindee/doctr?utm_source=github.com&utm_medium=referral&utm_content=mindee/doctr&utm_campaign=Badge_Grade) [![Doc Status](https://github.com/mindee/doctr/workflows/doc-status/badge.svg)](https://mindee.github.io/doctr) [![Pypi](https://img.shields.io/badge/pypi-v0.4.0-blue.svg)](https://pypi.org/project/python-doctr/) [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/mindee/doctr) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mindee/notebooks/blob/main/doctr/quicktour.ipynb)
 
 
 **Optical Character Recognition made seamless & accessible to anyone, powered by TensorFlow 2 & PyTorch**
@@ -225,12 +225,14 @@ Your API should now be running locally on your port 8002. Access your automatica
 ```python
 
 import requests
-import io
 with open('/path/to/your/doc.jpg', 'rb') as f:
     data = f.read()
-response = requests.post("http://localhost:8002/ocr", files={'file': io.BytesIO(data)}).json()
+response = requests.post("http://localhost:8002/ocr", files={'file': data}).json()
 ```
 
+### Example notebooks
+Looking for more illustrations of docTR features? You might want to check the [Jupyter notebooks](https://github.com/mindee/doctr/tree/main/notebooks) designed to give you a broader overview.
+
 
 ## Citation
 

diff --git a/api/README.md b/api/README.md
@@ -0,0 +1,92 @@
+# Template for your OCR API using docTR
+
+## Installation
+
+You will only need to install [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) and [Docker](https://docs.docker.com/get-docker/). The container environment will be self-sufficient and install the remaining dependencies on its own.
+
+## Usage
+
+### Starting your web server
+
+You will need to clone the repository first:
+```shell
+git clone https://github.com/mindee/doctr.git
+```
+then from the repo root folder, you can start your container:
+
+```shell
+PORT=8050 docker-compose up -d --build
+```
+Once completed, your [FastAPI](https://fastapi.tiangolo.com/) server should be running on port 8050 (feel free to change this in the previous command).
+
+### Documentation and swagger
+
+FastAPI comes with many advantages including speed and OpenAPI features. For instance, once your server is running, you can access the automatically built documentation and swagger in your browser at: http://localhost:8050/docs
+
+
+### Using the routes
+
+You will find detailed instructions in the live documentation when your server is up, but here are some examples to use your available API routes:
+
+#### Text detection
+
+Using the following image:
+<img src="https://user-images.githubusercontent.com/76527547/117319856-fc35bf00-ae8b-11eb-9b51-ca5aba673466.jpg" width="50%" height="50%">
+
+with this snippet:
+
+```python
+import requests
+with open('/path/to/your/img.jpg', 'rb') as f:
+    data = f.read()
+print(requests.post("http://localhost:8050/detection", files={'file': data}).json())
+```
+
+should yield
+```
+[{'box': [0.826171875, 0.185546875, 0.90234375, 0.201171875]},
+ {'box': [0.75390625, 0.185546875, 0.8173828125, 0.201171875]}]
+```
+
+
+#### Text recognition
+
+Using the following image:
+![recognition-sample](https://user-images.githubusercontent.com/76527547/117133599-c073fa00-ada4-11eb-831b-412de4d28341.jpeg)
+
+with this snippet:
+
+```python
+import requests
+with open('/path/to/your/img.jpg', 'rb') as f:
+    data = f.read()
+print(requests.post("http://localhost:8050/recognition", files={'file': data}).json())
+```
+
+should yield
+```
+{'value': 'invite'}
+```
+
+
+#### End-to-end OCR
+
+Using the following image:
+<img src="https://user-images.githubusercontent.com/76527547/117319856-fc35bf00-ae8b-11eb-9b51-ca5aba673466.jpg" width="50%" height="50%">
+
+with this snippet:
+
+```python
+import requests
+with open('/path/to/your/img.jpg', 'rb') as f:
+    data = f.read()
+print(requests.post("http://localhost:8050/ocr", files={'file': data}).json())
+```
+
+should yield
+```
+[{'box': [0.75390625, 0.185546875, 0.8173828125, 0.201171875],
+  'value': 'Hello'},
+ {'box': [0.826171875, 0.185546875, 0.90234375, 0.201171875],
+  'value': 'world!'}]
+```
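
Note on the `/ocr` payload documented in `api/README.md` above: the response is a list of `{'box', 'value'}` entries. The sketch below shows one way a client might consume it. It assumes each box is `[xmin, ymin, xmax, ymax]` in coordinates relative to the image size; the README does not state the ordering, so treat that as an assumption. The helper `to_pixel_boxes` and the 1024x1024 page size are hypothetical and not part of docTR.

```python
from typing import Dict, List, Tuple

# Sample payload copied from the "End-to-end OCR" example in api/README.md
response = [
    {'box': [0.75390625, 0.185546875, 0.8173828125, 0.201171875], 'value': 'Hello'},
    {'box': [0.826171875, 0.185546875, 0.90234375, 0.201171875], 'value': 'world!'},
]


def to_pixel_boxes(
    words: List[Dict], width: int, height: int
) -> List[Tuple[str, Tuple[int, int, int, int]]]:
    """Convert relative boxes to pixel boxes (hypothetical helper).

    Assumes the box layout is [xmin, ymin, xmax, ymax], relative to the image size.
    """
    converted = []
    for word in words:
        xmin, ymin, xmax, ymax = word['box']
        pixel_box = (
            round(xmin * width),
            round(ymin * height),
            round(xmax * width),
            round(ymax * height),
        )
        converted.append((word['value'], pixel_box))
    return converted


pixel_words = to_pixel_boxes(response, width=1024, height=1024)
text = " ".join(value for value, _ in pixel_words)
print(pixel_words)
print(text)
```

With a live server, `response` would instead come from the `requests.post(...).json()` call shown in the README snippet.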
diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -3,3 +3,5 @@ sphinx-rtd-theme==0.4.3
 sphinxemoji>=0.1.8
 sphinx-copybutton>=0.3.1
 docutils<0.18
+recommonmark>=0.7.1
+sphinx-markdown-tables>=0.0.15
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -45,6 +45,8 @@
     'sphinx.ext.autosectionlabel',
     'sphinxemoji.sphinxemoji',  # cf. https://sphinxemojicodes.readthedocs.io/en/stable/
     'sphinx_copybutton',
+    'recommonmark',
+    'sphinx_markdown_tables',
 ]
 
 napoleon_use_ivar = True
@@ -55,7 +57,7 @@
 # List of patterns, relative to source directory, that match files and
 # directories to ignore when looking for source files.
 # This pattern also affects html_static_path and html_extra_path.
-exclude_patterns = [u'_build', 'Thumbs.db', '.DS_Store']
+exclude_patterns = [u'_build', 'Thumbs.db', '.DS_Store', 'notebooks/*.rst']
 
 
 # The name of the Pygments (syntax highlighting) style to use.

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -31,6 +31,7 @@ Main Features
    :hidden:
 
    installing
+   notebooks
 
 
 Model zoo

diff --git a/docs/source/notebooks.md b/docs/source/notebooks.md
@@ -0,0 +1 @@
+../../notebooks/README.md
diff --git a/doctr/models/recognition/sar/pytorch.py b/doctr/models/recognition/sar/pytorch.py
@@ -98,7 +98,7 @@ def forward(
 
         # initialize states (each of shape (N, rnn_units))
         hx = [None, None]
-        # Initialize with the index of virtual START symbol (placed after <eos>)
+        # Initialize with the index of virtual START symbol (placed after <eos> so that the one-hot is only zeros)
         symbol = torch.zeros((features.shape[0], self.vocab_size + 1), device=features.device, dtype=features.dtype)
         logits_list = []
         for t in range(self.max_length + 1):  # keep 1 step for <eos>

diff --git a/doctr/models/recognition/sar/tensorflow.py b/doctr/models/recognition/sar/tensorflow.py
@@ -132,7 +132,7 @@ def call(
         # run first step of lstm
         # holistic: shape (N, rnn_units)
         _, states = self.lstm_decoder(holistic, states, **kwargs)
-        # Initialize with the index of virtual START symbol (placed after <eos>)
+        # Initialize with the index of virtual START symbol (placed after <eos> so that the one-hot is only zeros)
         symbol = tf.fill(features.shape[0], self.vocab_size + 1)
         logits_list = []
         if kwargs.get('training') and gt is None:
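
Note on the updated comment in both SAR decoders: placing the virtual START symbol after `<eos>` makes its one-hot encoding all zeros. The plain-Python sketch below illustrates why, assuming a one-hot depth of `vocab_size + 1` (the vocabulary plus `<eos>`, matching the width of `symbol` in the PyTorch version) and assuming `<eos>` sits at index `vocab_size`. The `one_hot` helper is hypothetical; it only mimics the out-of-range behaviour of `tf.one_hot`, which returns an all-zero row for indices outside `[0, depth)`.

```python
from typing import List


def one_hot(index: int, depth: int) -> List[float]:
    """One-hot encode `index`; out-of-range indices give an all-zero vector."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


vocab_size = 4                # toy vocabulary size, for illustration only
depth = vocab_size + 1        # vocabulary characters + <eos>
eos_index = vocab_size        # assumed position of <eos>
start_index = vocab_size + 1  # virtual START symbol, placed after <eos>

eos_vec = one_hot(eos_index, depth)
start_vec = one_hot(start_index, depth)

print(eos_vec)    # last slot is set
print(start_vec)  # no slot is set, since start_index >= depth
```

This is consistent with the PyTorch branch, which builds the initial `symbol` directly with `torch.zeros` instead of encoding an index.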

diff --git a/notebooks/README.md b/notebooks/README.md
@@ -0,0 +1,8 @@
+# docTR Notebooks
+
+Here are some notebooks compiled for users to better leverage the library capabilities:
+
+| Notebook     |      Description      |   |
+|:----------|:-------------|------:|
+| [Quicktour](https://github.com/mindee/notebooks/blob/main/doctr/quicktour.ipynb) | A presentation of the main features of docTR | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mindee/notebooks/blob/main/doctr/quicktour.ipynb) |
+
diff --git a/setup.py b/setup.py
@@ -74,6 +74,8 @@
     "sphinxemoji>=0.1.8",
     "sphinx-copybutton>=0.3.1",
     "docutils<0.18",
+    "recommonmark>=0.7.1",
+    "sphinx-markdown-tables>=0.0.15",
 ]
 
 deps = {b: a for a, b in (re.findall(r"^(([^!=<>]+)(?:[!=<>].*)?$)", x)[0] for x in _deps)}
@@ -142,6 +144,8 @@ def deps_list(*pkgs):
     "sphinxemoji",
     "sphinx-copybutton",
     "docutils",
+    "recommonmark",
+    "sphinx-markdown-tables",
 )
 
 extras["docs"] = extras["all"] + extras["docs_specific"]
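
Note on the `setup.py` change: the two new requirements are picked up through the `deps` mapping built by the regex in the first hunk, which maps each bare package name to its full requirement string. The sketch below replays that expression on a shortened `_deps` list. The body of `deps_list` is not shown in this diff, so the one-line implementation here is an assumption.

```python
import re

# Shortened copy of the requirement list touched by this diff
_deps = [
    "sphinx-copybutton>=0.3.1",
    "docutils<0.18",
    "recommonmark>=0.7.1",
    "sphinx-markdown-tables>=0.0.15",
]

# Same expression as in setup.py: group 1 is the full requirement string,
# group 2 is the package name (everything before the first !, =, < or >).
deps = {b: a for a, b in (re.findall(r"^(([^!=<>]+)(?:[!=<>].*)?$)", x)[0] for x in _deps)}


def deps_list(*pkgs):
    # Assumed body: look up the pinned requirement for each package name
    return [deps[pkg] for pkg in pkgs]


docs_specific = deps_list("recommonmark", "sphinx-markdown-tables")
print(docs_specific)
```

A name passed to `deps_list` that is missing from `_deps` would raise a `KeyError` under this assumed implementation, which is why both hunks need the new entries.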