Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: refactoring rotated boxes #731

Merged
merged 36 commits into from
Dec 26, 2021
Merged

refactor: refactoring rotated boxes #731

merged 36 commits into from
Dec 26, 2021

Conversation

charlesmindee
Copy link
Collaborator

@charlesmindee charlesmindee commented Dec 20, 2021

Currently a draft, not to be reviewed (at least not before #723 is merged)

This PR introduces a new format for rotated boxes:
(x, h, w, h, a) --> [[x1 y1]
[x2 y2]
[x3 y3]
[x4 y4]]

It contains much more information, and allows us to simplify many functions in doctr.

The convention is the following: once the reading direction of the crop is determined, the 4 points are ordered this way in the polygon array: top left, top right, bot right, bot left.

Any feedback is welcome!

@charlesmindee charlesmindee self-assigned this Dec 20, 2021
@charlesmindee charlesmindee added type: breaking change Introducing a breaking change type: enhancement Improvement framework: pytorch Related to PyTorch backend framework: tensorflow Related to TensorFlow backend module: datasets Related to doctr.datasets module: io Related to doctr.io module: models Related to doctr.models module: utils Related to doctr.utils topic: text detection Related to the task of text detection labels Dec 20, 2021
@charlesmindee charlesmindee added this to the 0.5.0 milestone Dec 20, 2021
@fg-mindee fg-mindee changed the title feat: refactoring rotated boxes refactor: refactoring rotated boxes Dec 20, 2021
Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I added a few comments

doctr/io/elements.py Outdated Show resolved Hide resolved
doctr/io/elements.py Outdated Show resolved Hide resolved
doctr/models/_utils.py Outdated Show resolved Hide resolved
doctr/models/_utils.py Outdated Show resolved Hide resolved
doctr/models/_utils.py Outdated Show resolved Hide resolved
doctr/models/_utils.py Outdated Show resolved Hide resolved
doctr/models/_utils.py Outdated Show resolved Hide resolved
doctr/models/builder.py Outdated Show resolved Hide resolved
doctr/models/detection/differentiable_binarization/base.py Outdated Show resolved Hide resolved
Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Minor things to fix and we're good for merge!

doctr/models/predictor/base.py Outdated Show resolved Hide resolved
doctr/models/predictor/base.py Outdated Show resolved Hide resolved
@@ -52,6 +53,7 @@ def __init__(
self.doc_builder = DocumentBuilder(export_as_straight_boxes=export_as_straight_boxes)
self.assume_straight_pages = assume_straight_pages
self.straighten_pages = straighten_pages
self.crop_orientation_predictor = crop_orientation_predictor(pretrained=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still missing the .eval(), let's make sure we have all models in eval mode in a predictor

tests/tensorflow/test_models_detection_tf.py Outdated Show resolved Hide resolved
tests/tensorflow/test_transforms_tf.py Outdated Show resolved Hide resolved
@charlesmindee
Copy link
Collaborator Author

the .eval() is already applied to the model in the predictor when you instantiate it. The predictor is a predictor object and not a torch model

Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Evaluation CI jobs seem to fail though :/

Copy link
Contributor

@fg-mindee fg-mindee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much!!

@charlesmindee charlesmindee merged commit e8583f3 into main Dec 26, 2021
@charlesmindee charlesmindee deleted the refacto_polys branch December 26, 2021 15:02
@Rob192
Copy link
Contributor

Rob192 commented Jan 3, 2022

Hello @charlesmindee
You need to keep the parameter target_shape for the function rotate_boxes. And thus, you should not remove the function remap_boxes of utils.geometry. Otherwise the function to straighten pages will not work anymore.

@fg-mindee
Copy link
Contributor

fg-mindee commented Jan 10, 2022

@Rob192 could you elaborate a bit please?
We should extend the corresponding unittests, because none is failing here, so if this PR broke something, I'd argue this should make a unittest fail to spot this more easily!

And if there is still a problem on main, we should definitely fix it for the 0.5.1 patch release 😅 (in that case, please open an issue so that we can track this correctly)

@Rob192
Copy link
Contributor

Rob192 commented Jan 11, 2022

This PR deletes function remap_boxes and the associated tests test_remap_boxes. This is why none of the unittests is failling.

However I agree that we should add more of a functional test that verifies the output of the OCRPredictor, including text box locations. That would ease the release process.

I raised the issue #800

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
framework: pytorch Related to PyTorch backend framework: tensorflow Related to TensorFlow backend module: datasets Related to doctr.datasets module: io Related to doctr.io module: models Related to doctr.models module: utils Related to doctr.utils topic: text detection Related to the task of text detection type: breaking change Introducing a breaking change type: enhancement Improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants