Only generate dir for HOCR when needed - fixes #208 #223

Closed
tfmorris wants to merge 2 commits from the hocr-dir-compression branch

Conversation

tfmorris
Contributor

Takes advantage of dir attribute inheritance and the dir="ltr" default to:

  • only generate paragraph dir attributes which are not ltr
  • only generate word dir attributes which don't match enclosing paragraph

Tested against LTR, RTL, and mixed-direction files. The test files for the latter two cases are in a separate PR.
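
As a rough illustration of the rule described above, here is a minimal Python sketch. It is not the actual Tesseract C++ change: the function names are hypothetical, and the markup is simplified (real hOCR elements also carry id and title attributes). Only the para_is_ltr name is borrowed from the review discussion.

```python
from html import escape


def paragraph_dir_attr(para_is_ltr: bool) -> str:
    """Emit a dir attribute for a paragraph only when it is not the ltr default."""
    return "" if para_is_ltr else ' dir="rtl"'


def word_dir_attr(word_is_ltr: bool, para_is_ltr: bool) -> str:
    """Emit a dir attribute for a word only when it differs from its paragraph."""
    if word_is_ltr == para_is_ltr:
        return ""  # direction is inherited from the enclosing paragraph
    return ' dir="ltr"' if word_is_ltr else ' dir="rtl"'


def render_paragraph(para_is_ltr: bool, words: list[tuple[str, bool]]) -> str:
    """Render a simplified hOCR paragraph from (text, word_is_ltr) pairs."""
    spans = "".join(
        f"<span class='ocrx_word'{word_dir_attr(is_ltr, para_is_ltr)}>{escape(text)}</span>"
        for text, is_ltr in words
    )
    return f"<p class='ocr_par'{paragraph_dir_attr(para_is_ltr)}>{spans}</p>"
```

With this scheme a purely LTR document gets no dir attributes at all, an RTL paragraph gets a single dir="rtl" on the paragraph, and only words that run against their paragraph's direction are marked individually.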

@amitdo
Collaborator

amitdo commented Feb 14, 2016

Tom,
I suggest renaming ltr to para_is_ltr to make the code clearer.

@tfmorris tfmorris force-pushed the hocr-dir-compression branch from 8235caa to 4c90595 Compare February 15, 2016 18:43
@tfmorris
Contributor Author

OK. I was hoping the scheme could be extended to pages or careas later, but I've made the suggested change and updated the branch.

@tfmorris tfmorris force-pushed the hocr-dir-compression branch from 4c90595 to d1c40e8 Compare February 16, 2016 23:37
Takes advantage of inheritance and dir="ltr" default to:
 - only generate paragraph dirs which are not ltr
 - only generate word dirs which don't match enclosing paragraph

Tested against LTR, RTL, and mixed direction files. Files for the
latter two cases are in a separate commit on the ltr-test-files branch.
@tfmorris tfmorris force-pushed the hocr-dir-compression branch from 6b23dfb to 893a9aa Compare February 17, 2016 15:21
@tfmorris
Contributor Author

Rebased against current head and added fix for Microsoft build breakage introduced by #226. Apply this before #224.

@tfmorris tfmorris closed this Feb 17, 2016
@tfmorris
Contributor Author

Closing in favor of a merged PR which incorporates both #223 and #224.
