embed pdf.ttf to tesseract library #2551 #3194

Merged
merged 2 commits into from
Dec 25, 2020

zdenop
Contributor

@zdenop zdenop commented Dec 25, 2020

No description provided.

@stweil
Member

stweil commented Dec 25, 2020

We could also create pdf_ttf.h during the build process instead of adding it to the sources.

@zdenop
Contributor Author

zdenop commented Dec 25, 2020

This would require an additional multi-platform tool just for the build. I think it is easier to provide the generated file (without removing pdf.ttf).

@stweil
Member

stweil commented Dec 25, 2020

Yes, pdf.ttf should not be removed. The header could be generated during the build using Perl, Python3 or sed. All of them are available for the relevant platforms, and we already use Python3 for training. Do you think it would hurt to use it for the build, too?
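
For illustration only: the generation step discussed here amounts to turning the bytes of pdf.ttf into a C++ array. The sketch below shows that conversion in C++, which would avoid an interpreter dependency at the cost of building a small helper tool first. It is an assumption, not code from this PR: `BytesToHeader` and the array name passed to it are hypothetical, and the emitted layout need not match the committed pdf_ttf.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of tesseract): renders raw bytes as the text
// of a header that defines an unsigned char array called `name`.
std::string BytesToHeader(const std::vector<unsigned char>& data,
                          const std::string& name) {
  assert(!name.empty());  // the array needs an identifier
  std::string out = "static const unsigned char " + name + "[] = {";
  char buf[8];
  for (std::size_t i = 0; i < data.size(); ++i) {
    // Start a new line every 12 bytes to keep the header readable.
    out += (i % 12 == 0) ? "\n  " : " ";
    std::snprintf(buf, sizeof(buf), "0x%02x,", static_cast<unsigned>(data[i]));
    out += buf;
  }
  out += "\n};\n";
  return out;
}
```

A small driver would read pdf.ttf in binary mode into the vector and write the returned string to the header file; whether that runs at build time or once by hand is exactly the trade-off debated in this thread.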

@zdenop
Contributor Author

zdenop commented Dec 25, 2020

Personally, I do not see any benefit in this (maybe I am missing something): instead of a final, tested header file, you propose including a script (that we do not have)? I prefer being able to build tesseract with minimal extra dependencies. Installing Python (or Perl?) just to generate one small file is IMO overkill.

I prefer to adjust the "final product" instead of tweaking a script. For example, my first commit worked for me on Windows (clang, msvs) and Linux (openSUSE), but AppVeyor and Travis were not happy with it...

@amitdo
Collaborator

amitdo commented Dec 25, 2020

Both ways have their advantages and disadvantages.

This specific file does not change very often, so I think it's fine to go with the manually generated header.

@amitdo amitdo merged commit e1a3479 into tesseract-ocr:master Dec 25, 2020
@amitdo
Collaborator

amitdo commented Dec 25, 2020

Thanks! :-)

@amitdo amitdo added the PDF label Dec 25, 2020
@@ -623,24 +625,21 @@ bool TessPDFRenderer::BeginDocumentHandler() {

stream.str("");
stream << datadir_.c_str() << "/pdf.ttf";
Member

Wouldn't it be cleaner to remove the code for reading pdf.ttf and remove that file from installations, too?

Contributor Author

I left it there in case somebody would like to adapt and test pdf.ttf.

Member

Would it be sufficient to enable that feature only for DEBUG code? For most users (and that includes all Linux distributions) installing pdf.ttf is not necessary, and I'd prefer to remove it from the normal installation.

Contributor Author

I think using it in DEBUG code only is a good idea.
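
A minimal sketch of what a debug-only override could look like, under stated assumptions: `LoadPdfFont` and `kEmbeddedFont` are hypothetical names, the four placeholder bytes stand in for the real embedded font data, and "DEBUG code" is mapped here to builds where `NDEBUG` is not defined, which the thread does not specify.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Placeholder bytes standing in for the embedded font; the real data would
// come from the generated header (assumed, not copied from this PR).
static const unsigned char kEmbeddedFont[] = {0x00, 0x01, 0x00, 0x00};

// Hypothetical helper: returns the font bytes to write into the PDF.
// In debug builds (NDEBUG not defined) a readable pdf.ttf in datadir
// overrides the embedded copy; release builds never open the file.
std::vector<unsigned char> LoadPdfFont(const std::string& datadir) {
#ifndef NDEBUG
  std::ifstream file(datadir + "/pdf.ttf", std::ios::binary);
  if (file) {
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(file),
                                      std::istreambuf_iterator<char>());
  }
#else
  (void)datadir;  // unused in release builds
#endif
  return std::vector<unsigned char>(std::begin(kEmbeddedFont),
                                    std::end(kEmbeddedFont));
}
```

With this shape, a normal installation would not need to ship pdf.ttf at all, while a developer testing a modified font could still drop one into the data directory of a debug build.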

@zdenop zdenop deleted the i2551 branch December 27, 2020 13:22