Stirling-PDF OCR not working #368222

Open
David-Kopczynski opened this issue Dec 25, 2024 · 2 comments
Labels
0.kind: bug Something is broken

Comments

@David-Kopczynski
Contributor

Describe the bug

Stirling-PDF does not support OCR out of the box. Although checkboxes for various languages are displayed, running OCR fails with an internal server error. Looking into the configuration, Tessdata (the language data files used for OCR) seems to be expected in /usr/share/tessdata.
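A quick way to check that suspicion is to count the `*.traineddata` files in the directory the service seems to expect. The following is a minimal sketch, not part of Stirling-PDF or the NixOS module: `count_traineddata` is a hypothetical helper name, and the default path is the one mentioned above. It only shows the filesystem as seen from your own shell, which may differ from what the service sees.

```shell
#!/bin/sh
# Hypothetical helper: count Tesseract language files (*.traineddata)
# directly inside the given directory. Returns non-zero if the directory
# is missing or contains no such files.
count_traineddata() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 1
  fi
  n=0
  for f in "$dir"/*.traineddata; do
    # If the glob matches nothing it stays literal, so test for existence.
    [ -e "$f" ] && n=$((n + 1))
  done
  echo "$n traineddata file(s) in $dir"
  [ "$n" -gt 0 ]
}

# Default to the path named in this report; pass another directory as $1.
count_traineddata "${1:-/usr/share/tessdata}" || echo "no usable OCR language data found"
```

If the directory is missing or empty, that would be consistent with the internal server error described here, though it does not by itself prove the cause.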

Steps To Reproduce

Steps to reproduce the behavior:

  1. Install Stirling-PDF with services.stirling-pdf.enable = true;
  2. Go to http://localhost:8080/ocr-pdf
  3. Check any box
  4. Press the button Process PDF with OCR

Expected behavior

Installations using Docker worked without any problems once Tessdata had been installed. However, that is a manual process. I would expect this package to bundle the data by default or to expose an additional flag to enable it.

Screenshots

[Screenshot: Error]

Metadata

~ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.6.67, NixOS, 24.11 (Vicuna), 24.11.711815.1807c2b91223`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.24.10`
 - channels(root): `"home-manager-24.11.tar.gz, nixos-24.11, nixos-hardware, nixos-unstable"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Notify maintainers

@TomaSajt


Note for maintainers: Please tag this issue in your PR.


Add a 👍 reaction to issues you find important.

@David-Kopczynski David-Kopczynski added the 0.kind: bug Something is broken label Dec 25, 2024
@TomaSajt
Contributor

CC: @DCsunset (maintainer of the nixos module)

@DCsunset
Member

DCsunset commented Jan 10, 2025

@David-Kopczynski Sorry for the late reply. May I ask which version of stirling-pdf you are using?

When I deployed the service, the page showed: Error: 403 Forbidden This endpoint is disabled for path: /ocr-pdf.

3 participants