Tesseract hangs when you run 2 or more instances at the same time. #4280

Closed
anesuc opened this issue Jul 9, 2024 · 6 comments

anesuc commented Jul 9, 2024

Current Behavior

On Ubuntu 22.04.6 LTS, which ships Tesseract 4.1.1, my code that runs multiple instances of Tesseract on the same machine worked. When I tried it on a newer release (Ubuntu 24.04 with Tesseract 5.3.4), this problem started occurring. To test this, I opened two bash terminals and ran the command manually in each, one right after the other (at around the same time), and it does indeed hang. Only after I press Ctrl+C in one of them does the other finish.
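
The two-terminal test described above can be approximated with a short script. This is only a minimal sketch, not the reporter's actual code: the helper name `time_concurrent` is hypothetical, and it simply starts every command at once and measures how long it takes for all of them to exit.

```python
import subprocess
import time


def time_concurrent(commands):
    """Start all commands at the same time and wait for every one.

    Returns (elapsed_seconds, exit_codes), with exit codes in input order.
    """
    start = time.perf_counter()
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    exit_codes = [proc.wait() for proc in procs]
    return time.perf_counter() - start, exit_codes
```

Assuming `tesseract` is on `PATH` and `page.png` is a placeholder for a real input image, comparing `time_concurrent([cmd])` with `time_concurrent([cmd, cmd])` for `cmd = ["tesseract", "page.png", "stdout"]` should show whether two simultaneous instances take disproportionately longer than one.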

Expected Behavior

I should be able to run multiple instances of Tesseract at the same time, at least 2.

Suggested Fix

Seems to be a new bug that got introduced at some point?

tesseract -v

New version with the issue:
tesseract 5.3.4
leptonica-1.82.0
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 2.1.5) : libpng 1.6.43 : libtiff 4.5.1 : zlib 1.3 : libwebp 1.3.2 : libopenjp2 2.5.0
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found OpenMP 201511
Found libarchive 3.7.2 zlib/1.3 liblzma/5.4.5 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.5
Found libcurl/8.5.0 OpenSSL/3.0.13 zlib/1.3 brotli/1.1.0 zstd/1.5.5 libidn2/2.3.7 libpsl/0.21.2 (+libidn2/2.3.7) libssh/0.10.6/openssl/zlib nghttp2/1.59.0 librtmp/2.3 OpenLDAP/2.6.7

Old version last known working:
tesseract 4.1.1
leptonica-1.79.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4

Operating System

Ubuntu 24.04 Noble

Other Operating System

No response

uname -a

Linux ip-172-31-13-132 6.8.0-1009-aws #9-Ubuntu SMP Fri May 17 14:39:23 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

No response

Contributor

zdenop commented Jul 9, 2024

Please provide a minimal example for reproducing the problem.

Member

stweil commented Jul 9, 2024

Do both instances hang "forever", or do they terminate after a longer time? And do they use CPU time while they are hanging?
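
One way to answer the CPU-time question is to compare wall-clock time with the CPU time a process consumed. The following is a rough sketch for Linux using only the Python standard library; the helper name `wall_and_cpu_seconds` is hypothetical. It relies on `resource.getrusage(resource.RUSAGE_CHILDREN)`, which accumulates the user and system time of child processes that have been waited for.

```python
import resource
import subprocess
import time


def wall_and_cpu_seconds(cmd):
    """Run cmd to completion and return (wall_seconds, cpu_seconds).

    cpu_seconds is the user + system CPU time added to this process's
    "children" accounting while cmd ran, so call this sequentially.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu
```

If the CPU time is close to zero while the wall time is long, the process was mostly blocked. If the CPU time is as large as or larger than the wall time, its threads were busy (or competing for cores) the whole time.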

Member

stweil commented Jul 9, 2024

I run up to 64 instances of the latest Tesseract at the same time without any problem, but I have Debian bookworm and compiled Tesseract myself.

Author

anesuc commented Jul 9, 2024

> I run up to 64 instances of the latest Tesseract at the same time without any problem, but I have Debian bookworm and compiled Tesseract myself.

I ran it and waited, and they did eventually finish after 4 minutes (when running 2 of them; I have waited longer before, but I was running more than 2). If you run only one, it finishes in about 4 seconds. This is on a "t2.xlarge" AWS instance, which has 4 vCPUs. The previous machine I had used an AMD CPU with 16 cores, so maybe that is why it was fine.

So it seems this does not scale very well with multiple instances? Or does AWS just have weak CPUs?

Member

stweil commented Jul 9, 2024

Try `export OMP_THREAD_LIMIT=1` before starting both processes. This will disable the default multithreading and should allow up to four instances on your AWS instance. Two Tesseract processes with multithreading are definitely too much for a host which can only run four threads.
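
When the instances are launched from code rather than from a shell, the same setting can be passed through the child environment. The sketch below is one possible approach, not an official Tesseract recipe: the helper names `single_thread_env` and `run_all` are hypothetical, and capping the number of simultaneous processes at `os.cpu_count()` is an assumption based on the "up to four instances" remark above.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def single_thread_env(base=None):
    """Return a copy of the environment with OMP_THREAD_LIMIT set to 1."""
    env = dict(os.environ if base is None else base)
    env["OMP_THREAD_LIMIT"] = "1"
    return env


def run_all(commands, max_workers=None):
    """Run commands with at most max_workers processes at a time.

    max_workers defaults to the number of CPUs reported by the OS.
    Every child gets OMP_THREAD_LIMIT=1. Returns exit codes in input order.
    """
    workers = max_workers or os.cpu_count() or 1
    env = single_thread_env()

    def run(cmd):
        result = subprocess.run(
            cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        return result.returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))
```

Each command here would be a list such as `["tesseract", "page1.png", "out1"]`, where the file names are placeholders. With one OpenMP thread per process, a 4-vCPU host runs at most four single-threaded workers instead of several multithreaded ones competing for the same cores.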

Author

anesuc commented Jul 9, 2024

> Try `export OMP_THREAD_LIMIT=1` before starting both processes. This will disable the default multithreading and should allow up to four instances on your AWS instance. Two Tesseract processes with multithreading are definitely too much for a host which can only run four threads.

Alright, this seems to have fixed the issue! Thanks for helping resolve this!

anesuc closed this as completed Jul 9, 2024
stweil added the question label Jul 9, 2024