Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenCL error codes, then junk output -- possibly a build issue? #124

Closed
axfelix opened this issue Oct 29, 2015 · 13 comments
Closed

OpenCL error codes, then junk output -- possibly a build issue? #124

axfelix opened this issue Oct 29, 2015 · 13 comments

Comments

@axfelix
Copy link

axfelix commented Oct 29, 2015

Hi folks,

Having trouble getting a working OpenCL build. First run in a new working directory gets upset that there's no opencl_profile_devices.dat:

axfelix@shoebox:~/Desktop$ tesseract test.png out pdf
Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.

[DS] Device: "GeForce GTX TITAN X" (OpenCL) evaluation...
[OD] write binary[kernel-GeForce_GTX_TITAN_X.bin] succesfully
[DS] Device: "GeForce GTX TITAN X" (OpenCL) evaluated
[DS] composeRGBPixel: 0.022978 (w=1.2)
[DS] HistogramRect: 0.012700 (w=2.4)
[DS] ThresholdRectToPix: 0.010093 (w=4.5)
[DS] getLineMasksMorph: 0.004402 (w=5.0)
[DS] Score: 0.125484

[DS] Device: "(null)" (Native) evaluation...
[DS] Device: "(null)" (Native) evaluated
[DS] composeRGBPixel: 0.018301 (w=1.2)
[DS] HistogramRect: 0.060059 (w=2.4)
[DS] ThresholdRectToPix: 0.016802 (w=4.5)
[DS] getLineMasksMorph: 0.110437 (w=5.0)
[DS] Score: 0.793896
[DS] Scores written to file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX TITAN X score is 0.125484
[DS] Device[2] 0:(null) score is 0.793896
[DS] Selected Device[1]: "GeForce GTX TITAN X" (OpenCL)
OpenCL error code is -38 at when clSetKernelArg imageBuffer .
OpenCL error code is -38 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -38 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -38 at when clSetKernelArg histogramBuffer .
OpenCL error code is -52 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannels .
OpenCL error code is -52 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -38 at when clSetKernelArg imageBuffer .
OpenCL error code is -38 at when clSetKernelArg thresholdsBuffer .
OpenCL error code is -38 at when clSetKernelArg hiValuesBuffer .
OpenCL error code is -38 at when clSetKernelArg pixThBuffer .
OpenCL error code is -52 at when clEnqueueNDRangeKernel kernel_ThresholdRectToPix .
Setting return value to -1
OpenCL error code is -38 at when clSetKernelArg imageBuffer .
OpenCL error code is -38 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -38 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -38 at when clSetKernelArg histogramBuffer .
OpenCL error code is -52 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannels .
OpenCL error code is -52 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannelsReduction .

After that, it works (while throwing issues like Error in boxClipToRectangle: box outside rectangle
Error in pixScanForForeground: invalid box) but the output is basically junk. Did I screw something up when compiling?

@markboettcher
Copy link

I'm seeing something very similar. First time I run Tesseract after deleting tesseract_opencl_profile_devices.dat, I get the following:

_START_**
-bash-4.2$ tesseract ESub.png ESubPng
Tesseract Open Source OCR Engine v3.04.00 with Leptonica
[DS] Profile file not available (tesseract_opencl_profile_devices.dat); performing profiling.

[DS] Device: "Tesla K40c" (OpenCL) evaluation...
OpenCL error code is -44 at when clCreateKernel .
OpenCL error code is -48 at when clSetKernelArg .
OpenCL error code is -48 at when clSetKernelArg .
OpenCL error code is -48 at when clSetKernelArg .
OpenCL error code is -48 at when clSetKernelArg .
OpenCL error code is -48 at when clSetKernelArg .
OpenCL error code is -48 at when clEnqueueNDRangeKernel .
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectAllChannels .
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg numPixels .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg histogramBuffer .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannels .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -44 at when clCreateKernel kernel_ThresholdRectToPix .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg height .
OpenCL error code is -48 at when clSetKernelArg width .
OpenCL error code is -48 at when clSetKernelArg wpl .
OpenCL error code is -48 at when clSetKernelArg thresholdsBuffer .
OpenCL error code is -48 at when clSetKernelArg hiValuesBuffer .
OpenCL error code is -48 at when clSetKernelArg pixThBuffer .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_ThresholdRectToPix .
[DS] Device: "Tesla K40c" (OpenCL) evaluated
[DS] composeRGBPixel: 0.022510 (w=1.2)
[DS] HistogramRect: 0.006130 (w=2.4)
[DS] ThresholdRectToPix: 0.005528 (w=4.5)
[DS] getLineMasksMorph: 0.007509 (w=5.0)
[DS] Score: 0.104147

[DS] Device: "(null)" (Native) evaluation...
[DS] Device: "(null)" (Native) evaluated
[DS] composeRGBPixel: 0.029208 (w=1.2)
[DS] HistogramRect: 0.108206 (w=2.4)
[DS] ThresholdRectToPix: 0.032306 (w=4.5)
[DS] getLineMasksMorph: 0.200906 (w=5.0)
[DS] Score: 1.444651
[DS] Scores written to file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:Tesla K40c score is 0.104147
[DS] Device[2] 0:(null) score is 1.444651
[DS] Selected Device[1]: "Tesla K40c" (OpenCL)
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectOneChannel .
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectOneChannelReduction .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg numPixels .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg histogramBuffer .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannels .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -44 at when clCreateKernel kernel_ThresholdRectToPix_OneChan .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg height .
OpenCL error code is -48 at when clSetKernelArg width .
OpenCL error code is -48 at when clSetKernelArg wpl .
OpenCL error code is -48 at when clSetKernelArg thresholdsBuffer .
OpenCL error code is -48 at when clSetKernelArg hiValuesBuffer .
OpenCL error code is -48 at when clSetKernelArg pixThBuffer .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_ThresholdRectToPix .
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectOneChannel .
OpenCL error code is -44 at when clCreateKernel kernel_HistogramRectOneChannelReduction .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg numPixels .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg imageBuffer .
OpenCL error code is -48 at when clSetKernelArg tmpHistogramBuffer .
OpenCL error code is -48 at when clSetKernelArg histogramBuffer .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannels .
OpenCL error code is -48 at when clEnqueueNDRangeKernel kernel_HistogramRectAllChannelsReduction .
-bash-4.2$

_STOP_**********

Running after that I get:

_START_**********
-bash-4.2$ tesseract ESub.png ESubPng
Tesseract Open Source OCR Engine v3.04.00 with Leptonica
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:Tesla K40c score is 0.104147
[DS] Device[2] 0:(null) score is 1.444651
[DS] Selected Device[1]: "Tesla K40c" (OpenCL)
-bash-4.2$

_STOP_***********

This creates the txt file, but it's just junk, maybe a few random characters. For a long time I was also seeing the 'Error in boxClipToRectangle: box outside rectangle' sort of errors that axfelix was seeing.

If I run it with the CPU device like this:

TESSERACT_OPENCL_DEVICE=2 tesseract ESub.png ESubPng

...it works perfectly. Great OCR output.

I have done everything I can imagine, including rebuilding Tesseract several times, using jpg, png and tif input files and rebuilding Leptonica. The nvidia drivers are the latest and OpenCL is the latest.

Any ideas?

@markboettcher
Copy link

Still working on the problem. Here's some information that will hopefully make it easier for someone to solve before I spend another day tracking it down.

I have two test cases using the same source file. Doesn't matter if the source is png or tif, same bad results. Tesseract correctly identifies that I have a Tesla K40c as device 1. The CPU is device 2. The call I'm making is either:
TESSERACT_OPENCL_DEVICE=1 tesseract ESub.png Esub -l enm //GPU
or
TESSERACT_OPENCL_DEVICE=2 tesseract ESub.png Esub -l enm //CPU

In the file baseapi.cpp in function TessBaseAPI::ProcessPage() I put in code to save the image before and after the call to GetThresholdedImage() like so:

pixWrite("tessinputBEFORE.tif", pix, IFF_TIFF_G4);
Pix* page_pix = GetThresholdedImage();
pixWrite("tessinputAFTER.tif", page_pix, IFF_TIFF_G4);

In the CPU version both before and after images look correct, showing the same text and formatting.

In the GPU OpenCL version 'before' is fine, but the 'after' version is seriously sheared when the source image has non-binary dimensions. It looks like the pitch is being miscalculated. The source image was 1721 x 1238 pixels. In this case the OCR failed completely of course.

On an educated hunch I resized the the same image to 2048 x 1024. Now the shearing is gone, The characters in the image are somewhat recognizable. There is still a problem, the letters are all munged, almost like letters are being overlaid on top of each other or something. I will send the two bad output images if you tell me where.

@markboettcher
Copy link

Here you go...

Original source image 1721 x 1238:

esub

Result when source image resized to 2048 x 1024 results:

tessinputafter

@markboettcher
Copy link

Sheared OpenCL output with odd-sized source image.

skewedopenclodddims

@jbreiden
Copy link
Contributor

I swear I've seen this particular image corruption before, when I was working on the pdf output module. I wish I could remember what it was.

@axfelix
Copy link
Author

axfelix commented Jan 8, 2016

Anything else that can be done to help debug? I have a vested interest in getting a working OpenCL build before the end of January...

@jbreiden
Copy link
Contributor

jbreiden commented Jan 9, 2016

I speculate we had a Leptonica bug, fixed it, but the fix never got propagated to the OpenCL-ified mini-fork of Leptonica that sits inside Tesseract. I think the way to go is look inside the thresholding call and keep comparing the CPU vs GPU to help isolate where things go different. It's also helpful to look at all data in the pix struct before/after/cpu/gpu to understand better what went wrong.

struct Pix
{
    l_uint32             w;           /* width in pixels                   */
    l_uint32             h;           /* height in pixels                  */
    l_uint32             d;           /* depth in bits (bpp)               */
    l_uint32             spp;         /* number of samples per pixel       */
    l_uint32             wpl;         /* 32-bit words/line                 */
    l_uint32             refcount;    /* reference count (1 if no clones)  */
    l_int32              xres;        /* image res (ppi) in x direction    */
                                      /* (use 0 if unknown)                */
    l_int32              yres;        /* image res (ppi) in y direction    */
                                      /* (use 0 if unknown)                */
    l_int32              informat;    /* input file format, IFF_*          */
    l_int32              special;     /* special instructions for I/O, etc */
    char                *text;        /* text string associated with pix   */
    struct PixColormap  *colormap;    /* colormap (may be null)            */
    l_uint32            *data;        /* the image data                    */
};

I don't know/remember what this code path is trying to do. Seems odd that
we'd be writing an image to a file unless this was a pure debug path. What happens
if you comment out the thresholding call? (go ahead and comment out lines 1227 to 1232 inclusive)

pixWrite("tessinput.tif", page_pix, IFF_TIFF_G4);

@jbreiden
Copy link
Contributor

jbreiden commented Jan 9, 2016

Yeah, that sure looks like a debug path. New theory is OpenCL turns on some some debugging, which activates faulty debugging code that messes everything up.

@jbreiden
Copy link
Contributor

jbreiden commented Jan 9, 2016

Separate experiment, swap:

pixWrite("tessinputAFTER.tif", page_pix, IFF_TIFF_G4);

with

pixWrite("tessinputAFTER.png", page_pix, IFF_PNG);

just to make sure we're not having trouble with writing out your debug image.
I can imagine weird problems if the depth is not 1 bit per pixel and we try
to write out g4 tiff.

@jbreiden
Copy link
Contributor

jbreiden commented Jan 9, 2016

And finally, find the place where the image is first read into Tesseeract and turned into a Pix. See if we already have corruption at that point.

@jbreiden
Copy link
Contributor

I can't debug this myself. I still get a segfault on my computer when trying to use OpenCL as per issue #53 . Will be travelling for the rest of the month. Good luck and sorry.

@amitdo amitdo added the bug label May 27, 2016
@zdenop zdenop added the OpenCL label Sep 1, 2016
@zdenop
Copy link
Contributor

zdenop commented Oct 27, 2016

Please check master branch or commits f46dfdc and 5db7602.

@stweil
Copy link
Member

stweil commented Nov 12, 2016

PR #475 fixes an error with OpenCL which results in these error message:

Tesseract Open Source OCR Engine v3.05.00dev with Leptonica
OpenCL error code is -44 at   when clCreateKernel kernel_HistogramRectAllChannels .
OpenCL error code is -44 at   when clCreateKernel kernel_HistogramRectAllChannelsReduction .
OpenCL error code is -48 at   when clSetKernelArg imageBuffer .

Maybe it is related to this issue, too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants