Correct recognized OCR data missing in search index #473

Open
@playbackandrewind

Description

I have some PDF files where OCR of the embedded graphics works perfectly. The recognized text is displayed correctly in the OCR tab of the search results, but I cannot find this text or its contents through the search index itself.

Does anyone have an idea why the OCR text does not appear in the search index?

The extracted text tab contains only very poorly recognized text, e.g.
"tems Ltg Am Rohiance 3 5S300 WetterCar"
"Invoice 12345 6 AV"

In the OCR tab the text is correctly recognized:
"Car Systems Ltd Am Rohlande 3 58300 Wetter"
"Invoice 123456 /W"

A search for "123456", for example, returns no results. I'm a bit at a loss right now.
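
To illustrate the symptom, here is a minimal sketch of what I suspect is happening. It assumes the index is built from the extracted text tab rather than the OCR tab, and that text is tokenized on non-alphanumeric characters. Both are assumptions on my part; the `tokens` helper is hypothetical and is not the application's actual analyzer.

```python
import re

def tokens(text):
    # Hypothetical tokenizer: lowercase, then split on anything
    # that is not a letter or digit. The real analyzer may differ.
    return set(re.findall(r"[a-z0-9]+", text.lower()))

extracted = "Invoice 12345 6 AV"  # line from the extracted text tab
ocr = "Invoice 123456 /W"         # same line from the OCR tab

# The invoice number is split into "12345" and "6" in the extracted text,
# so an exact-term lookup for "123456" can only match the OCR text.
print("123456" in tokens(extracted))
print("123456" in tokens(ocr))
```

If that assumption holds, it would explain why only the OCR tab shows the correct text while the search itself finds nothing.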
