Index image files with OCR

I want to OCR image files and index them in Elasticsearch, and I want to be able to highlight the sections of each image that match a search. Is it possible to do this with the ingest attachment plugin? I know that Apache Tika is used, but I wasn't sure whether OCR of images is supported.


I don't know if Tika can do that, but you'd need to use something like it.

OCR works in Tika when Tesseract is available, but I don't think ingest-attachment works with Tesseract.

You can look at the FSCrawler project, which is supposed to work with Tesseract, although I know there is an open issue about this.
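
On the highlighting part of the question: plain text extraction gives you the words but not where they sit on the image, so you would probably have to keep the word positions yourself. Below is a rough sketch of one way to do that, assuming you run the Tesseract command-line tool directly and parse its TSV output (which has `left`, `top`, `width`, `height`, `conf` and `text` columns). The field names `filename`, `content` and `words`, the helper names, and the sample data are all made up for illustration; this is not what ingest-attachment or FSCrawler does.

```python
import csv
import io
import subprocess

# Made-up sample in the shape of Tesseract's TSV output (illustration only).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t24\t96.5\tHello\n"
    "5\t1\t1\t1\t1\t2\t104\t92\t70\t24\t95.1\tworld\n"
)


def run_tesseract_tsv(image_path):
    """Run the Tesseract CLI and return its TSV output.

    Assumes the `tesseract` binary is installed and on PATH.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "tsv"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def tsv_to_document(tsv_text, filename):
    """Turn Tesseract TSV into a dict with the full text and per-word boxes."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            # Page, block, paragraph and line rows carry no text; skip them.
            continue
        words.append(
            {
                "text": text,
                "left": int(row["left"]),
                "top": int(row["top"]),
                "width": int(row["width"]),
                "height": int(row["height"]),
                "conf": float(row["conf"]),
            }
        )
    return {
        "filename": filename,  # hypothetical field name
        "content": " ".join(w["text"] for w in words),  # searchable text
        "words": words,  # positions to draw highlights from
    }


def boxes_for_term(document, term):
    """Return the word boxes whose text equals the search term (case-insensitive)."""
    term = term.lower()
    return [w for w in document["words"] if w["text"].lower() == term]


if __name__ == "__main__":
    doc = tsv_to_document(SAMPLE_TSV, "scan.png")
    print(doc["content"])
    print(boxes_for_term(doc, "world"))
```

The resulting dict could be indexed as a normal JSON document, with `content` used for search and the matching entries in `words` used by your own front end to draw rectangles on the image. The exact-match lookup in `boxes_for_term` is deliberately naive; it does not follow whatever analyzer the index uses.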
