OCR support for ES Mapper attachments plugin

(Eyal) #1

Hi there,
The following case was reopened at my request (see below).
Could someone provide an update on it?

Apache Tika 1.7 (http://tika.apache.org/1.7/index.html) and later support OCR, which depends on Tesseract (https://code.google.com/p/tesseract-ocr) being pre-installed.
From what I have checked, the Elasticsearch Mapper Attachments plugin (https://github.com/elastic/elasticsearch-mapper-attachments#mapper-attachments-type-for-elasticsearch) uses the same Tika library mentioned above.
So, is it just a matter of exposing the APIs?
I would love to hear what you guys think, as this is a really cool feature!

