What is extractable from an image using the attachment plugin?

Hi,

I am trying to work out the limits of what is possible with the attachment
plugin for Elasticsearch. Working with document-based attachments is
absolutely fine and I have no problem with that. However, I would like to
know exactly what I can extract from image attachments. Other than image
metadata (e.g. EXIF), can I expect the plugin to perform a sort of OCR
function, recognise when an image contains text, and make that text
searchable?

I tried it locally and it didn't work, but I wanted to ask other users
in case someone has got this working and I'm doing something wrong.

Thanks!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hey,

The attachment plugin does not perform any OCR. It uses Apache Tika
for content extraction; see http://tika.apache.org/

--Alex
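
For readers following along, here is a minimal Python sketch of what the plugin is handed, assuming the usual setup where a field is mapped with type "attachment" and the file is sent base64-encoded. The type name "doc" and field name "file" are hypothetical, and the helper functions are illustrative only, not part of any client library. Nothing is sent to a cluster; the sketch just builds the two JSON bodies.

```python
import base64
import json


def build_mapping(type_name: str, field: str) -> dict:
    """Mapping that marks `field` as an attachment (names are hypothetical)."""
    return {type_name: {"properties": {field: {"type": "attachment"}}}}


def build_document(field: str, raw: bytes) -> dict:
    """Document body: the file content goes in as a base64 string."""
    return {field: base64.b64encode(raw).decode("ascii")}


if __name__ == "__main__":
    # Stand-in bytes; in practice this would be open(path, "rb").read().
    image_bytes = b"\x89PNG\r\n\x1a\n"
    print(json.dumps(build_mapping("doc", "file")))
    print(json.dumps(build_document("file", image_bytes)))
```

Since the extraction step is Tika without OCR, an image sent this way would be expected to yield metadata at most, not the text visible in its pixels.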

On Thu, Jul 25, 2013 at 5:46 PM, Kashif Nasir kashif.nasir@gmail.com wrote:

Hi,

I am trying to work out the limits of what is possible with the attachment
plugin for elasticsearch. Working with document based attachments is
absolutely fine and I have no problem with that. However, I would like to
know what exactly can I extract from image attachments? Other than image
metadata (e.g. EXIF) can I expect the plugin to perform a sort of OCR
function, and recognise when an image contains text and then make that text
searchable?

I've tried it locally, and it didn't work, but I wanted to ask other users
in case they've successfully been able to get this to work, and I'm doing
something wrong.

Thanks!

Related to this discussion: "Adding OCR support", issue #10 in elastic/elasticsearch-mapper-attachments on GitHub.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 26 Jul 2013, at 08:05, Alexander Reelsen alr@spinscale.de wrote:

Hey,

the attachment plugin does not use any OCR recognition. It uses Apache Tika for content extraction. See http://tika.apache.org/

--Alex