Attachment field questions and some more

Iv_Igi · January 24, 2014, 4:25am

Hello there! I have some questions about the Attachment field, result
highlighting and suggestions.

Does the Attachment field store the whole file or only the extracted text
by default?

If it stores the whole file, is there any way to make it store only the
extracted text? Or should I extract the text with Apache Tika first and then
put it into ES for that purpose?

Is it possible to set a minimum and maximum number of words to be
displayed in the highlight field?

And almost the same question for suggestions: can I set up phrase
suggestions, i.e. returning "Green grass", "Green logo", etc. for a "Green"
suggestion query?

Regards, Iv Igi.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8ef29407-5c30-4d3b-bf3e-b968c82dd9eb%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

spinscale · January 27, 2014, 10:24am

Hey,

The attachment stuff is off the top of my head only.

The original document is stored as base64 inside the _source, which is
itself stored at index time. The attachment field itself is not stored, IIRC.

You could exclude the field from being stored in the source. Using Tika as a
preprocessing step is a separate matter, but it might be a good idea, as it
does not require you to restart your whole cluster if, for example, you want
to update your Tika version.

You can configure the fragment_size, which should help you there; see the
highlighting documentation.

Elasticsearch has three different suggester implementations for different
use cases. There is a specific phrase suggester, but maybe your use case
actually calls for the completion suggester; see the Elastic blog post about
the completion suggester.
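
To make the three pointers above a bit more concrete, here is a minimal
sketch in Python that only builds the JSON bodies and sends nothing to a
cluster. The field names "file" and "suggest", the suggestion name and the
sizes are made up for illustration. It assumes the "attachment" type from the
mapper-attachments plugin. Where these bodies nest depends on your
Elasticsearch version (older releases wrap the mapping in a type name and use
a dedicated _suggest endpoint with "text" instead of "prefix"), so treat the
shapes as assumptions to check against the docs for the release you run. Also
note that fragment_size is measured in characters, not words.

```python
import json


def attachment_mapping(field="file"):
    """Mapping fragment that keeps the base64 payload out of _source.

    Assumption: `field` is an attachment-typed field (mapper-attachments).
    """
    return {
        "_source": {"excludes": [field]},
        "properties": {field: {"type": "attachment"}},
    }


def highlight_query(text, field="file", fragment_size=150, number_of_fragments=3):
    """Search body that highlights `field`.

    fragment_size is the approximate fragment length in characters;
    number_of_fragments caps how many fragments come back per field.
    """
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "fields": {
                field: {
                    "fragment_size": fragment_size,
                    "number_of_fragments": number_of_fragments,
                }
            }
        },
    }


def completion_suggest(prefix, field="suggest", name="title-suggest"):
    """Suggest body for a completion-typed field (hypothetical field name).

    Assumption: the newer `_search` form with "prefix"; older releases
    take a similar body on `_suggest` with "text" instead.
    """
    return {"suggest": {name: {"prefix": prefix, "completion": {"field": field}}}}


if __name__ == "__main__":
    for body in (
        attachment_mapping(),
        highlight_query("green"),
        completion_suggest("Green"),
    ):
        print(json.dumps(body, indent=2))
```

With a completion field that was indexed with inputs such as "Green grass"
and "Green logo", a "Green" prefix is the kind of request that would return
both, which is why the completion suggester may fit this use case better than
the phrase suggester.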

Hope this helps.

--Alex

