_analyzer field: which analyzer will be used on search?


(Sébastien Lorber) #1

Hello,

I have a question about the _analyzer field:
http://www.elasticsearch.org/guide/reference/mapping/analyzer-field.html

This seems like a nice way to handle documents in multiple languages in a
single index, by first detecting each document's language.
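
To make the idea concrete, here is a minimal sketch of such a mapping and of a document prepared for it, written as Python dicts. The `_analyzer` / `path` shape follows the documentation page linked above; the type name `doc`, the field names `lang_analyzer` and `body`, and the `detect_analyzer` helper are hypothetical, and the language detection is a deliberately naive stand-in for a real detector.

```python
import json

# Type-level mapping: "_analyzer.path" names the document field whose value
# is the analyzer to use at index time (shape taken from the linked docs).
# "doc" and "lang_analyzer" are made-up names for this sketch.
mapping = {"doc": {"_analyzer": {"path": "lang_analyzer"}}}


def detect_analyzer(text):
    """Naive placeholder for real language detection (hypothetical helper)."""
    french_hints = {"le", "la", "les", "des", "est", "une"}
    words = set(text.lower().split())
    return "french" if words & french_hints else "english"


def build_document(text):
    """Attach the detected analyzer name to the document before indexing."""
    return {"lang_analyzer": detect_analyzer(text), "body": text}


if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
    print(json.dumps(build_document("La maison est grande")))
```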

However, I often read that we should use the same analyzer for indexing and
searching.
But when searching for a text string, how can ElasticSearch know which
search analyzer to use?

Should we handle that ourselves when building our query? (quite a pain)
Or is ES perhaps doing some magic, like applying all the analyzers known
for that field and automatically creating a boolean query from them?
Please tell me :)
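
To illustrate the first option (handling it ourselves), a minimal sketch could build a `query_string` query and pass the analyzer explicitly through its `analyzer` parameter. The field name `body`, the language-to-analyzer table, and the `build_query` helper are assumptions for this example, and it presumes the query language is already known or detected. Whether this is the recommended approach is exactly what I am asking.

```python
import json

# Hypothetical mapping from a detected language code to an analyzer name.
ANALYZER_BY_LANG = {"en": "english", "fr": "french"}


def build_query(text, lang, field="body"):
    """Build a query_string query analyzed with a language-specific analyzer."""
    return {
        "query": {
            "query_string": {
                "default_field": field,
                "query": text,
                # Falls back to the standard analyzer for unknown languages.
                "analyzer": ANALYZER_BY_LANG.get(lang, "standard"),
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("maisons", "fr"), indent=2))
```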

Btw, is it possible to apply the _analyzer field only to a specific field
instead of declaring it directly for a type?
For example, if I know from statistics that my users post 80% of their
documents in English and 20% in French, I would like to have a multi_field
which defines 3 subfields: "untouched", "preferedLang1" and "preferedLang2".
Is it possible to do such a thing?
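
Roughly, the mapping I have in mind would look like the sketch below, using the `multi_field` type. The type name `doc`, the field name `body`, the `multi_lang_field` helper, and the choice of the `english` and `french` analyzers are assumptions for the example, and I have not tried it against a cluster.

```python
import json


def multi_lang_field(analyzers):
    """Build a multi_field definition: one not_analyzed sub-field plus one
    analyzed sub-field per (sub-field name, analyzer name) pair."""
    fields = {"untouched": {"type": "string", "index": "not_analyzed"}}
    for sub_name, analyzer in analyzers.items():
        fields[sub_name] = {"type": "string", "analyzer": analyzer}
    return {"type": "multi_field", "fields": fields}


# "doc" and "body" are hypothetical names; the sub-field names come from the
# question, and english/french reflect the 80%/20% split mentioned above.
mapping = {
    "doc": {
        "properties": {
            "body": multi_lang_field(
                {"preferedLang1": "english", "preferedLang2": "french"}
            )
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(mapping, indent=2))
```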

Thanks


(Ivan Brusic) #2

It is possible to specify the analyzer for each field. In fact, it is
the normal way to use analyzers. There is also the _all analyzer,
which would be used by default if you do not specify a field.

Your use of multi-field is correct and is a perfect use case.

--
Ivan

On Thu, Jul 5, 2012 at 9:07 AM, Sébastien Lorber
lorber.sebastien@gmail.com wrote:

Btw, is it possible to apply the _analyzer field only to a specific field
instead of declaring it directly for a type?
For example, if I know from statistics that my users post 80% of their
documents in English and 20% in French, I would like to have a multi_field
which defines 3 subfields: "untouched", "preferedLang1" and "preferedLang2".
Is it possible to do such a thing?

Thanks


(Sébastien Lorber) #3

Sorry, I'm not talking about the "analyzer" setting of field properties, but
about the _analyzer field, which according to the documentation seems to be
configured at the type level only.

It's not the same: here I'm trying to index the same field (even a subfield
of a multi_field) with different analyzers depending on the document's
country, instead of having a big multi_field with each subfield using a
country-specific analyzer.

On Friday, July 6, 2012 at 23:14:54 UTC+2, Ivan Brusic wrote:

It is possible to specify the analyzer for each field. In fact, it is
the normal way to use analyzers. There is also the _all analyzer,
which would be used by default if you do not specify a field.

Your use of multi-field is correct and is a perfect use case.


