Specifying analyzer on a per field basis at index time

barnybug · March 6, 2012, 9:17pm

Hi,

Thanks for the response.

Currently we're indexing a set of documents in different languages and
using _analyzer mapping to determine the per doc stemming analyzer.

What we'd like to do is index some fields of the documents both stemmed and
unstemmed (e.g. the 'english' analyzer to produce stemmed English and the
'standard' analyzer to produce unstemmed). So using a multi_field seems
applicable, but then the two analyzers are fixed in the mapping. We'd
effectively need to specify two _analyzer fields.
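
A minimal sketch of the multi_field idea described above, written as a Python dict that serializes to a mapping body. The type name `doc` and the field name `body` are hypothetical, and the mapping syntax follows the multi_field form from Elasticsearch releases of that era as best recalled, so treat it as an illustration and not a verified configuration. It shows the limitation: both analyzers are fixed in the mapping and cannot vary per document.

```python
import json


def build_mapping(stemmed_analyzer="english"):
    """Build a mapping with one multi_field: a stemmed default field
    plus an unstemmed sub-field. Both analyzers are fixed at mapping time."""
    return {
        "doc": {  # hypothetical type name
            "properties": {
                "body": {  # hypothetical field name
                    "type": "multi_field",
                    "fields": {
                        # default sub-field, addressed as "body"
                        "body": {"type": "string", "analyzer": stemmed_analyzer},
                        # extra sub-field, addressed as "body.unstemmed"
                        "unstemmed": {"type": "string", "analyzer": "standard"},
                    },
                }
            }
        }
    }


mapping = build_mapping()
print(json.dumps(mapping, indent=2))
```

With this shape, a French or German document would still be stemmed with the one analyzer named in the mapping, which is the problem described above.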

Essentially the customer wants to be able to do both stemmed (language
specific) searches and unstemmed (general) searches. This comes down to a
requirement to be able to match names, proper nouns, etc in cases where
stemming may interfere but there's no definitive list of these terms that
should not be stemmed.

We considered an index per language, but we're dealing with quite a high
number of languages, so that would likely mean too many indexes.
Using a field per language also presents issues - to do the general
unstemmed searches would require querying across many fields.
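
To illustrate why the field-per-language layout makes general searches awkward, here is a rough sketch that builds a query_string query over one unstemmed field per language. The field naming scheme (`body_<lang>.unstemmed`) and the language codes are assumptions made up for the example.

```python
def unstemmed_query(text, languages, base_field="body"):
    """Build a query_string request body that searches the unstemmed
    sub-field of every per-language field (hypothetical naming scheme)."""
    fields = ["%s_%s.unstemmed" % (base_field, lang) for lang in languages]
    return {"query": {"query_string": {"query": text, "fields": fields}}}


query = unstemmed_query("Williams", ["en", "fr", "de"])
print(query["query"]["query_string"]["fields"])
```

The field list grows by one entry per language, so every general search fans out across all of them.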

Alternatively we were considering if it'd be easy to develop a tokenizer
that wrapped existing stemming tokenizers but also produced the original
term in addition to the stemmed term.
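
The wrapping idea can be sketched in plain Python, independent of any Lucene or Elasticsearch API. `toy_stem` is a hypothetical stand-in for a real language-specific stemmer, and the position increment of 0 mimics how stacked tokens are usually marked as sharing a position; both are assumptions for illustration only.

```python
def toy_stem(term):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) > len(suffix) + 2:
            return term[: -len(suffix)]
    return term


def keep_original_and_stem(tokens, stem=toy_stem):
    """Yield (term, position_increment) pairs. The original term advances
    the position by 1; a differing stem is stacked on the same position (0)."""
    for term in tokens:
        yield term, 1
        stemmed = stem(term)
        if stemmed != term:
            yield stemmed, 0


result = list(keep_original_and_stem(["williams", "running", "fox"]))
print(result)
```

Because the original and the stem land in the same field, an exact name such as "williams" still matches, but a search in that field cannot distinguish stemmed from unstemmed hits.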

Sorry if that makes less than perfect sense!

thanks,

Barnaby

On Tuesday, 6 March 2012 20:29:57 UTC, kimchy wrote:

No, you can't specify it per field, though why do you want it? Usually,
having a different analyzer for each document doesn't make a lot of sense.
Usually, it makes more sense to have different fields.

On Tuesday, March 6, 2012 at 6:01 PM, barnybug wrote:

I understand you can specify the analyzer per document at index time
using the _analyzer field in mapping, but is it possible to specify it
in the same way but per field at index time?

Or, if it's not currently possible, how easy would it be to add (happy to
have a crack at it myself)?

thanks

Barnaby
