Adding accent folding to English analyzer

nik9000 · August 6, 2013, 6:49pm

Right now I use the English analyzer for English text. If I wanted to
enable accent folding, would I have to recreate the whole analyzer like this:
{
'type': 'custom',
'tokenizer': 'standard',
'filter': [ 'standard', 'english_possessive', 'lowercase',
'stop', 'porter_stem', 'icu_folding' ]
}

Assuming that is what I have to do, it looks like the english_possessive
filter is not something I can create via the API. Should I be doing
something else?

Nik

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · August 6, 2013, 7:13pm

The filter's name is 'possessive_english'

See https://github.com/elasticsearch/elasticsearch/issues/908

Note: if you analyze English only, there is no need for ICU, because there
is an 'asciifolding' filter:
http://www.elasticsearch.org/guide/reference/index-modules/analysis/asciifolding-tokenfilter/

Jörg
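
A minimal sketch of how the two suggestions above might be combined into index settings. It assumes 'possessive_english' is accepted as a language of the 'stemmer' token filter; the names english_folding_settings, english_possessive and english_folded are made up for illustration, and the rest of the filter chain is copied from the original question rather than taken from the built-in English analyzer.

```python
import json


def english_folding_settings(folding_filter="asciifolding"):
    """Build index settings for a custom English analyzer with accent folding.

    Hypothetical helper: the filter chain mirrors the one in the question,
    with the possessive filter defined explicitly and the folding filter
    passed in as a parameter.
    """
    return {
        "analysis": {
            "filter": {
                # Assumption: 'possessive_english' is exposed through the
                # 'stemmer' token filter. The key 'english_possessive' is
                # just a local name chosen for this sketch.
                "english_possessive": {
                    "type": "stemmer",
                    "language": "possessive_english",
                },
            },
            "analyzer": {
                # 'english_folded' is an illustrative analyzer name.
                "english_folded": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "standard",  # kept from the question as posted
                        "english_possessive",
                        "lowercase",
                        "stop",
                        "porter_stem",
                        folding_filter,
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Print the settings as JSON, ready to send when creating an index.
    print(json.dumps(english_folding_settings(), indent=2))
```

Passing "icu_folding" instead of the default keeps the ICU variant from the question, which would only be worth the extra plugin if non-English text has to be folded as well.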

simonw_2 · August 6, 2013, 7:18pm

Yeah, the English possessive filter seems not to be exposed. I will create
an issue and add it.

simon

simonw_2 · August 6, 2013, 7:19pm

Never mind, Joerg is right!

simon
