On Wednesday, February 22, 2012 at 1:49 PM, Michal Wegorek wrote:
Is there an index analyzer setting to:
Treat letters with diacritics (in my case the Polish letters ą, ć,
ę, ł, ń, ó, ś, ź, ż) as their US-ASCII equivalents during search:
ą -> a
ć -> c
..
What I mean is: when I search for the pattern 'abc', I want the
results to include 'abc' as well as 'ąbc', but when I search for 'ąbc',
I want ES to find only 'ąbc'.
These settings do not work; I expected that asciifolding would do the
trick:
index.analysis.analyzer.default.type: standard
index.analysis.analyzer.default.stopwords: none
index.analysis.analyzer.default.tokenizer: standard
index.analysis.analyzer.default.filter: [standard, lowercase, stop, asciifolding, porter_stem]
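To make the expected behaviour concrete, here is a minimal sketch of the matching rule in plain Python. It is not Elasticsearch code, and the names POLISH_FOLD, index_tokens, query_tokens and matches are hypothetical, made up for illustration. It assumes that folding happens only at index time, with the original token kept next to the folded one, and that the query itself is never folded.

```python
# Hypothetical illustration of the desired matching rule; not Elasticsearch code.
# Explicit table, because "ł" has no Unicode decomposition to strip.
POLISH_FOLD = str.maketrans("ąćęłńóśźż", "acelnoszz")

def index_tokens(text):
    """Index-time analysis: keep each lowercased token plus its folded form."""
    tokens = set()
    for word in text.lower().split():
        tokens.add(word)                         # original token, e.g. "ąbc"
        tokens.add(word.translate(POLISH_FOLD))  # folded token, e.g. "abc"
    return tokens

def query_tokens(text):
    """Search-time analysis: lowercase only, no folding."""
    return set(text.lower().split())

def matches(query, document):
    """A document matches when every query token was indexed for it."""
    return query_tokens(query) <= index_tokens(document)

docs = ["abc", "ąbc"]
hits_plain = [d for d in docs if matches("abc", d)]
hits_accented = [d for d in docs if matches("ąbc", d)]
print(hits_plain)     # ['abc', 'ąbc']
print(hits_accented)  # ['ąbc']
```

In Elasticsearch terms this would probably mean separate index-time and search-time analyzers rather than a single default one. Newer releases document a preserve_original option on the asciifolding filter that keeps the unfolded token; whether the version in use here has it is an assumption that would need checking.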