Documents with German umlauts

I have two documents:

  1. {"name": "Drucker"}
  2. {"name": "Drücker"}

How should I index them, and how should the query be built, so that I can:

a) find both documents when querying for "drucker"
b) sort the documents according to the search query (the document that matches the query exactly should appear before the others)

Regards,
Wojciech


Using an asciifolding token filter would probably help here.

See https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-asciifolding-tokenfilter.html
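
As a rough sketch of how that could look, the snippet below builds index settings with a custom analyzer that chains the built-in `lowercase` and `asciifolding` token filters, plus a `match` query on the field. The analyzer name `folding` is an arbitrary choice of mine, and the `fold` helper is not part of Elasticsearch: it only imitates the folding locally (for accented letters such as umlauts) to show why both documents end up with the same token. Where the field mapping has to be placed depends on your Elasticsearch version.

```python
import unicodedata

# Index settings: a custom analyzer ("folding" is a name chosen for this sketch)
# that lowercases tokens and then folds non-ASCII characters to ASCII.
index_settings = {
    "analysis": {
        "analyzer": {
            "folding": {
                "tokenizer": "standard",
                "filter": ["lowercase", "asciifolding"],
            }
        }
    }
}

# Mapping fragment for the "name" field; nest it in the mapping structure
# that your Elasticsearch version expects.
name_field = {"type": "text", "analyzer": "folding"}

# A match query is analyzed with the field's analyzer, so "drucker"
# is compared against the folded tokens.
search_body = {"query": {"match": {"name": "drucker"}}}


def fold(text: str) -> str:
    """Local approximation of lowercase + asciifolding for accented letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()
```

With this helper, `fold("Drucker")` and `fold("Drücker")` give the same string, which is the reason a query for "drucker" would match both documents once the field is analyzed this way.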

The asciifolding filter will normalize the extended characters so that
those words are equivalent. It will solve your first case, but not the
second. ICU collation might help with the latter, but sorting would be
language-specific and not based on the query:

https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-collation-keyword-field.html
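
For reference, a mapping along the lines of that page might look like the sketch below. It assumes the analysis-icu plugin is installed and an Elasticsearch version that provides the `icu_collation_keyword` field type; the sub-field name `sort` is an arbitrary choice. Keep in mind that this orders results by German collation rules, not by closeness to the query.

```python
# Mapping fragment: the "name" text field gets a sub-field used only for sorting.
# "sort" is a name chosen for this sketch.
name_field_with_sort = {
    "type": "text",
    "fields": {
        "sort": {
            "type": "icu_collation_keyword",
            "index": False,      # sub-field is used for sorting, not searching
            "language": "de",
            "country": "DE",
        }
    },
}

# Search body that sorts all documents by the collation sub-field.
sorted_search_body = {
    "query": {"match_all": {}},
    "sort": ["name.sort"],
}
```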
