Searching for numbers


(Ste Phan) #1

Are there any analyzers that translate numbers to text?

I want searches to treat values like 20 and zwanzig (twenty) as equal.

TIA
Eutychus


(Mark Walkom) #2

You could leverage synonyms - https://www.elastic.co/guide/en/elasticsearch/reference/2.2/analysis-synonym-tokenfilter.html
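
As a rough sketch of the synonym approach, the snippet below builds an index settings body that declares a synonym token filter with one entry per number. The filter name "number_synonyms", the analyzer name "german_numbers", and the helper build_settings are made up for illustration; only the "synonym" filter type and its "synonyms" list come from the linked documentation.

```python
import json


def build_settings(pairs):
    """Build an index settings body with a synonym token filter.

    pairs: iterable of (digits, word) tuples, e.g. ("20", "zwanzig").
    The names "number_synonyms" and "german_numbers" are hypothetical.
    """
    # Each entry lists equivalent terms, comma separated: "20, zwanzig"
    synonyms = [f"{digits}, {word}" for digits, word in pairs]
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "number_synonyms": {
                        "type": "synonym",
                        "synonyms": synonyms,
                    }
                },
                "analyzer": {
                    "german_numbers": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "number_synonyms"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = build_settings([("20", "zwanzig"), ("21", "einundzwanzig")])
    # Print the JSON that would be sent when creating the index
    print(json.dumps(body, indent=2, ensure_ascii=False))
```

Every number has to be listed explicitly, so this only fits a small, known range of values.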


(Jörg Prante) #3

There are two methods:

  • parsing text to generate numbers: "zwanzig" -> 20
  • writing numbers as text: 20 -> "zwanzig"

Should the index contain numbers or text?
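
To make the two directions concrete, here is a minimal Python sketch for German numbers from 0 to 99. It is not the ICU implementation; the function names (spell_out, to_text, to_number) are hypothetical, and the word tables are written out by hand.

```python
# Hand-written German number words, 0..19 and the tens; a sketch, not ICU.
SMALL = ["null", "eins", "zwei", "drei", "vier", "fünf", "sechs", "sieben",
         "acht", "neun", "zehn", "elf", "zwölf", "dreizehn", "vierzehn",
         "fünfzehn", "sechzehn", "siebzehn", "achtzehn", "neunzehn"]
TENS = {2: "zwanzig", 3: "dreißig", 4: "vierzig", 5: "fünfzig",
        6: "sechzig", 7: "siebzig", 8: "achtzig", 9: "neunzig"}


def spell_out(n):
    """Write a number as text: 20 -> "zwanzig" (only 0..99 supported)."""
    if not 0 <= n <= 99:
        raise ValueError("only 0..99 supported in this sketch")
    if n < 20:
        return SMALL[n]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    # German puts the unit first; "eins" shortens to "ein" in compounds
    unit_word = "ein" if unit == 1 else SMALL[unit]
    return f"{unit_word}und{TENS[tens]}"


# Reverse lookup for parsing text back into numbers
WORD_TO_NUMBER = {spell_out(n): n for n in range(100)}


def to_number(token):
    """Method 1: the index contains numbers, "zwanzig" -> "20"."""
    value = WORD_TO_NUMBER.get(token.lower())
    return str(value) if value is not None else token


def to_text(token):
    """Method 2: the index contains text, "20" -> "zwanzig"."""
    if token.isascii() and token.isdigit() and int(token) <= 99:
        return spell_out(int(token))
    return token.lower()


if __name__ == "__main__":
    print(to_number("zwanzig"), to_text("20"), to_text("Zwanzig"))
```

Whichever direction is chosen, the same normalization has to run at index time and at search time so that both spellings end up as the same term.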


(Jörg Prante) #4

I have added a "spellout" locale-based number format token filter to my customized ICU implementation, see

It translates numbers to text, and the text is indexed. So if you have a text with "20" or with "zwanzig", both are indexed as "zwanzig".

Example of a setting:

Note that the ICU RuleBasedNumberFormat class, which I use for this token filter, runs in lenient mode, which is quite slow.


(Ste Phan) #5

Thank you for your help.

Even though your suggestion seems particularly promising, I cannot try it out, because the servers running the Elasticsearch service use OpenJDK 7 :(

