Sort and ICU problem


(Weiwei Wang) #1

To sort on Chinese text, I used the keyword tokenizer with the
icu_collation filter. However, I also need to facet on the same field,
and the problem is that the facet results are not human-readable.

Besides, I also want to search on this field. If I pass the
icu_collation output in the query string, ES complains that the query
parser failed.

I want to know why localized sorting is not supported in a way that
keeps the original input, instead of having it encoded by icu_collation.


#2

On Thu, Jun 16, 2011 at 9:11 AM, Weiwei Wang ww.wang.cs@gmail.com wrote:

To sort on Chinese text, I used the keyword tokenizer with the
icu_collation filter. However, I also need to facet on the same field,
and the problem is that the facet results are not human-readable.

Besides, I also want to search on this field. If I pass the
icu_collation output in the query string, ES complains that the query
parser failed.

I want to know why localized sorting is not supported in a way that
keeps the original input, instead of having it encoded by icu_collation.

Maybe ES needs to hide more from you, but at the low level you need two fields:

  1. the original text for faceting (keyword tokenizer)
  2. the sort key field for sorting (collated)

The sort key field is really not useful for anything but sorting: it is
a binary sort key. All databases work the same way (they just hide the
process from you).
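
As a rough illustration of the two-field layout described above, here is a
minimal sketch that builds index settings as a Python dict and prints them
as JSON. The field names (title, title_sort), the analyzer and filter names,
and the "doc" type are hypothetical, and the exact mapping syntax depends on
the ES version and on having the ICU analysis plugin installed, so treat
this as an assumption-laden outline rather than a tested configuration.

```python
import json

# Hypothetical names throughout; only "keyword" (tokenizer) and
# "icu_collation" (token filter from the ICU plugin) are real identifiers.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Collation filter for Chinese; emits binary sort keys.
                "zh_collation": {"type": "icu_collation", "language": "zh"}
            },
            "analyzer": {
                # Keeps the whole input as one readable term (for facets).
                "raw_keyword": {"type": "custom", "tokenizer": "keyword"},
                # Same single term, then encoded as a collation sort key.
                "zh_sort_key": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["zh_collation"],
                },
            },
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                # Field 1: original text, used for faceting and searching.
                "title": {"type": "string", "analyzer": "raw_keyword"},
                # Field 2: collated sort key, used only for sorting.
                "title_sort": {"type": "string", "analyzer": "zh_sort_key"},
            }
        }
    },
}

print(json.dumps(index_settings, ensure_ascii=False, indent=2))
```

With a layout like this, the application would index the same value into
both fields, then facet and search on title and sort on title_sort.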

