Sorting on keyword field with accents

Hi there,

My index has a keyword field whose values sometimes contain accented words, e.g., 'Águas Lindas', and sometimes don't, e.g., 'Aracaju'.

I'm using that field to sort my results. The problem is that, when sorting in ascending order, the documents whose value starts with an accented character are returned in the last positions, e.g.:

doc_1: {
  my_text_field.keyword: 'Aracaju'
},
doc_2: {
  my_text_field.keyword: 'Belo Horizonte'
},
doc_3: {
  my_text_field.keyword: 'Águas Lindas'
}

Is it possible to make Elasticsearch ignore accents when sorting, given that I can't set an analyzer with an asciifolding filter on my keyword field?

Or, for this use case, do I have to use a text field with an analyzer and fielddata=true? I was avoiding that because of the performance issues related to fielddata.

Does anybody know what the best solution is for me?

Thanks,

Guilherme

Indeed, you cannot apply an analyzer to a keyword field, but you can apply a normalizer. Think of a normalizer as an analyzer for keyword fields (with some restrictions).

The example in our documentation shows exactly your use case: using a normalizer to apply ASCII folding to a keyword field, so you can use that keyword field for sorting.
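
For readers who want to see the shape of that setup, here is a minimal sketch in Python. It only builds the request body for creating an index and does not talk to a cluster. The names folding_normalizer and the typeless "mappings" layout (Elasticsearch 7.x or later) are assumptions for illustration, and the fold() helper is just a rough local stand-in for the lowercase and asciifolding filters, not the real Lucene implementation.

```python
import json
import unicodedata

# Body for an index-creation request. "folding_normalizer" is a hypothetical
# name; the field names follow the question above.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "folding_normalizer": {
                    "type": "custom",
                    "char_filter": [],
                    "filter": ["lowercase", "asciifolding"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "my_text_field": {
                "type": "text",
                "fields": {
                    # The keyword sub-field used for sorting gets the normalizer.
                    "keyword": {
                        "type": "keyword",
                        "normalizer": "folding_normalizer",
                    }
                },
            }
        }
    },
}


def fold(value: str) -> str:
    """Approximate lowercase + ASCII folding locally (illustration only)."""
    # NFKD splits 'Á' into 'A' plus a combining accent; the accent is dropped.
    decomposed = unicodedata.normalize("NFKD", value.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


cities = ["Aracaju", "Belo Horizonte", "Águas Lindas"]

# Plain code-point ordering reproduces the problem: the accented value is last.
print(sorted(cities))

# Sorting on the folded value places 'Águas Lindas' before 'Aracaju'.
print(sorted(cities, key=fold))

# The JSON that would be sent as the index-creation body.
print(json.dumps(index_body, indent=2))
```

As far as I know, a normalizer cannot be added to an existing keyword field in place, so expect to create a new index with this mapping and reindex into it. Also note that the sort values returned by Elasticsearch will be the normalized form (for example, lowercased and without accents), while _source keeps the original text.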

Great, @abdon! It worked.

Thank you very much!
