Romanization keyword sorting (specifically Pinyin)

Hi,

Is sorting by romanization, specifically Pinyin, currently supported?

Chinese values should be sorted together with values from languages that use the Roman alphabet, so the results should come back in an order like this:

  1. a,
  2. b,
  3. C, // ignore case
  4. j,
  5. jA,
  6. Jb,
  7. JC,
  8. 姐 (Jiě), // consider the transliteration
  9. s,
  10. 石 (Shí),
  11. x,
  12. 小 (Xiǎo),
  13. z
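
To make the intended ordering concrete, here is a minimal Python sketch of such a sort key. It is only an illustration of the desired result, not an Elasticsearch feature: `PINYIN` is a hypothetical hand-written lookup covering just the three characters above, and `sort_key` is an illustrative name.

```python
import unicodedata

# Hypothetical lookup for illustration only; a real setup would need a
# complete transliteration source rather than three hard-coded entries.
PINYIN = {"姐": "jiě", "石": "shí", "小": "xiǎo"}


def sort_key(value: str) -> str:
    """Build a case-insensitive, accent-insensitive key from the romanized value."""
    # Replace each known Han character with its Pinyin reading.
    romanized = "".join(PINYIN.get(ch, ch) for ch in value)
    # Decompose accented letters, then drop the combining marks (ě -> e).
    decomposed = unicodedata.normalize("NFKD", romanized)
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Ignore case.
    return folded.lower()


names = ["小", "z", "JC", "石", "a", "Jb", "姐", "x", "C", "jA", "s", "j", "b"]
ordered = sorted(names, key=sort_key)
print(ordered)
```

The key for 姐 becomes `jie`, which is why it lands after `JC` and before `s`.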

For now, the keyword sub-field of the text field uses a custom normalizer with the lowercase and asciifolding filters:

PUT test
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "normalizer": "my_normalizer",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
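
As far as I understand it, this normalizer leaves the Chinese characters untouched, because asciifolding only converts characters that have an ASCII equivalent, so they end up sorted after all the Latin values. The Python sketch below approximates that behaviour; `fold` is an illustrative stand-in for lowercase plus asciifolding (an assumption, not the actual filter implementation), and plain string comparison stands in for the keyword sort.

```python
import unicodedata


def fold(value: str) -> str:
    """Rough stand-in for lowercase + asciifolding (assumption, not the real filters)."""
    decomposed = unicodedata.normalize("NFKD", value)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


names = ["小", "z", "JC", "石", "a", "Jb", "姐", "x", "C", "jA", "s", "j", "b"]
# Han characters have no ASCII equivalent, so they keep their original code
# points and compare greater than every Latin letter.
current_order = sorted(names, key=fold)
print(current_order)
```

With this key the three Chinese values are grouped at the end instead of being interleaved with `j`, `s`, and `x`.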

Thanks
