Elasticsearch Kuromoji plugin

What is the expected output when we run the following?

PUT test
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "kuromoji_number": {
            "type": "kuromoji_number"
          },
          "kuromoji_readingform": {
            "type": "kuromoji_readingform"
          }
        },
        "tokenizer": {
          "kuromoji": {
            "type": "kuromoji_tokenizer"
          }
        }
      }
    }
  }
}
GET /test/_analyze
{
  "text": "一〇〇〇",
  "tokenizer": "kuromoji",
  "filter": [
    "kuromoji_number",
    "kuromoji_readingform"
  ]
}

Should the output look like this:

{
  "tokens": [
    {
      "token": "一",
      "number": 1,
      "reading_form": "ichi"
    },
    {
      "token": "〇",
      "number": 0,
      "reading_form": "zero"
    },
    {
      "token": "〇",
      "number": 0,
      "reading_form": "zero"
    },
    {
      "token": "〇",
      "number": 0,
      "reading_form": "zero"
    }
  ]
}

Or like this:

{
  "tokens" : [
    {
      "token" : "〇",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "word",
      "position" : 0
    }
  ]
}

How should I understand how the plugin behaves when two filters are applied in the same request?
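
One way to check this directly, assuming the `explain` option of the `_analyze` API is available in the Elasticsearch version in use, is to repeat the request with `"explain": true`. Token filters are applied in the order they are listed, each one consuming the output of the previous stage, and the explained response should report the token stream after the tokenizer and after each filter in turn. A minimal sketch against the same `test` index defined above:

```
GET /test/_analyze
{
  "text": "一〇〇〇",
  "tokenizer": "kuromoji",
  "filter": [
    "kuromoji_number",
    "kuromoji_readingform"
  ],
  "explain": true
}
```

Comparing the tokens listed for `kuromoji_number` with those listed for `kuromoji_readingform` should show which stage produces the final token.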
