Suppose I have an index like this:
PUT myindex
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer",
          "filter": [
            "lowercase",
            "my_stemmer"
          ]
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "pattern",
          "pattern": "[;]+"
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  }
}
Testing the analyzer:
GET /myindex/_analyze
{
  "analyzer": "my_analyzer",
  "text": "running dates;Sex health education;Perceptions towards Sexual Health Education"
}
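I need to split "running dates;Sex health education;Perceptions towards Sexual Health Education" on semicolons, so that each phrase stays a single token, and then stem every word inside each phrase. The expected result is: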
{
  "tokens": [
    {
      "token": "run date",
      "start_offset": 0,
      "end_offset": 13,
      "type": "word",
      "position": 0
    },
    {
      "token": "sex health educ",
      "start_offset": 14,
      "end_offset": 34,
      "type": "word",
      "position": 1
    },
    {
      "token": "percept toward sexual health educ",
      "start_offset": 35,
      "end_offset": 78,
      "type": "word",
      "position": 2
    }
  ]
}
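To make the requirement concrete, here is a rough sketch in plain Python of the behaviour I am after. It is not Elasticsearch code: the `analyze` function and the `TOY_STEMS` table are hypothetical names made up for illustration, and the table is a hand-written stand-in that only covers the words in the sample text, not a real English stemmer.

```python
import re

# Hand-written stand-in for an English stemmer: it only covers the words in
# the sample text and is NOT the algorithm Elasticsearch uses.
TOY_STEMS = {
    "running": "run",
    "dates": "date",
    "education": "educ",
    "perceptions": "percept",
    "towards": "toward",
}


def analyze(text, stem_word=lambda w: TOY_STEMS.get(w, w)):
    """Split on runs of ';', lowercase, then stem every word inside each phrase."""
    tokens = []
    # [^;]+ matches the pieces that the "[;]+" separator pattern leaves behind.
    for position, match in enumerate(re.finditer(r"[^;]+", text)):
        words = match.group().lower().split(" ")
        tokens.append({
            "token": " ".join(stem_word(w) for w in words),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


sample = "running dates;Sex health education;Perceptions towards Sexual Health Education"
for tok in analyze(sample):
    print(tok)
```

The key point is that the stemming step runs once per word, while the token boundaries still come only from the semicolons.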
However, the actual result is:
{
  "tokens": [
    {
      "token": "running d",
      "start_offset": 0,
      "end_offset": 13,
      "type": "word",
      "position": 0
    },
    {
      "token": "sex health educ",
      "start_offset": 14,
      "end_offset": 34,
      "type": "word",
      "position": 1
    },
    {
      "token": "perceptions towards sexual health educ",
      "start_offset": 35,
      "end_offset": 78,
      "type": "word",
      "position": 2
    }
  ]
}
How can I achieve this?