[analysis] Kuromoji: can't analyze text with half-width space in user dictionary

cross1154 · June 1, 2022, 9:19am

I am trying to use a kuromoji user dictionary so that Elasticsearch can analyze Japanese names that have a
half-width space between the family name and the first name.
My settings look like this:

{
  "settings": {
    "analysis": {
      "tokenizer": {
        "kuromoji_user_dict": {
          "type": "kuromoji_tokenizer",
          "mode": "normal",
          "user_dictionary_rules": [
            "渡辺 健,渡辺 健,ワタナベ ヒカリ,カスタム名詞"
          ]
        }
      },
      "analyzer": {
        "my_ja_analyzer": {
          "type": "custom",
          "tokenizer": "kuromoji_user_dict",
          "char_filter": [
            "icu_normalizer"
          ]
        }
      }
    }
  }
}

But I found that it doesn't work as expected. Part of the name is not analyzed.

{
  "analyzer": "my_ja_analyzer",
  "text": "渡辺 健"
}

{
    "tokens": [
        {
            "token": "渡辺",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 0
        },
        {
            "token": " ",
            "start_offset": 2,
            "end_offset": 3,
            "type": "word",
            "position": 1
        }
    ]
}

Can somebody tell me what's wrong with it?
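
A possible explanation, offered as an assumption rather than a confirmed diagnosis: the documented kuromoji user dictionary rule format is `<text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>`, so a half-width space inside the second and third fields is read as a token separator, not as part of a token. The Python sketch below is not Elasticsearch or Lucene code. Its helper names are hypothetical and only mimic that split, to show that the segmentation in the rule above yields 渡辺 and 健, which do not add up to the three-character text 渡辺 健.

```python
# Illustrative sketch only: mimics the documented rule layout
#   <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
# The function names are hypothetical and are not part of any Elasticsearch API.

def parse_user_dictionary_rule(rule: str) -> dict:
    """Split one rule into its four comma-separated fields."""
    surface, segmentation, readings, pos = rule.split(",")
    return {
        "surface": surface,
        # Inside the 2nd and 3rd fields, a half-width space separates tokens.
        "tokens": segmentation.split(" "),
        "readings": readings.split(" "),
        "pos": pos,
    }

def segmentation_covers_surface(rule: str) -> bool:
    """Check whether the tokens, joined back together, reproduce the text."""
    parsed = parse_user_dictionary_rule(rule)
    return "".join(parsed["tokens"]) == parsed["surface"]

rule = "渡辺 健,渡辺 健,ワタナベ ヒカリ,カスタム名詞"
parsed = parse_user_dictionary_rule(rule)
print(parsed["tokens"])                   # ['渡辺', '健']
print(segmentation_covers_surface(rule))  # False: the space is not covered by any token
```

If this is indeed the cause, one thing to try (untested here) is removing the space before tokenization, for example with a char filter, and registering the name in the dictionary without the space.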

system · June 29, 2022, 9:20am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
