How can I reflect the results of kuromoji in the search results?

111464 · April 18, 2021, 12:19am

I'm applying both the standard analyzer and the kuromoji analyzer to a single field, using a multi-field.
The Elasticsearch version is 6.3.0.

PUT /test
{
  "mappings": {
    "docs": {
      "properties": {
        "body": {
          "type": "text",
          "fields": {
            "japanese_field": {
              "analyzer": "kuromoji",
              "type": "text"
            }
          }
        }
      }
    }
  }
}

I indexed the following documents as examples.

PUT /test/docs/1
{
  "body" : "和田の本棚"
}

PUT /test/docs/2
{
  "body" : "令和元年"
}

query:

POST /test/_search
{
  "query": {
    "multi_match": {
      "query": "和田の日記",
      "fields": [
        "body.japanese_field",
        "body"
      ]
    }
  }
}

result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.8630463,
    "hits": [
      {
        "_index": "test",
        "_type": "docs",
        "_id": "1",
        "_score": 0.8630463,
        "_source": {
          "body": "和田の本棚が検出されました"
        }
      },
      {
        "_index": "test",
        "_type": "docs",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "body": "令和元年"
        }
      }
    ]
  }
}

question

I don't want "令和元年" to be returned as a hit.
Is there a way to prioritize the analysis results of body.japanese_field when producing the search results?

GET /test/_analyze
{
  "field": "body.japanese_field",
  "text": "和田の本棚"
}


{
  "tokens": [
    {
      "token": "和田",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 0
    },
    {
      "token": "本棚",
      "start_offset": 3,
      "end_offset": 5,
      "type": "word",
      "position": 2
    }
  ]
}

Reference information

GET /test/_analyze
{
  "field": "body",
  "text": "和田の本棚"
}

{
  "tokens": [
    {
      "token": "和",
      "start_offset": 0,
      "end_offset": 1,
      "type": "<IDEOGRAPHIC>",
      "position": 0
    },
    {
      "token": "田",
      "start_offset": 1,
      "end_offset": 2,
      "type": "<IDEOGRAPHIC>",
      "position": 1
    },
    {
      "token": "の",
      "start_offset": 2,
      "end_offset": 3,
      "type": "<HIRAGANA>",
      "position": 2
    },
    {
      "token": "本",
      "start_offset": 3,
      "end_offset": 4,
      "type": "<IDEOGRAPHIC>",
      "position": 3
    },
    {
      "token": "棚",
      "start_offset": 4,
      "end_offset": 5,
      "type": "<IDEOGRAPHIC>",
      "position": 4
    }
  ]
}
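
The two token lists above hint at why document 2 is returned. Below is a minimal Python sketch of that reasoning, not of Elasticsearch itself. It assumes the standard analyzer emits one token per character for this text (as the output for "body" shows), that the default OR operator matches on any shared term, and that kuromoji would split the query into 和田 and 日記. The last point is an assumption, not captured output, and the helper names are hypothetical.

```python
# Sketch only: models term overlap, not Elasticsearch scoring.

def standard_tokens(text):
    # Assumption: one token per character, as the _analyze output
    # for the "body" field shows for this kind of text.
    return list(text)

def shared_terms(query_tokens, doc_tokens):
    # With the default OR operator, one shared term is enough for a hit.
    return set(query_tokens) & set(doc_tokens)

query = "和田の日記"
docs = {"1": "和田の本棚", "2": "令和元年"}

# Assumption: kuromoji splits the query the same way it split
# "和田の本棚" above, i.e. into two nouns with the particle dropped.
kuromoji_query_tokens = ["和田", "日記"]

for doc_id, body in docs.items():
    on_body = shared_terms(standard_tokens(query), standard_tokens(body))
    # A kuromoji token can only match if it occurs in the text at all,
    # so a substring check gives a necessary condition for a hit.
    on_japanese = [t for t in kuromoji_query_tokens if t in body]
    print(doc_id, "body:", sorted(on_body), "japanese_field:", on_japanese)
```

Under these assumptions, document 2 shares only the single character 和 with the query on "body" and nothing on "body.japanese_field", so listing only body.japanese_field in "fields" would probably keep it out of the results. This is untested on 6.3.0.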

