Why doesn't word_delimiter work on my index?

Hi all!
I need to search my products by codes like "50-14Lx". I use word_delimiter to split "50-14Lx" into "50", "14", "Lx", but it doesn't work on my index.

Index:

    PUT /products
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "whitespace",
              "filter": [ "word_delimiter" ]
            }
          }
        }
      }
    }

Analyze:

    GET /_analyze
    {
      "tokenizer": "whitespace",
      "filter": [ "word_delimiter_graph" ],
      "text": "50-14Lx"
    }

This works as expected and returns: "50", "14", "Lx"

But analyzing against my index returns the "wrong" result, "50", "14Lx":

    GET /products/_analyze
    {
      "explain": true, 
      "text": "50-14Lx"
    }

Question: why was "14Lx" not split inside my index?

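Your request does not name an analyzer, so the analyze API falls back to the index's default analyzer (standard here) instead of my_analyzer. Try this:

    GET /products/_analyze
    {
      "explain": true,
      "analyzer": "my_analyzer",
      "text": "50-14Lx"
    }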

Yes, this variant returns the expected result: all three tokens.

But why doesn't searching work?

    POST /_bulk
    {"index":{"_index":"products","_id":"1"}}
    {"name": "50-14Lx"}
    {"index":{"_index":"products","_id":"2"}}
    {"name": "60-33Xp"}

Searching by a part of the product name, "14":

    GET /products/_search
    {
      "query": {
        "match": {
          "name": "14"
        }
      },
      "_source": ["id", "name"]
    }

returns an empty result set...

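You have to define a mapping for your index and assign the custom analyzer to the name field. Without an explicit mapping, name is mapped dynamically as a text field with the standard analyzer, which indexes "50-14Lx" as the tokens "50" and "14lx", so a search for "14" finds nothing.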
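
Two caveats before running the request below, assuming the products index from the question still exists and holds only test data: the analyzer of an existing field cannot be changed in place, and PUT /products is rejected while an index with that name exists. A rough sketch of checking the current mapping and then dropping the index:

    # Inspect how "name" was mapped dynamically (no custom analyzer is expected here)
    GET /products/_mapping

    # Remove the old index so it can be recreated with the explicit mapping
    DELETE /products

Deleting the index also deletes its documents, so the bulk request has to be sent again afterwards.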

    PUT /products
    {
      "mappings": {
        "properties": {
          "name": {
            "type": "text",
            "analyzer": "my_analyzer",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          }
        }
      },
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "analysis": {
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "whitespace",
              "filter": [
                "word_delimiter"
              ]
            }
          }
        }
      }
    }



    POST /_bulk
    {"index":{"_index":"products","_id":"1"}}
    {"name": "50-14Lx"}
    {"index":{"_index":"products","_id":"2"}}
    {"name": "60-33Xp"}



    GET /products/_search
    {
      "query": {
        "match": {
          "name": "14"
        }
      },
      "_source": [
        "id",
        "name"
      ]
    }
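
To confirm that the name field now really uses my_analyzer, one option is to pass the field, rather than the analyzer, to the analyze API. This is a small sketch that assumes the index was recreated with the mapping above:

    # Analyze the text with whatever analyzer is mapped to the "name" field
    GET /products/_analyze
    {
      "field": "name",
      "text": "50-14Lx"
    }

This should list the same three tokens as the earlier request that named my_analyzer explicitly.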

Thank you!
