Search by digits doesn't work with edge_ngram

mikhail.tin · June 4, 2018, 1:47pm

Hi there,

I am using an edge ngram tokenizer to provide partial matching.
My scripts:

PUT docs
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": ["lowercase"]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 30,
          "token_chars": ["letter", "digit", "punctuation", "symbol"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}

PUT docs/doc/1
{"name": "111"}

PUT docs/doc/2
{"name": "first"}

PUT docs/doc/3
{"name": "123456789й"}

PUT docs/doc/4
{"name": "fir123"}

I expect to find documents 3 and 4 with the following query:

GET docs/_search
{
  "query": {
    "match": {
      "name": "12"
    }
  }
}

I also expect to find documents 2 and 4 with the following query:

GET docs/_search
{
  "query": {
    "match": {
      "name": "Fi"
    }
  }
}

What could be wrong?

Thanks!

dadoonet · June 4, 2018, 7:12pm

Have a look at what your text is transformed into at search time:

POST docs/_analyze
{
  "analyzer": "autocomplete_search",
  "text": [ "123456789й" ]
}

{
  "tokens": [
    {
      "token": "й",
      "start_offset": 9,
      "end_offset": 10,
      "type": "word",
      "position": 0
    }
  ]
}

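Have a look at the documentation for the lowercase tokenizer: it splits the text on every non-letter character, so the digits never reach the query.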
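To make the effect visible without a cluster, here is a rough Python sketch of the lowercase tokenizer, assuming it can be approximated as "keep runs of letters, lowercased". The function name lowercase_tokenizer is made up for illustration; this is not Elasticsearch code, only a model of the behaviour seen in the _analyze output above.

```python
import unicodedata


def lowercase_tokenizer(text):
    """Approximate the lowercase tokenizer: split on non-letters, lowercase the rest."""
    tokens, current = [], []
    for ch in text:
        # Unicode categories starting with "L" are letters; everything else is a separator.
        if unicodedata.category(ch).startswith("L"):
            current.append(ch.lower())
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


print(lowercase_tokenizer("123456789й"))  # ['й']
print(lowercase_tokenizer("12"))          # []
print(lowercase_tokenizer("Fi"))          # ['fi']
```

Under this approximation the query "12" produces no tokens at all, so the match query has nothing to look up, while "Fi" still becomes "fi" and can match.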

mikhail.tin · June 5, 2018, 7:00am

Hi, thanks for your answer. I have found a solution:

I set up my search analyzer like this:

"autocomplete_search": {
  "type": "custom",
  "tokenizer": "keyword",
  "filter": ["lowercase"]
}
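For anyone who wants to check what this changes, below is a minimal Python sketch that models both analyzers on the four sample documents. It assumes the edge_ngram tokenizer can be approximated by splitting on characters outside token_chars and emitting prefixes of each word, and that keyword plus lowercase simply yields the whole query lowercased. All function names (is_token_char, index_tokens, search_token, matching_docs) are hypothetical helpers, not Elasticsearch APIs.

```python
import unicodedata

DOCS = {1: "111", 2: "first", 3: "123456789й", 4: "fir123"}


def is_token_char(ch):
    # Approximates token_chars: letter, digit, punctuation, symbol.
    cat = unicodedata.category(ch)
    return cat[0] in "LPS" or cat == "Nd"


def index_tokens(text, min_gram=1, max_gram=30):
    """Approximate the autocomplete analyzer: lowercased edge n-grams of each word."""
    words, current = [], []
    for ch in text:
        if is_token_char(ch):
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    grams = set()
    for word in words:
        # Edge n-grams are prefixes only, from min_gram up to max_gram characters.
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            grams.add(word[:n].lower())
    return grams


def search_token(text):
    """Approximate keyword tokenizer + lowercase filter: one lowercased token."""
    return text.lower()


def matching_docs(query):
    """Return ids of documents whose index tokens contain the search token."""
    term = search_token(query)
    return sorted(doc_id for doc_id, name in DOCS.items() if term in index_tokens(name))


print(matching_docs("12"))  # [3]
print(matching_docs("Fi"))  # [2, 4]
```

In this model "Fi" finds documents 2 and 4, but "12" finds only document 3: "fir123" is indexed as a single word, and its edge n-grams all start with "f". Matching digits in the middle of a token would probably need a different index-time tokenization, for example the ngram tokenizer.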

system · July 3, 2018, 7:00am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
