ES 7.12 - Different analyzers are getting used for indexing and searching

Hello, I have an interesting issue to report for ES 7.12. For one of the dynamically indexed fields, the analyzer used for indexing is "ngram", but the one used for searching is "standard". I am not sure if this is a bug or an intended change. This has been happening since ES 7.10.

Below is a simple mapping with one field that uses different analyzers. Running the steps below on 7.9 and 7.12 gives different results: 7.9 returns a hit while 7.12 doesn't, because the standard analyzer is used while searching.

Steps to reproduce:

  1. Create an index with the mapping.
    PUT http://localhost:9200/{indexname}
    {
      "mappings": {
        "properties": {
          "category": {
            "properties": {
              "id": {
                "type": "long"
              },
              "name": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer",
                "fields": {
                  "analyzed": {
                    "type": "text",
                    "analyzer": "content_asset_engram_analyzer"
                  }
                }
              },
              "parentId": {
                "type": "long"
              }
            }
          }
        }
      },
      "settings": {
        "index": {
          "max_ngram_diff": "3",
          "refresh_interval": "30s",
          "number_of_shards": "1",
          "analysis": {
            "normalizer": {
              "lowercase_normalizer": {
                "filter": [ "lowercase" ],
                "type": "custom"
              }
            },
            "analyzer": {
              "default_search": {
                "filter": [ "lowercase" ],
                "type": "custom",
                "tokenizer": "standard"
              },
              "content_asset_html_strip": {
                "filter": [ "lowercase" ],
                "type": "custom",
                "char_filter": [ "html_strip" ],
                "tokenizer": "content_asset_engram_tokenizer"
              },
              "content_asset_engram_analyzer": {
                "filter": [ "lowercase" ],
                "type": "custom",
                "tokenizer": "content_asset_engram_tokenizer"
              }
            },
            "tokenizer": {
              "content_asset_engram_tokenizer": {
                "token_chars": [ "letter", "digit" ],
                "min_gram": "3",
                "type": "ngram",
                "max_gram": "6"
              }
            }
          }
        }
      }
    }
  2. Add a document to the index.
    POST http://localhost:9200/{indexname}/_doc
    {
      "category": {
        "id": 3962,
        "name": "Content Builder",
        "parentId": 0
      }
    }
  3. Search.
    GET http://localhost:9200/{indexname}/_search
    {
      "query": {
        "match": {
          "category.name.analyzed": "content"
        }
      }
    }
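
For anyone rerunning these steps, here is a rough Python sketch of steps 2 and 3, plus two extra `_analyze` calls that show which tokens each analyzer produces. Several things here are assumptions and not part of the original post: the index name `analyzer-test` is a placeholder for `{indexname}`, the document is indexed with `refresh=true` (the settings above use a 30s `refresh_interval`, so a search sent right after indexing might otherwise not see the document yet), and the standard library `urllib` is used rather than a client library. Note also that the settings define an analyzer named `default_search`, which is the name Elasticsearch uses for an index-wide default search analyzer, so it may be worth comparing its output with that of the field's own analyzer.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # host used in the steps above
INDEX = "analyzer-test"             # hypothetical name standing in for {indexname}
FIELD = "category.name.analyzed"


def build_request(method, path, body):
    """Build (but do not send) a JSON request against the test index."""
    return urllib.request.Request(
        f"{BASE_URL}/{INDEX}/{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )


def repro_requests():
    """Requests for steps 2 and 3, plus two analyzer checks in between."""
    doc = {"category": {"id": 3962, "name": "Content Builder", "parentId": 0}}
    return [
        # refresh=true makes the document searchable without waiting
        # for the 30s refresh interval from the index settings.
        build_request("POST", "_doc?refresh=true", doc),
        # Tokens produced by the analyzer configured on the field.
        build_request("POST", "_analyze", {"field": FIELD, "text": "Content Builder"}),
        # Tokens produced by the analyzer named default_search.
        build_request("POST", "_analyze", {"analyzer": "default_search", "text": "content"}),
        # The search from step 3.
        build_request("POST", "_search", {"query": {"match": {FIELD: "content"}}}),
    ]


def send(request):
    """Send one request to a running node and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

Calling `send(req)` for each entry of `repro_requests()` against a local node, after creating the index as in step 1, would return the parsed responses. Nothing is sent when the functions are only defined.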

Welcome to our community! And thanks heaps for providing a reproduction here, it's super helpful!

I just ran this example on 7.9.3 and 7.12.1 and they both returned zero hits?

Thank you for your response.
That is not my experience; I am getting a hit on 7.9.3. This is interesting, though: should the search not return a hit on both versions? What does the response from the explain API look like?

GET http://localhost:9200/{indexName}/_doc/{docId}/_explain 
{
    "query": {
        "match": {
            "category.name.analyzed": "content"
        }
    }
}

If that doesn't return anything, you could run explain on shorter terms like "conten" and see whether the ngram analyzer was used during search.
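
To reason about which query terms can tell the two analyzers apart, here is a small pure-Python approximation of the analysis chain. It is only a sketch: `ngram_tokens` and `standard_tokens` are hypothetical helpers that mimic `content_asset_engram_analyzer` (letters and digits, grams of 3 to 6 characters, lowercased) and a standard-tokenizer-plus-lowercase analyzer for simple ASCII input. They are not Elasticsearch's actual implementation.

```python
import re

# Runs of letters and digits, matching token_chars ["letter", "digit"].
WORD = re.compile(r"[^\W_]+")


def ngram_tokens(text, min_gram=3, max_gram=6):
    """Approximate the ngram tokenizer followed by the lowercase filter."""
    tokens = []
    for word in WORD.findall(text):
        word = word.lower()
        for start in range(len(word)):
            for size in range(min_gram, max_gram + 1):
                if start + size <= len(word):
                    tokens.append(word[start:start + size])
    return tokens


def standard_tokens(text):
    """Approximate the standard tokenizer plus lowercase for simple words."""
    return [word.lower() for word in WORD.findall(text)]


# Terms that would be indexed for the document from step 2.
indexed = set(ngram_tokens("Content Builder"))

for query in ("content", "conten"):
    standard_hits = [t for t in standard_tokens(query) if t in indexed]
    ngram_hits = sorted({t for t in ngram_tokens(query) if t in indexed})
    print(f"{query!r}: standard matches {standard_hits}, ngram matches {ngram_hits}")
```

In this simplified model, `content` (7 characters) is longer than `max_gram`, so it is never indexed as a single token and only matches if the query text is itself split into n-grams. `conten` (6 characters) is one of the indexed grams, so it would match with either analyzer and does not distinguish the two cases on its own.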

Mark Walkom, were you able to find out why the search is not returning a result for you on either version? Was the explain API of any help?
