Hello, I have an interesting issue to report for ES 7.12. For one of the dynamically indexed fields, the analyzer used for indexing is "ngram", but for searching it is "standard". I am not sure if this is a bug or an intended change. This started happening since ES 7.10.
Below I am showing a simple mapping with one field and different analyzers. The steps below, performed on 7.9 and 7.12, produce different results: 7.9 returns a hit while 7.12 does not, because the standard analyzer is used at search time.
Steps to reproduce:
- Create an Index with mapping.
PUT - http://localhost:9200/{indexname}
{
"mappings": {
"properties": {
"category": {
"properties": {
"id": {
"type": "long"
},
"name": {
"type": "keyword",
"normalizer": "lowercase_normalizer",
"fields": {
"analyzed": {
"type": "text",
"analyzer": "content_asset_engram_analyzer"
}
}
},
"parentId": {
"type": "long"
}
}
}
}
},
"settings": {
"index": {
"max_ngram_diff": "3",
"refresh_interval": "30s",
"number_of_shards": "1",
"analysis": {
"normalizer": {
"lowercase_normalizer": {
"filter": [ "lowercase" ],
"type": "custom"
}
},
"analyzer": {
"default_search": {
"filter": [ "lowercase" ],
"type": "custom",
"tokenizer": "standard"
},
"content_asset_html_strip": {
"filter": [ "lowercase" ],
"type": "custom",
"char_filter": [ "html_strip" ],
"tokenizer": "content_asset_engram_tokenizer"
},
"content_asset_engram_analyzer": {
"filter": [ "lowercase" ],
"type": "custom",
"tokenizer": "content_asset_engram_tokenizer"
}
},
"tokenizer": {
"content_asset_engram_tokenizer": {
"token_chars": [ "letter", "digit" ],
"min_gram": "3",
"type": "ngram",
"max_gram": "6"
}
}
}
}
}
}
- Add document to the index
POST http://localhost:9200/{indexname}/_doc
{
"category": {
"id": 3962,
"name": "Content Builder",
"parentId": 0
}
}
- Search
GET http://localhost:9200/{indexname}/_search
{
"query": {
"match": {
"category.name.analyzed": "content"
}
}
}