Why does an Elasticsearch 5.1 bool filter query run slower than an Elasticsearch 1.5 filtered query?

The following is my Elasticsearch 1.5 query:

{
    "_source":["_id","spotlight"],
    "query":{
        "filtered":{
            "filter":{
                "and":[
                    {"term":{"gender":"female"}},
                    {"range":{"lastlogindate":{"gte":"2016-10-19 12:39:57"}}}
                ]
            }
        }
    },
    "filter":{
        "and":[
            {"term":{"maritalstatus":"1"}}
        ]
    },
    "sort":[{"member2_dummy7":{"order":"desc"}}],
    "size":"600",
    "aggs": {
        "maritalstatus": {
            "filter": {},
            "aggs": {
                "filtered_maritalstatus": {"terms":{"field":"maritalstatus","size":5000}}
            }
        },
        "relationship": {
            "filter": {"term":{"maritalstatus":"1"}},
            "aggs": {
                "filtered_relationship": {"terms":{"field":"relationship","size":5000}}
            }
        }
    }
}

I converted the same query to Elasticsearch 5.1 as follows:

{
    "_source":["_id","spotlight"],
    "query": {
        "bool": {
            "filter": [
                {"term":{"gender":"female"}},
                {"range":{"lastlogindate":{"gte":"2016-10-19 12:39:57"}}}
            ]
        }
    },
    "post_filter": {"term":{"maritalstatus":"1"}},
    "sort":[{"member2_dummy7":{"order":"desc"}}],
    "size":"600",
    "aggs": {
        "maritalstatus": {
            "filter": {},
            "aggs": {
                "filtered_maritalstatus": {"terms":{"field":"maritalstatus","size":5000}}
            }
        },
        "relationship": {
            "filter": {"term":{"maritalstatus":"1"}},
            "aggs": {
                "filtered_relationship": {"terms":{"field":"relationship","size":5000}}
            }
        }
    }
}

I ran both queries on two different AWS instances with the same configuration, and found that the Elasticsearch 5.1 query takes exactly twice as long to execute as the Elasticsearch 1.5 query.

Can someone tell me why the Elasticsearch 5.1 query runs slower than the Elasticsearch 1.5 one? Is any further query optimization required to make it run faster?

It is not clear to me that the bool query is the issue here: many features that your query uses have changed significantly between 1.5 and 5.1, such as range queries on date fields, which now use a completely different data structure.

Can you check that your maritalstatus field is mapped as a keyword rather than a number?
Why do you use an empty filter aggregation? You could put your filtered_maritalstatus aggregation directly at the top level.
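
To illustrate the second point, here is a minimal Python sketch that moves sub-aggregations out of an empty filter wrapper. The helper name hoist_empty_filter_aggs is hypothetical, and the sketch assumes the empty filter is meant to match every document; note that the aggregation names in the response change as a result.

```python
import json


def hoist_empty_filter_aggs(aggs):
    """Return a copy of `aggs` where sub-aggregations wrapped in an empty
    filter aggregation ({"filter": {}}) are moved up one level."""
    hoisted = {}
    for name, body in aggs.items():
        if body.get("filter") == {} and "aggs" in body:
            # Assumption: the empty filter is intended to match everything,
            # so the wrapper can be dropped and its children promoted.
            hoisted.update(body["aggs"])
        else:
            hoisted[name] = body
    return hoisted


# The "aggs" section from the request in the question.
original_aggs = {
    "maritalstatus": {
        "filter": {},
        "aggs": {
            "filtered_maritalstatus": {
                "terms": {"field": "maritalstatus", "size": 5000}
            }
        },
    },
    "relationship": {
        "filter": {"term": {"maritalstatus": "1"}},
        "aggs": {
            "filtered_relationship": {
                "terms": {"field": "relationship", "size": 5000}
            }
        },
    },
}

simplified_aggs = hoist_empty_filter_aggs(original_aggs)
print(json.dumps(simplified_aggs, indent=2))
```

The relationship aggregation is left untouched because its filter is not empty.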

Could you help isolate the problem by starting from a match_all query with a size of 0 and adding the features of your request (size:600, aggs, post_filter, query) one by one, to see which one is the bottleneck driving the slowdown?
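
The isolation procedure could be scripted roughly as below. This is only a sketch: the function name isolation_steps is made up for illustration, the bodies are taken from the 5.1 request in the question, and sending each body to the cluster and timing it is left to whatever client you already use.

```python
import json

# Pieces of the 5.1 request from the question.
QUERY = {
    "bool": {
        "filter": [
            {"term": {"gender": "female"}},
            {"range": {"lastlogindate": {"gte": "2016-10-19 12:39:57"}}},
        ]
    }
}
POST_FILTER = {"term": {"maritalstatus": "1"}}
AGGS = {
    "maritalstatus": {
        "filter": {},
        "aggs": {
            "filtered_maritalstatus": {
                "terms": {"field": "maritalstatus", "size": 5000}
            }
        },
    },
    "relationship": {
        "filter": {"term": {"maritalstatus": "1"}},
        "aggs": {
            "filtered_relationship": {
                "terms": {"field": "relationship", "size": 5000}
            }
        },
    },
}


def isolation_steps():
    """Yield (label, request body) pairs, adding one feature per step."""
    body = {"query": {"match_all": {}}, "size": 0}
    yield "baseline: match_all, size 0", dict(body)
    additions = [
        ("+ size:600", {"size": 600}),
        ("+ aggs", {"aggs": AGGS}),
        ("+ post_filter", {"post_filter": POST_FILTER}),
        ("+ query", {"query": QUERY}),
    ]
    for label, change in additions:
        body.update(change)
        # Shallow copy so each yielded body is a snapshot of that step.
        yield label, dict(body)


for label, body in isolation_steps():
    print(label)
    print(json.dumps(body))
```

Timing each body in turn should show at which step the response time jumps.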

The following is the mapping in Elasticsearch 5.1:

{
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"]
                }
            }
        },
        "number_of_shards": 1,
        "number_of_replicas": 1
    },
    "mappings": {
        "profiles": {
            "properties": {
                "maritalstatus": {
                    "type": "text",
                    "fields": {
                        "raw": {
                            "type": "keyword",
                            "eager_global_ordinals": true
                        }
                    },
                    "analyzer": "lowercase_analyzer"
                },
                "relationship": {
                    "type": "text",
                    "fields": {
                        "raw": {
                            "type": "keyword",
                            "eager_global_ordinals": true
                        }
                    },
                    "analyzer": "lowercase_analyzer"
                },
                "gender": {
                    "type": "text",
                    "analyzer": "lowercase_analyzer"
                },
                "lastlogindate": {
                    "type": "date",
                    "format": "yyyy-MM-dd HH:mm:ss"
                }
            }
        }
    }
}

And below is the mapping in Elasticsearch 1.5:

{
    "settings": {
        "analysis": {
            "analyzer": {
                "string_lowercase": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": "lowercase"
                }
            }
        },
        "number_of_shards": 1,
        "number_of_replicas": 1
    },
    "mappings": {
        "profiles": {
            "properties": {
                "maritalstatus": {
                    "type": "integer"
                },
                "relationship": {
                    "type": "string",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    },
                    "analyzer": "string_lowercase"
                },
                "gender": {
                    "type": "string",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    },
                    "analyzer": "string_lowercase"
                },
                "lastlogindate": {
                    "type": "date",
                    "format": "yyyy-MM-dd HH:mm:ss"
                }
            }
        }
    }
}

I tried reducing size to 500 and then down to 100, but observed no change in response time. Elasticsearch 5.1 takes 1.5 times as long as Elasticsearch 1.5. Elasticsearch 1.5 executes the query very fast even with "size":5000. I don't understand what is causing the query to run slower in Elasticsearch 5.1.

What about removing aggs and the post filter? What is the simplest query that reproduces the slowdown?

I removed everything from the query, kept just one condition, and tested using JMeter.

The following is my Elasticsearch 1.5 query:

{
    "_source":["_id","spotlight"],
    "query":{
        "filtered":{
            "filter":{
                "and":[
                    {"term":{"gender":"female"}}
                ]
            }
        }
    }
}

The following is my Elasticsearch 5.1 query:

{
    "_source":["_id","spotlight"],
    "query": {
        "bool": {
            "filter": [
                {"term":{"gender":"female"}}
            ]
        }
    }
}

The Elasticsearch 1.5 query took 17 ms and the Elasticsearch 5.1 query took 23 ms. These are average times, and the difference is consistent across all fields. I don't understand why it runs slower.

I suspect the difference could be due to the fact that ES 1.5 will cache the term filter in that case, while 5.x doesn't do that anymore, since this strategy usually does not help with real-world workloads. Could you try to capture hot threads on both clusters while the query is running in a loop? https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-hot-threads.html
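
One way to do this is sketched below in Python using only the standard library. The function names are made up for illustration, and the base URL and search URL you pass in are assumptions about your setup; the hot threads endpoint (_nodes/hot_threads with its threads and interval parameters) is the one described at the link above.

```python
import json
import threading
import urllib.parse
import urllib.request


def hot_threads_url(base_url, threads=3, interval="500ms"):
    """Build the URL for the nodes hot threads API."""
    params = urllib.parse.urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{params}"


def run_query_loop(search_url, body, stop_event):
    """Send the same search request repeatedly until asked to stop."""
    data = json.dumps(body).encode("utf-8")
    while not stop_event.is_set():
        request = urllib.request.Request(
            search_url, data=data, headers={"Content-Type": "application/json"}
        )
        urllib.request.urlopen(request).read()


def capture_hot_threads(base_url, search_url, body, samples=5):
    """Run the query in a background loop and collect hot threads dumps."""
    stop_event = threading.Event()
    worker = threading.Thread(
        target=run_query_loop, args=(search_url, body, stop_event)
    )
    worker.start()
    try:
        # Each call returns a plain-text report of the busiest threads.
        return [
            urllib.request.urlopen(hot_threads_url(base_url)).read().decode("utf-8")
            for _ in range(samples)
        ]
    finally:
        stop_event.set()
        worker.join()
```

For example, capture_hot_threads("http://localhost:9200", "http://localhost:9200/my_index/_search", body) would run against a local node, where localhost:9200 and my_index are placeholders for your own host and index. Running it against both clusters with the single-term query gives two sets of dumps to compare.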
