Does filter order still matter in 5.6?


(Joe C) #1

https://stackoverflow.com/questions/34727193/elasticsearch-order-of-filters-for-best-performance and https://www.elastic.co/blog/better-query-execution-coming-elasticsearch-2-0 say that 'modern' (>= 2.x) Elasticsearch ignores the filter order defined by the user and schedules the filters itself in the most efficient order.

But this does not match what I profiled. Can the profiler give incorrect results?

The profile API shows that the original query below has a cumulative time of 115.735ms.

GET index/type/_search?human=true
{
  "profile": true,
  "query": {
    "bool": {
      "must": {
        "function_score": {
          "query": {
            "bool": {
              "must": {
                "query_string": {
                  "query": "foo", 
                  "fields": ["a", "b"], 
                  "default_operator": "OR", 
                  "analyzer": "default",
                  "auto_generate_phrase_queries": false, 
                  "lenient": true, "boost": 1.0
                }
              },
              "filter": {
                "bool": {
                  "must": [{"terms": {"ids": [1234]}}], 
                  "must_not": [{"term": {"is_bar": true}}]
                }
              }
            }
          }, 
          "functions": [
            {"field_value_factor": {"field": "abc"}}
          ],
          "score_mode": "multiply"
        }
      },
      "filter": {
        "bool": {
          "must_not": {
            "exists": {"field": "some_field"}
          }
        }
      }
    }
  }, 
  "from": 0, 
  "size": 20, 
  "timeout": "500ms"
}

If I place the filters before the other clauses, the profiled time drops to 54.816ms. Is this an artifact of the profiler?

GET index/type/_search?human=true
{
  "profile": true,
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must_not": {
            "exists": {"field": "some_field"}
          }
        }
      },
      "must": {
        "function_score": {
          "query": {
            "bool": {
              "filter": {
                "bool": {
                  "must": [{"terms": {"ids": [1234]}}],
                  "must_not": [{"term": {"is_bar": true}}]
                }
              },
              "must": {
                "query_string": {
                  "query": "foo",
                  "fields": ["a", "b"],
                  "default_operator": "OR",
                  "analyzer": "default",
                  "auto_generate_phrase_queries": false,
                  "lenient": true, "boost": 1.0
                }
              }
            }
          },
          "functions": [
            {"field_value_factor": {"field": "abc"}}
          ],
          "score_mode": "multiply"
        }
      }
    }
  },
  "from": 0, 
  "size": 20, 
  "timeout": "500ms"
}
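
As far as I can tell, the two requests differ only in the order of the keys inside their JSON objects. A minimal Python sketch of that check is below. The bodies are shortened stand-ins for the real queries, and the helper name `canonical` is my own. It only shows that the two bodies parse to the same structure; it says nothing about how Elasticsearch schedules the clauses internally.

```python
import json

# Two abbreviated bool bodies that differ only in the order of their keys.
# These are shortened stand-ins for the queries above, not the full requests.
must_first = json.loads("""
{"bool": {
  "must": {"query_string": {"query": "foo", "fields": ["a", "b"]}},
  "filter": {"bool": {"must_not": {"exists": {"field": "some_field"}}}}
}}
""")

filter_first = json.loads("""
{"bool": {
  "filter": {"bool": {"must_not": {"exists": {"field": "some_field"}}}},
  "must": {"query_string": {"query": "foo", "fields": ["a", "b"]}}
}}
""")


def canonical(body):
    """Serialize with sorted keys so key order cannot affect the comparison."""
    return json.dumps(body, sort_keys=True)


# True when the two bodies are the same request apart from key order.
same_request = canonical(must_first) == canonical(filter_first)
print(same_request)
```

If this prints True for the full bodies as well, any timing difference has to come from somewhere other than the request content.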

(Joe C) #2

The first one was slow because of a temporary cluster issue: one node was slow at the time.

When I try the two queries now, they take almost the same time, both with and without the profile API.
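
For anyone comparing two query variants like this, a single run is easy to misread, as happened to me. Below is a rough sketch of a steadier measurement: run each variant several times and compare medians. The function `median_latency_ms` and the placeholder `fake_query` are hypothetical names of mine; in practice `run_query` would be whatever callable sends the search request with your client.

```python
import statistics
import time


def median_latency_ms(run_query, repeats=20, warmup=3):
    """Call run_query repeatedly and return the median wall-clock time in ms."""
    # Warm-up calls are not timed, so cold caches do not skew the samples.
    for _ in range(warmup):
        run_query()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive than the mean to one slow outlier,
    # such as a node that is temporarily struggling.
    return statistics.median(samples)


def fake_query():
    """Hypothetical stand-in for a function that sends one search request."""
    sum(range(1000))


print(median_latency_ms(fake_query))
```

Alternating the two variants between runs, rather than running all of one and then all of the other, should also help keep a transient cluster problem from landing on only one of them.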

We may want to delete my question.

