Query_string performance issue

Hi!

I originally posted this issue in the Kibana category, but I was asked to post it here as well:
Kibana: Query_string performance

We recently upgraded Elasticsearch and Kibana from 5.3.0 to 5.4.0, and since then we have been experiencing performance issues with dashboards and visualizations. Here is the query that Kibana generates for a visualization on the .monitoring-es-2-* index:

{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "*"
          }
        },
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "*"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": 1495455394174,
              "lte": 1495541794174,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "must_not": []
    }
  },
  "size": 0,
  "_source": {
    "excludes": []
  },
  "aggs": {
    "2": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "30m",
        "time_zone": "Europe/Berlin",
        "min_doc_count": 1
      },
      "aggs": {
        "3": {
          "terms": {
            "field": "source_node.name",
            "size": 30,
            "order": {
              "1": "desc"
            }
          },
          "aggs": {
            "1": {
              "max": {
                "field": "node_stats.process.cpu.percent"
              }
            }
          }
        }
      }
    }
  },
  "version": true,
  "highlight": {
    "pre_tags": [
      "@kibana-highlighted-field@"
    ],
    "post_tags": [
      "@/kibana-highlighted-field@"
    ],
    "fields": {
      "*": {
        "highlight_query": {
          "bool": {
            "must": [
              {
                "query_string": {
                  "analyze_wildcard": true,
                  "query": "*",
                  "all_fields": true
                }
              },
              {
                "query_string": {
                  "analyze_wildcard": true,
                  "query": "*",
                  "all_fields": true
                }
              },
              {
                "range": {
                  "timestamp": {
                    "gte": 1495455394174,
                    "lte": 1495541794174,
                    "format": "epoch_millis"
                  }
                }
              }
            ],
            "must_not": []
          }
        }
      }
    },
    "fragment_size": 2147483647
  }
}

Side note: the query_string clause appears twice in both the query and the highlight_query.

To analyze the problem, we copied the query into the dev tools (Sense) and executed it.
Running the exact same query results in a response time of 10+ seconds.
Removing the query_string blocks results in a response time of 1+ seconds.
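
For anyone who wants to reproduce this comparison, here is a minimal Python sketch that produces the second variant of the request by dropping the match-all query_string clauses from the top-level bool query. The helper names (is_match_all_query_string, strip_match_all) are made up for illustration, and the body is a trimmed copy of the request above without the aggregations and highlight section. It only builds the request body; sending it to the cluster and timing it is left out.

```python
import copy
import json


def is_match_all_query_string(clause):
    """Return True for a query_string clause whose query is just "*"."""
    qs = clause.get("query_string")
    return isinstance(qs, dict) and qs.get("query") == "*"


def strip_match_all(body):
    """Return a copy of a search body without match-all query_string clauses.

    Only the top-level bool "must" list is touched; everything else
    (size, aggregations, highlight) is copied unchanged.
    """
    stripped = copy.deepcopy(body)
    must = stripped.get("query", {}).get("bool", {}).get("must", [])
    must[:] = [c for c in must if not is_match_all_query_string(c)]
    return stripped


# Trimmed version of the request from this post (aggregations omitted).
kibana_body = {
    "query": {
        "bool": {
            "must": [
                {"query_string": {"analyze_wildcard": True, "query": "*"}},
                {"query_string": {"analyze_wildcard": True, "query": "*"}},
                {"range": {"timestamp": {
                    "gte": 1495455394174,
                    "lte": 1495541794174,
                    "format": "epoch_millis",
                }}},
            ],
            "must_not": [],
        }
    },
    "size": 0,
}

# Print the variant that keeps only the range clause.
print(json.dumps(strip_match_all(kibana_body), indent=2))
```

Both bodies can then be pasted into the dev tools and compared using the "took" value in each response.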

Since we experience the problem in Kibana, I decided to post it in this category, but it could also be a problem in Elasticsearch. What do you think?

I quickly compared the results with and without the query_string and I don't see any difference. Why is the query_string needed/used?

Thanks in advance for your help!

We found a workaround for the problem: we disabled the _all field and set index.query.default_field, which avoids searching every "queryable" field in the mapping.
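
A rough sketch of what such a configuration might look like as an index creation body, assuming the 5.x mapping syntax in which _all is disabled per mapping type. The field name "message" and the type name "doc" are placeholders, not values from our setup, and the helper workaround_index_body is hypothetical. For indices created automatically, the same settings would more likely have to go into an index template.

```python
import json


def workaround_index_body(default_field, doc_type="doc"):
    """Build an index creation body that disables _all and sets a default field.

    default_field and doc_type are placeholders; replace them with a field
    and mapping type that exist in your own index.
    """
    return {
        "settings": {
            # Equivalent to the flat setting index.query.default_field
            "index": {"query": {"default_field": default_field}}
        },
        "mappings": {
            # 5.x syntax: _all is switched off per mapping type
            doc_type: {"_all": {"enabled": False}}
        },
    }


body = workaround_index_body("message")
print(json.dumps(body, indent=2))
```

With a default field set, a query_string without an explicit field should only have to look at that one field instead of expanding to every field in the mapping.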

Hi @baltendo

I was wondering, can you see how the query is being rewritten with and without the query_string parts? You should be able to do (replace yourindex with your index name):

POST /yourindex/_validate/query?rewrite&explain
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "*"
          }
        },
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "*"
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": 1495455394174,
              "lte": 1495541794174,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "must_not": []
    }
  }
}

And then the same thing without the query_string parts:

POST /yourindex/_validate/query?rewrite&explain
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": 1495455394174,
              "lte": 1495541794174,
              "format": "epoch_millis"
            }
          }
        }
      ],
      "must_not": []
    }
  }
}

Can you run those and paste the results here so I can see how the query is being rewritten? When I run this locally, the query_string parts are rewritten as +*:* +*:*.

Hi @dakrone!

We fixed our performance issues in Kibana by specifying a default query field. I think this change also influences the data that you want me to post here, right?

Nevertheless, this is the result for the query_string approach:

{
      "index": "yourindex",
      "valid": true,
      "explanation": """+ConstantScore(_field_names:yourFieldName) +ConstantScore(_field_names:yourFieldName) +MatchNoDocsQuery["User requested "match_none" query."]"""
}

And for the approach without query_string:

{
      "index": "yourindex",
      "valid": true,
      "explanation": """MatchNoDocsQuery["User requested "match_none" query."]"""
}

To follow up on this, https://github.com/elastic/elasticsearch/pull/25726 has been opened and will address the performance issues with the query.