Query execution time

Hi all,

I have a general question about query execution time.

I'm using Kibana to query an Elasticsearch server.

Regardless of the server hardware (I tried three different ES clusters) and regardless of the query, I noticed that:

  • Two identical queries executed a long time apart both take a long time.
  • If they are executed close together, the second query takes very little time.
  • This happens even if the cache is explicitly cleared before executing the second query.

Some numbers from a real case:

I'm querying ES for the document count at 1-day intervals, with two nested sub-aggregations:

POST monthly_myindex_20*/_search?request_cache=false
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "tratta": {
            "value": "872"
          }
        }}
      ]
    }
  },
  "aggs": {
    "dayAggs": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1d",
        "min_doc_count": 0
      },
      "aggs": {
        "fiel1Aggs": {
          "terms": {
            "field": "field1",
            "size": 10
          },
          "aggs": {
            "fiel2Aggs": {
              "terms": {
                "field": "field2",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
  • Query executed for the first time in the day: took = 16366 ms

Cache cleared:
POST /_cache/clear

  • Query re-executed after a few seconds: took = 1110 ms

Cache cleared:
POST /_cache/clear

  • Query re-executed after a few seconds: took = 324 ms

All indexes involved in the query are hot.

I suppose factors other than the Elasticsearch caches affect the response time, but I don't know which ones.
With took varying this much, it is very hard to tune the query.

Can anyone help me explain this behavior?
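
To put numbers on the variation, the same request can be repeated and the took values recorded, keeping the first (possibly cold) run apart from the later ones. The following is only a minimal sketch: the address http://localhost:9200 is a placeholder, and the helper names run_query, summarize and measure are made up for illustration. It uses only the Python standard library and the _search endpoint shown above.

```python
import json
import statistics
import urllib.request

# Assumed values for illustration; adjust to your cluster.
ES_URL = "http://localhost:9200"        # hypothetical address, not from the post
INDEX_PATTERN = "monthly_myindex_20*"   # index pattern used in the query above


def run_query(body, es_url=ES_URL, index=INDEX_PATTERN):
    """Send one search with the request cache disabled; return `took` in ms."""
    url = f"{es_url}/{index}/_search?request_cache=false"
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["took"]


def summarize(took_ms):
    """Keep the first run separate from the later (warm) runs."""
    first, rest = took_ms[0], took_ms[1:]
    return {
        "first_ms": first,
        "warm_median_ms": statistics.median(rest) if rest else None,
        "warm_min_ms": min(rest) if rest else None,
        "warm_max_ms": max(rest) if rest else None,
    }


def measure(body, runs=5, query_fn=run_query):
    """Run the same query `runs` times and summarize the `took` values."""
    return summarize([query_fn(body) for _ in range(runs)])
```

Calling measure(query_body) with the JSON body above (as a Python dict) returns one summary per batch of runs. Comparing warm_median_ms between query variants is likely to be more stable than comparing single took values.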
Thanks

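This is likely the effect of the filesystem cache filling up. The first run has to read the index files from disk, while later runs find them in the operating system's page cache. You can't clear that cache through Elasticsearch.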

Hello Simon,

Thanks for your answer.

I've been reading some of the Elasticsearch documentation, and it does seem that the filesystem cache plays a very important role in query execution time.

I'm also looking for a way to clear the filesystem cache under Linux.

Maybe you have a hint...

# sync; echo 1 > /proc/sys/vm/drop_caches

That should do the trick.
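
If you want to script this between benchmark runs, the same step can be done from Python. This is only a sketch: the function name drop_page_cache is made up for illustration, it has to run as root, and on a multi-node cluster it would need to run on every data node. Writing 1 drops the page cache, 2 drops dentries and inodes, and 3 drops both.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"  # Linux-only kernel interface


def drop_page_cache(path=DROP_CACHES, mode="1"):
    """Flush dirty pages, then ask the kernel to drop clean cached data.

    mode "1": page cache, "2": dentries and inodes, "3": both.
    Needs root when `path` is the real /proc file.
    """
    if mode not in ("1", "2", "3"):
        raise ValueError("mode must be '1', '2' or '3'")
    os.sync()  # same role as `sync` in the one-liner above
    with open(path, "w") as f:
        f.write(mode + "\n")
```

Calling drop_page_cache() before each run of the query should bring took back close to the cold, first-of-the-day value, which gives a repeatable worst case to tune against.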
