Hi all,
I have a general question about query execution time.
I'm using Kibana to query an Elasticsearch server.
Regardless of the server hardware (I tried three different ES clusters) and regardless of the query, I noticed that:
- Two identical queries executed a long time apart both take a long time.
- If they are executed close together, the second query takes very little time.
- This happens even if the cache is explicitly cleared before executing the second query.
Some numbers from a real case:
I'm querying ES for the document count at 1-day intervals, with two further sub-aggregations:
POST monthly_myindex_20*/_search?request_cache=false
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "tratta": {
              "value": "872"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "dayAggs": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1d",
        "min_doc_count": 0
      },
      "aggs": {
        "fiel1Aggs": {
          "terms": {
            "field": "field1",
            "size": 10
          },
          "aggs": {
            "fiel2Aggs": {
              "terms": {
                "field": "field2",
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
- Query executed for the first time that day: took = 16366 ms
Cache cleared:
POST /_cache/clear
- Query re-executed after a few seconds: took = 1110 ms
Cache cleared:
POST /_cache/clear
- Query re-executed after a few seconds: took = 324 ms
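In case it helps anyone reproduce this, here is a rough Python sketch of how the measurement could be scripted instead of run by hand in Kibana. It is only an illustration: the URL `http://localhost:9200`, the absence of authentication, and the helper names `send`, `measure_took` and `summarize` are my assumptions, not something from my real setup. Unlike my manual test, it clears the cache before every run, including the first.

```python
import json
import statistics
import urllib.request

ES_URL = "http://localhost:9200"        # assumption: local cluster, no auth
INDEX_PATTERN = "monthly_myindex_20*"   # index pattern from the query above


def send(method, path, body=None):
    """Send one HTTP request to Elasticsearch and return the parsed JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def measure_took(query, runs=3, send=send):
    """Clear the caches, run the search, and collect `took` (ms) for each run."""
    tooks = []
    for _ in range(runs):
        send("POST", "/_cache/clear")
        reply = send(
            "POST", f"/{INDEX_PATTERN}/_search?request_cache=false", query
        )
        tooks.append(reply["took"])
    return tooks


def summarize(tooks):
    """Reduce a list of `took` values to a few comparable numbers."""
    return {
        "first": tooks[0],
        "min": min(tooks),
        "max": max(tooks),
        "median": statistics.median(tooks),
    }
```

Calling `summarize(measure_took(query))` with the request body above as a Python dict would give the first, minimum, maximum and median took in one place. The `send` parameter can be swapped for a stub to try the logic without a cluster.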
All indexes involved in the query are hot.
I suppose that factors other than the cache affect the response time, but I have no idea what they are.
With took varying this much, it is very difficult to tune the query.
Can anyone help me understand this behavior?
Thanks