Hi,
We're seeing strange behavior with a terms aggregation.
The query below takes 1 to 1.5 seconds to execute:
GET index/type/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "query": {
              "value": "some search query"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "queries": {
      "terms": {
        "field": "query.keyword",
        "size": 10
      }
    }
  }
}
After a thorough investigation, we've found that the slowness comes from the aggregation on one specific field, query.keyword. The interesting thing is that the aggregation is equally slow whether the result set contains 40,000 documents or just 1: the runtime is practically the same.
However, the aggregation is fast when it runs on a different field, for example title (also a keyword field, see the mappings below). So it seems to be related to the query field and/or its subfields.
Here are the mappings:
...
"query": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
"title": {
  "type": "keyword"
}
...
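Both fields are of type keyword, so one difference that may be worth ruling out is the number of distinct values in each. A minimal sketch of how this could be checked with the cardinality aggregation, reusing the index/type placeholders from the query above. The aggregation names query_cardinality and title_cardinality are only illustrative, and the counts this aggregation returns are approximate:

```
# Sketch only: aggregation names are illustrative, counts are approximate
GET index/type/_search
{
  "size": 0,
  "aggs": {
    "query_cardinality": {
      "cardinality": { "field": "query.keyword" }
    },
    "title_cardinality": {
      "cardinality": { "field": "title" }
    }
  }
}
```

A large gap between the two numbers would not prove anything by itself, but it would show that the two keyword fields are not directly comparable.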
The index has 2 primary shards with 1 replica each (2p+1r) and a total of 9,992,927 documents.
Elasticsearch version: 5.3.2
Some statistics:
Query 1 - without aggregation
Returned docs: 1
Execution time: 34ms
Query 2 - without aggregation
Returned docs: 40,000
Execution time: 35ms
Query 1 - with aggregation on query.keyword
Returned docs: 1
Execution time: 1427ms
Query 2 - with aggregation on query.keyword
Returned docs: 40,000
Execution time: 1350ms
Query 1 - with aggregation on title
Returned docs: 1
Execution time: 56ms
Query 2 - with aggregation on title
Returned docs: 40,000
Execution time: 63ms
Question:
Why does aggregating on query.keyword slow the query down by the same amount regardless of how small the filtered result set is? The aggregation should only run on the result set, right?
Does anyone have any thoughts on this, or on how I can debug it further?
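As a starting point for further debugging, here is a sketch of two variations of the query above that might help narrow this down. Neither is a confirmed fix. The first sets "profile": true so that the response includes a per-shard timing breakdown for the query and the aggregation. The second sets "execution_hint": "map" on the terms aggregation, which asks Elasticsearch to bucket directly on the values of the matching documents instead of using global ordinals. That global ordinals on query.keyword are responsible for the fixed cost is an assumption here, not something we have measured.

```
# Variation 1: same request with profiling enabled
GET index/type/_search
{
  "profile": true,
  "size": 0,
  "query": {
    "term": { "query": { "value": "some search query" } }
  },
  "aggs": {
    "queries": {
      "terms": { "field": "query.keyword", "size": 10 }
    }
  }
}

# Variation 2: same request with the map execution hint
# (assumption: the fixed cost is tied to global ordinals)
GET index/type/_search
{
  "size": 0,
  "query": {
    "term": { "query": { "value": "some search query" } }
  },
  "aggs": {
    "queries": {
      "terms": {
        "field": "query.keyword",
        "size": 10,
        "execution_hint": "map"
      }
    }
  }
}
```

If the second variation comes back in roughly the same time as the aggregation on title, that would point toward the way ordinals are handled for query.keyword rather than toward the query itself.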