Hello,
I noticed that some queries/aggregations are very slow relative to their size on Elasticsearch 2.4.
curl -XPOST 'localhost:9201/logstash_netflow_client_v12-2016.11.11/_search?pretty' -d'
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "query": "*",
          "analyze_wildcard": true
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "query_string": {
                  "query": "login:somelogin",
                  "analyze_wildcard": true
                }
              }
            },
            {
              "range": {
                "Timestamp": {
                  "gte": 1478873520648,
                  "lte": 1478877120648,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  }
}'
This request takes ~30 ms and returns 86 hits.
When I add this aggregation:
"size": 0,
"aggs": {
  "2": {
    "aggs": {
      "1": {
        "sum": {
          "field": "bits"
        }
      }
    },
    "terms": {
      "field": "talking",
      "size": 10,
      "order": {
        "1": "desc"
      }
    }
  }
}
the request takes ~10 s. Here, "talking" is a not_analyzed string field and "bits" is a numeric field. When I run the same aggregation on a numeric field (e.g. packets) instead of talking, it takes ~60 ms (which is expected).
When I run it on another string field (login), it only takes a few ms.
The format of login is content@content, whereas the format of talking is IP:PORT<->IP2:PORT2. Before, the separator was "" instead of "<->", but I suspected that character could be the cause of the slow response.
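For reference, here is a sketch of the full slow request as I assemble it: the same query as above, with "size": 0 and the "aggs" block added as top-level keys next to "query". The BODY variable is only a name I use here for illustration; the script just prints the body, and the commented curl line shows how it would be sent to the same endpoint as above.

```shell
# Full body of the slow request: original query + terms/sum aggregation.
# BODY is an illustrative variable name, not part of the original request.
BODY='{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "query_string": { "query": "*", "analyze_wildcard": true }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "query": {
                "query_string": { "query": "login:somelogin", "analyze_wildcard": true }
              }
            },
            {
              "range": {
                "Timestamp": {
                  "gte": 1478873520648,
                  "lte": 1478877120648,
                  "format": "epoch_millis"
                }
              }
            }
          ],
          "must_not": []
        }
      }
    }
  },
  "aggs": {
    "2": {
      "terms": {
        "field": "talking",
        "size": 10,
        "order": { "1": "desc" }
      },
      "aggs": {
        "1": { "sum": { "field": "bits" } }
      }
    }
  }
}'

# Print the body for inspection. To actually run it against the cluster:
#   echo "$BODY" | curl -XPOST 'localhost:9201/logstash_netflow_client_v12-2016.11.11/_search?pretty' -d @-
echo "$BODY"
```

Replacing "talking" with "packets" or "login" in the "terms" block gives the two faster variants I mentioned.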
So, does anyone have an idea of what the problem is here, or how I can troubleshoot it?
Thank you,
Regards,
Grégoire Leroy