Bool Query: Order of Execution?

Hello!

I have an Elasticsearch cluster with daily indices that back some history functions. For another function, I only need the data from the last hour.

My query for this looks something like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "now-1h"
            }
          }
        },
        {
          "range": {
            "field1": {
              "lte": %{some_number}
            }
          }
        },
        {
          "range": {
            "field2": {
              "gte": %{some_other_number}
            }
          }
        },
        {
          "match": {
            "textfield": "%{some_text}"
          }
        }
      ]
    }
  }
}

Only documents that match all four clauses should be returned, and that works great. However, I have noticed that all indices get queried, not only the newest one, which causes quite a bit of load. According to this, the order in which the clauses are executed is determined by Elasticsearch itself.

Can I influence this execution order somehow? For example, I would like the @timestamp range clause to be executed first. Maybe with a nested bool query? If so, how would that work? I don't see the point of querying all of the indices when the documents I need are clearly in the newest one (or two).

So I made a workaround. It doesn't answer my initial question, though.

The initial problem was that all indices were queried. I have daily indices like:

index-...
index-2019.11.03
index-2019.11.04
index-2019.11.05

This happened because the @timestamp range clause was executed later than some of the other clauses in the bool query.

To mitigate this, I'm now querying only the two newest indices (to cover the edge cases around midnight), using date math in the index names. In my case, the elasticsearch filter in Logstash now looks something like this:

elasticsearch{
   hosts => ["es1:9200", "es2:9200", "es3:9200"]
   index => ["<index-{now/d}>,<index-{now-1d/d}>"]
   ...
}

This reduced the load on my cluster massively.
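
For anyone who wants to see which index names the two date-math expressions end up pointing at, here is a minimal Python sketch. It is only an illustration, not part of my Logstash setup: the function name recent_daily_indices and its parameters are made up for this example, and it assumes the date math is resolved in UTC with the default yyyy.MM.dd format, which matches the index names listed above.

```python
from datetime import datetime, timedelta, timezone


def recent_daily_indices(now=None, prefix="index-", days=2):
    """Return the names of the `days` most recent daily indices, newest first.

    Hypothetical helper: mirrors <index-{now/d}> and <index-{now-1d/d}>,
    assuming UTC and the yyyy.MM.dd date format.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return [
        prefix + (now - timedelta(days=offset)).strftime("%Y.%m.%d")
        for offset in range(days)
    ]


# Half an hour after midnight, the "last hour" window still reaches into
# the previous day's index, which is why two indices are queried.
just_after_midnight = datetime(2019, 11, 5, 0, 30, tzinfo=timezone.utc)
print(",".join(recent_daily_indices(just_after_midnight)))
# prints: index-2019.11.05,index-2019.11.04
```

With days=2 the result corresponds to the two entries in the index setting above; all older daily indices are simply never addressed by the search.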
Hope this helps someone with a similar issue!
