How to handle long-running queries

I've found a few older posts on this topic, but not much about long-running queries on newer versions of Elasticsearch.

Some background: I manage a 200+ node Elasticsearch cluster for my company, and we recently upgraded to version 6.2.4 across the board. It is also worth noting that our data nodes run on spinning disks. Our ES stack is the sole source of logs for developers and operations alike, so we get a wide range of queries, and some of them cause the cluster grief fairly frequently.

I have been testing ways to curtail the bad queries that can "lock up" the cluster. One approach has been setting search.default_search_timeout to 30s. However, the setting does not seem to apply to all searches: we still see queries returning after 60+ seconds, namely ones that originate from Kibana (most of our users go through Kibana rather than the API directly). Are there additional settings that can help us stop bad queries from Kibana, or at least limit their impact? Perhaps we have missed something in the documentation.

Here are the current cluster configuration settings:

{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "disk" : {
            "threshold_enabled" : "true",
            "watermark" : {
              "low" : "90%",
              "flood_stage" : "95%",
              "high" : "90%"
            }
          }
        }
      }
    }
  },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "balance" : {
            "threshold" : "100"
          },
          "enable" : "all",
          "same_shard" : {
            "host" : "false"
          }
        }
      }
    },
    "indices" : {
      "recovery" : {
        "max_bytes_per_sec" : "200mb"
      }
    },
    "search" : {
      "default_search_timeout" : "30s"
    },
    "logger" : {
      "_root" : "INFO"
    }
  }
}
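
For reference, here is a minimal sketch of how the transient search.default_search_timeout shown above can be applied through the cluster settings API, plus how the same limit can be attached to a single search through the "timeout" field in the request body. The node address (localhost:9200) and the helper names build_timeout_request and with_request_timeout are assumptions made for illustration, not part of our actual tooling.

```python
import json
import urllib.request

# Assumption: a node reachable at this address; adjust for the real cluster.
ES_URL = "http://localhost:9200"


def build_timeout_request(timeout="30s", base_url=ES_URL):
    """Build (but do not send) a PUT /_cluster/settings request that sets
    search.default_search_timeout as a transient setting."""
    body = {"transient": {"search.default_search_timeout": timeout}}
    return urllib.request.Request(
        url=base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        # Elasticsearch 6.x rejects request bodies without a Content-Type.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def with_request_timeout(search_body, timeout="30s"):
    """Return a copy of a search body with a per-request timeout added.
    (Hypothetical helper; "timeout" is the standard search body field.)"""
    body = dict(search_body)
    body["timeout"] = timeout
    return body


if __name__ == "__main__":
    # Sends the settings update to the assumed node and prints the response.
    with urllib.request.urlopen(build_timeout_request()) as resp:
        print(resp.read().decode("utf-8"))
```

Either form only asks Elasticsearch to bound the search; whether a given client such as Kibana ends up honoring it is exactly what I am unsure about.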

Thank you for your help.
