I've recently started supporting several Elasticsearch 2.4 clusters and am seeing issues caused by clients submitting bad queries.
What typically happens is that a bad query comes in and sends the majority of the data nodes in the cluster into GC hell. When that happens, the only fix is to bounce the nodes and wait for the cluster to go green again.
Unfortunately, upgrading is not currently an option (it will happen eventually), so as far as I can tell I can't use some of the nicer defensive features in later versions, such as automatically killing queries.
What advice does the community have for this situation? So far I am looking into catching bad queries on the client side before they even get submitted to ES (and ultimately fixing them), but I'm curious what options there might be on the ES side as well.