Due to the nature of the data I work with, I have to create 5-minute buckets and then, inside each bucket, calculate the count of unique values (cardinality).
Until recently those unique values wouldn't exceed 2-3K per bucket, but with increased traffic the cardinality has grown and can now reach 15K.
After looking at the cardinality aggregation I saw that the default precision_threshold is 3000; with that default, my query takes around 10 seconds to complete.
I tried using 100 as the precision_threshold and the query went down to 4 seconds, which is much better, but still not great for my use case.
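For reference, the query shape described would look roughly like this (field names are placeholders; depending on your Elasticsearch version you may need `interval` instead of `fixed_interval`). The precision_threshold controls the accuracy/memory trade-off of the HyperLogLog++ sketch behind the cardinality aggregation:

```json
{
  "size": 0,
  "aggs": {
    "per_5m": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "5m"
      },
      "aggs": {
        "unique_values": {
          "cardinality": {
            "field": "user_id",
            "precision_threshold": 3000
          }
        }
      }
    }
  }
}
```

Counts below the threshold are close to exact; above it, they are approximate. Lowering the threshold trades accuracy for speed and memory, which matches the 10 s → 4 s observation.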
From trying several queries I understood that the response time is a function of how high the cardinality is in the buckets I'm computing, which makes sense.
What I don't completely understand is this: if each shard responds in ~500-900 ms (I profiled the query), how come the full response takes 10 seconds? Shouldn't the per-shard queries execute in parallel and then get aggregated?
Also, since this query takes the same time every time I execute it, I assume the aggregations are not being cached, even though I checked the request caches for those indices and they were increasing after the first query.
Yes, that's correct.
I just had a flash... I have 10 nodes, each with 4 cores, and Elasticsearch assigns 7 search threads to each node.
With this query I'm trying to reach 31 indices * 5 shards = 155 shards in total.
I'm saying this because in my older cluster (which, by the way, was in a datacenter) I had 50 cores per node * 3 nodes = 150 search threads, but now I have 70 in total (new setup in the cloud).
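A rough back-of-the-envelope sketch (in Python; the 0.7 s per shard is just an assumed midpoint of the profiled 500-900 ms range, and this ignores coordination/merge overhead and any other queries competing for the pool) shows why the shard-to-thread ratio matters:

```python
import math

def est_query_time(total_shards, nodes, threads_per_node, per_shard_secs):
    """Back-of-the-envelope estimate: the shards living on one node are
    processed in 'waves' limited by the search thread pool size, so the
    total time is roughly waves * per-shard time."""
    shards_per_node = math.ceil(total_shards / nodes)
    waves = math.ceil(shards_per_node / threads_per_node)
    return waves * per_shard_secs

# New cloud cluster: 155 shards, 10 nodes, 7 search threads per node.
print(est_query_time(155, 10, 7, 0.7))  # ~16 shards/node -> 3 waves

# Old datacenter cluster: plenty of threads, everything ran in one wave.
print(est_query_time(150, 3, 50, 0.7))
```

Even this simplified model predicts a multiple of the per-shard latency on the new cluster, and queueing behind other concurrent searches would stretch it further.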
There is not much point in increasing the thread pool size unless you have the cores to back it up. The defaults are generally quite good. You may instead look at reducing the number of shards so that there are fewer to process, resulting in fewer tasks being queued up.
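A minimal sketch of that, assuming time-based indices matching a hypothetical `my-index-*` pattern: set `number_of_shards: 1` in the index template so new indices are created with a single shard (existing indices can be reduced with the `_shrink` API, which requires making them read-only and relocating their shards to one node first):

```json
PUT _template/my-template
{
  "index_patterns": ["my-index-*"],
  "settings": {
    "number_of_shards": 1
  }
}
```

With one shard per index, the same query would fan out to 31 shard tasks instead of 155, far closer to the 70 search threads available.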