[SOLVED] Thread pool for Painless aggregation


(Christophe Bismuth) #1

Hi,

We see 46% idle CPU while executing heavy Painless aggregations, and we would like to put that idle CPU time to use.

Is there a thread pool we can configure to do so?

Thank you,
Christophe


(Ryan Ernst) #2

Painless does not have its own thread pool. It runs within the thread of whatever context is calling it. For aggregations, that is the search thread pool.

However, are your search requests being throttled? If you are running few but heavy requests, increasing the number of threads will not help. Instead, you probably want to find the bottleneck. For example, depending on the aggregation, memory can be a limiting factor. You can check many of the relevant stats with the stats API.
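As a rough illustration of that check, here is a minimal Python sketch that scans a nodes stats response for signs of pressure on the search thread pool. The sample data is invented (node ids, names, and numbers are hypothetical), and it assumes the response was fetched with `GET _nodes/stats/thread_pool`; the helper name `search_pool_pressure` is also just for illustration.

```python
# Hypothetical excerpt of a `GET _nodes/stats/thread_pool` response.
# Node ids, names, and numbers are invented for illustration.
SAMPLE_STATS = {
    "nodes": {
        "node-a": {"name": "es-1", "thread_pool": {"search": {
            "threads": 7, "queue": 0, "active": 2, "rejected": 0}}},
        "node-b": {"name": "es-2", "thread_pool": {"search": {
            "threads": 7, "queue": 950, "active": 7, "rejected": 12}}},
    }
}


def search_pool_pressure(stats):
    """Return the names of nodes whose search pool is saturated or rejecting."""
    flagged = []
    for node in stats["nodes"].values():
        pool = node["thread_pool"]["search"]
        # Saturated: every thread is busy and requests are waiting in the queue.
        saturated = pool["active"] >= pool["threads"] and pool["queue"] > 0
        if saturated or pool["rejected"] > 0:
            flagged.append(node["name"])
    return sorted(flagged)


print(search_pool_pressure(SAMPLE_STATS))
```

If no node is flagged while CPU stays idle, the search thread pool is probably not the limiting factor and the bottleneck is likely elsewhere.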


(Christophe Bismuth) #3

Thank you, we've increased RAM and heap from 16 GB (8 GB heap) to 32 GB (16 GB heap) without any performance improvement.

We profiled our searches and aggregations with the profile API of the Elasticsearch Java DSL. The search phase is fast, but aggregations are approximately 5x slower.

That's why we think we should increase shard count and node count.


(Ryan Ernst) #4

Increasing the number of shards and nodes might help. It depends on how many shards/nodes you currently have, and again, what types of aggregations you are running. Some are more memory intensive than others.


(Christophe Bismuth) #5

We have a 10 GB index with 10,000,000 documents, split into 5 shards with 1 replica, over 3 nodes.

We plan to upgrade to 5 nodes.
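To make the arithmetic behind that plan explicit, here is a small Python sketch. It assumes shard copies are spread as evenly as possible across nodes, which is what the allocator aims for but is not guaranteed; the function name is hypothetical.

```python
def shard_copies_per_node(primaries, replicas, nodes):
    """Spread all shard copies (primaries plus replicas) as evenly as possible."""
    total = primaries * (1 + replicas)
    base, extra = divmod(total, nodes)
    # `extra` nodes hold one more copy than the others.
    return [base + 1] * extra + [base] * (nodes - extra)


print(shard_copies_per_node(5, 1, 3))  # [4, 3, 3]
print(shard_copies_per_node(5, 1, 5))  # [2, 2, 2, 2, 2]
```

With 3 nodes, the 10 shard copies cannot be split evenly, while 5 nodes would hold 2 copies each.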

Besides, when the cluster is idle, 2 nodes out of 3 consume 10% CPU user time. Is there any reason for that?
With half the memory assigned (16 GB RAM and 8 GB heap), those two nodes consume 20% CPU user time while nothing is running against the cluster. We can't find out why. We are using Elasticsearch 6.1.1.


(Bernt Rostad) #6

You could try experimenting with the execution hint in your aggregations.

Last year my company struggled with a very slow aggregation query across 100+ million documents; it took minutes to complete. I managed to improve it almost 100-fold by setting "execution_hint": "map", so it could be worth your effort.
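For reference, here is a minimal Python sketch of where that setting goes in a request body, assuming a terms aggregation. The field name `category` and the helper `terms_agg_body` are hypothetical; only the `execution_hint` parameter itself comes from the advice above.

```python
import json


def terms_agg_body(field, execution_hint=None, size=10):
    """Build a search body with a single terms aggregation on `field`."""
    terms = {"field": field, "size": size}
    if execution_hint is not None:
        # For example "map"; when omitted, Elasticsearch picks its default.
        terms["execution_hint"] = execution_hint
    # "size": 0 skips returning hits, since only the aggregation is needed.
    return {"size": 0, "aggs": {"by_" + field: {"terms": terms}}}


body = terms_agg_body("category", execution_hint="map")
print(json.dumps(body, indent=2))
```

The "map" hint tends to pay off when the query matches relatively few documents, so it is worth measuring with and without it on your own data.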


(Christophe Bismuth) #7

Nice, thank you Bernt!


(Christophe Bismuth) #8

Thank you both @rjernst and @Bernt_Rostad, we've decreased the vCPU count per node and increased the node count.

The search thread pools are now used efficiently.

