[SOLVED] Thread pool for Painless aggregation

cbismuth · February 22, 2018, 4:21pm

Hi,

Our CPUs are 46% idle while executing heavy Painless aggregations, and we would like to put that idle CPU time to use.

Is there a thread pool we can configure to do so?

Thank you,
Christophe

rjernst · February 22, 2018, 5:14pm

Painless does not have its own threadpool. It operates within the thread of whatever context is calling it. For aggregations, that would be the search threadpool.

However, are your search requests being throttled? If you are running few but heavy requests, increasing the number of threads will not help. Instead, you probably want to find the bottleneck. For example, depending on the aggregation, memory can be a limiting factor. You can check many of the relevant stats with the stats API.
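To make the "are your requests being throttled?" check concrete, here is a minimal Python sketch that summarizes the `search` thread pool from a nodes stats response (the kind returned by `GET _nodes/stats/thread_pool`). The helper names `summarize_search_pool` and `is_throttled`, and the sample node names and numbers, are hypothetical and only for illustration. Treating any queued or rejected task as a sign of throttling is an assumption, not an official threshold.

```python
def summarize_search_pool(nodes_stats):
    """Return one summary row per node for the `search` thread pool."""
    rows = []
    for node_id, node in nodes_stats.get("nodes", {}).items():
        pool = node.get("thread_pool", {}).get("search", {})
        rows.append({
            "node": node.get("name", node_id),
            "threads": pool.get("threads", 0),
            "active": pool.get("active", 0),
            "queue": pool.get("queue", 0),
            "rejected": pool.get("rejected", 0),
        })
    return rows


def is_throttled(row):
    # Assumption: queued or rejected tasks mean requests are waiting on the pool.
    return row["queue"] > 0 or row["rejected"] > 0


# Hypothetical response, trimmed to the fields used above.
sample = {
    "nodes": {
        "a1": {"name": "node-1", "thread_pool": {"search": {
            "threads": 7, "active": 2, "queue": 0, "rejected": 0}}},
        "b2": {"name": "node-2", "thread_pool": {"search": {
            "threads": 7, "active": 7, "queue": 12, "rejected": 3}}},
    }
}

for row in summarize_search_pool(sample):
    status = "throttled" if is_throttled(row) else "ok"
    print(row["node"], row["active"], "/", row["threads"], "active,", status)
```

If no node shows queued or rejected search tasks while CPU stays idle, the search pool itself is probably not the bottleneck, which matches the advice above to look elsewhere.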

cbismuth · February 22, 2018, 5:25pm

Thank you. We've increased RAM and heap from 16 GB (8 GB heap) to 32 GB (16 GB heap) without any performance improvement.

We profiled our searches and aggregations with the Elasticsearch Java DSL profile API. The search phase is fast, but the aggregations are roughly 5x slower.

That's why we think we should increase shard count and node count.

rjernst · February 22, 2018, 5:47pm

Increasing the number of shards and nodes might help. It depends on how many shards/nodes you currently have, and again, what types of aggregations you are running. Some are more memory intensive than others.

cbismuth · February 23, 2018, 9:33am

We have one index of 10 GB with 10,000,000 documents, split into 5 shards with 1 replica across 3 nodes.

We plan to upgrade to 5 nodes.

Besides, when the cluster is idle, 2 nodes out of 3 consume 10% of CPU user time. Is there any reason for this?
With half the memory assigned (16 GB RAM and 8 GB heap), those two nodes consume 20% CPU user time while nothing is running against the cluster. We can't find out why. We are using Elasticsearch 6.1.1.
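
The shard layout described above (5 primary shards, 1 replica, 3 nodes, moving to 5) can be worked through with a little arithmetic. The Python sketch below is only an illustration: the helper `shard_layout` is hypothetical, it assumes the 10 GB figure refers to primary data only (the thread does not say), and it assumes that a single search request uses one thread per shard it touches, so one request can keep at most as many threads busy as there are primary shards.

```python
def shard_layout(primaries, replicas, nodes, index_size_gb):
    """Rough layout figures for one index spread evenly over a cluster."""
    copies = primaries * (1 + replicas)  # primaries plus all replica copies
    return {
        "shard_copies": copies,
        "copies_per_node": copies / nodes,
        # Assumption: index_size_gb counts primary data only.
        "gb_per_primary": index_size_gb / primaries,
        # Assumption: one search thread per shard per request.
        "max_threads_per_request": primaries,
    }


current = shard_layout(primaries=5, replicas=1, nodes=3, index_size_gb=10)
planned = shard_layout(primaries=5, replicas=1, nodes=5, index_size_gb=10)

print("current:", current)
print("planned:", planned)
```

Under these assumptions, a few heavy requests against 5 shards cannot occupy many cores at once, which would be consistent with the idle CPU reported at the start of the thread.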

Bernt_Rostad · February 23, 2018, 10:47am

You could try experimenting with the execution hint in your aggregations.

Last year my company struggled with a very slow aggregation query across 100+ million documents that took minutes to complete. I managed to improve it almost 100-fold by setting "execution_hint": "map", so it could be worth your effort.
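
For reference, here is a minimal Python sketch of a request body with the hint set on a `terms` aggregation. The helper `terms_agg`, the aggregation name `by_category`, and the field `category` are hypothetical placeholders, not taken from the queries discussed in this thread. Whether `map` helps depends on the data and the query, so it is worth measuring both with and without the hint.

```python
import json


def terms_agg(name, field, size=10, execution_hint=None):
    """Build a search body containing a single terms aggregation."""
    terms = {"field": field, "size": size}
    if execution_hint is not None:
        # e.g. "map" or "global_ordinals"
        terms["execution_hint"] = execution_hint
    # "size": 0 skips returning hits, so only aggregation results come back.
    return {"size": 0, "aggs": {name: {"terms": terms}}}


body = terms_agg("by_category", "category", execution_hint="map")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a `_search` request with whichever client you already use.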

cbismuth · February 23, 2018, 10:48am

Nice, thank you Bernt!

cbismuth · March 3, 2018, 12:48pm

Thank you both @rjernst and @Bernt_Rostad, we've decreased vCPU count per node and increased node count.

The search thread pools are now used efficiently.

system · March 31, 2018, 12:48pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
