[SOLVED] Painless, shards and multi-threading

cbismuth · March 3, 2018, 12:45pm

Hi,

Let's say we have:

5 nodes
10 shards per index (with 1 replica)
10 vCPUs per node (and 16 GB RAM with an 8 GB heap)

Will shard-scoped Painless script executions be multi-threaded on each node?

E.g. will node 1 execute the Painless script for shards 1 and 2 in parallel? Or will node 1 execute the Painless script for shard 1 and then for shard 2?

Thank you,
Christophe

rjernst · March 3, 2018, 4:34pm

All scripts (regardless of language) run in the thread of the calling context. So, for example, with a scoring script in a search, the script would be run on each shard independently, during query execution.
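
A rough illustration of that model, as a minimal sketch rather than actual Elasticsearch code: each shard-level search is submitted as a task to a thread pool, and the script is an ordinary function call inside that task, so it runs on whichever pool thread picked up the shard. The names `score_script`, `search_shard` and `search`, and the pool size, are made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def score_script(doc):
    # Stand-in for a scoring script: it runs in whatever thread calls it.
    return doc * 2, threading.current_thread().name


def search_shard(shard_id, docs):
    # One shard-level search task; the script is invoked inline, per document.
    results = [score_script(d) for d in docs]
    threads = {name for _, name in results}
    return shard_id, sum(score for score, _ in results), threads


def search(shards, pool_size=4):
    # Hypothetical pool size; each shard is searched as an independent task.
    with ThreadPoolExecutor(max_workers=pool_size, thread_name_prefix="search") as pool:
        futures = [pool.submit(search_shard, sid, docs) for sid, docs in shards.items()]
        return [f.result() for f in futures]


if __name__ == "__main__":
    for shard_id, total, threads in search({1: [1, 2, 3], 2: [4, 5, 6]}):
        print(shard_id, total, threads)
```

In this sketch, two shards on the same node can be processed at the same time as long as the pool has free threads, while all script calls for a given shard stay on that shard's thread.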

cbismuth · March 4, 2018, 10:39am

Alright, thank you Ryan.

Do you have any idea why we can't get our CPU cores to work at 100%?

Disk I/O, network I/O and Java heap size don't seem to be bottlenecks.

We don't understand why CPU resources aren't fully consumed.

Thank you,
Christophe

cbismuth · March 5, 2018, 6:37am

Another question.

As you can see on the graphs below, nodes 6, 7, 8 and 9 always consume 15% of CPU without any user activity. Node 5 (the indexing node) doesn't.

Is there any reason why?

Thanks again,
Christophe

cbismuth · March 6, 2018, 9:05am

We've enabled profiling from the Java DSL.

I don't understand why a ConstantScoreQuery has create_weight and build_scorer phases. Is there a reason for that?
I thought a ConstantScoreQuery was a way to disable scoring completely, wasn't it?

Thank you,
Chris

{
    "type": "ConstantScoreQuery",
    "description": "ConstantScore(+catalogId:{1166 1220 1232 1234 1235 1237 1238 1239 1240 1241 1242 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1274 1275 1278 1279 1280 1289 1291 1404 1405 1406 1408 1409 1410 1411 1412 1413 1414 1416 1417 1418 1419 1420 1477 1478 1479 1480 1482 1541 1546 1576 1645 1750 1751 1752 1753 2454 2529 2530 2531 2532 2533 2578 2677 5293 5303 8172 8228 8229 8230 8231 8232 8233 8235 8236 8237 8238 8239 8240 8241 8242} +ConstantScore(DocValuesFieldExistsQuery [field=price]))",
    "timings": {
        "score": 0,
        "build_scorer_count": 3,
        "match_count": 0,
        "create_weight": 4868,
        "next_doc": 12362443,
        "match": 0,
        "create_weight_count": 1,
        "next_doc_count": 138653,
        "score_count": 0,
        "build_scorer": 24261,
        "advance": 1224,
        "advance_count": 1
    }
}
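
To put those numbers in proportion, here is a small sketch that computes each phase's share of the total. It assumes the breakdown values are wall-clock nanoseconds, as the Profile API reports them, and treats the `*_count` entries as invocation counts. The helper name `phase_shares` is made up for this example.

```python
# Timings copied from the profile output above (assumed to be nanoseconds).
TIMINGS = {
    "score": 0,
    "build_scorer_count": 3,
    "match_count": 0,
    "create_weight": 4868,
    "next_doc": 12362443,
    "match": 0,
    "create_weight_count": 1,
    "next_doc_count": 138653,
    "score_count": 0,
    "build_scorer": 24261,
    "advance": 1224,
    "advance_count": 1,
}


def phase_shares(timings):
    # Keep only the time entries; "*_count" entries are invocation counts.
    times = {k: v for k, v in timings.items() if not k.endswith("_count")}
    total = sum(times.values())
    # Sort phases from most to least expensive.
    ordered = sorted(times.items(), key=lambda kv: kv[1], reverse=True)
    return total, {name: ns / total for name, ns in ordered}


if __name__ == "__main__":
    total_ns, shares = phase_shares(TIMINGS)
    print(f"total: {total_ns / 1e6:.2f} ms")
    for name, share in shares.items():
        print(f"{name:>14}: {share:7.3%}")
```

Under that assumption, next_doc accounts for more than 99% of the roughly 12.4 ms total, score is 0, and create_weight plus build_scorer together stay well below 1%.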

cbismuth · March 6, 2018, 10:51am

Hmm... it's probably something in our infrastructure: the total aggregation time is 4062 ms, while the shard max time is 796 ms.

Investigating ...
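
The gap between those two figures can be worked out as below. This is a quick sketch that assumes shards are searched in parallel, so the slowest shard bounds the shard-side time; the function name `time_outside_shards` is made up for this example.

```python
def time_outside_shards(total_ms, shard_max_ms):
    # Time not explained by the slowest shard, and its share of the total.
    gap_ms = total_ms - shard_max_ms
    return gap_ms, gap_ms / total_ms


if __name__ == "__main__":
    gap_ms, share = time_outside_shards(total_ms=4062, shard_max_ms=796)
    print(f"{gap_ms} ms outside shard execution ({share:.1%} of the total)")
```

With the figures above, that leaves 3266 ms, about 80% of the total, spent somewhere other than in shard-level execution.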

cbismuth · March 6, 2018, 1:52pm

It seems to be related to network bandwidth limitations.

