[SOLVED] Painless, shards and multi-threading


(Christophe Bismuth) #1

Hi,

Let's say we have:

  • 5 nodes
  • 10 shards per index (with 1 replica)
  • 10 vCPUs per node (with 16 GB of RAM and an 8 GB heap)

Will shard-scoped Painless script executions be multi-threaded on each node?

For example, will node 1 run the Painless script for shards 1 and 2 in parallel? Or will it run the script for shard 1 and then for shard 2?

Thank you,
Christophe


(Ryan Ernst) #2

All scripts (regardless of language) run in the thread of the calling context. So, for example, with a scoring script in a search, the script would be run on each shard independently, during query execution.
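To make the answer above concrete, here is a toy model in Python, not Elasticsearch code. It assumes that each node dispatches its shard-level requests to a shared pool of worker threads; `search_node`, `shard_query` and the pool size of 4 are hypothetical names and values chosen for illustration. The "script" is simply whatever runs inside the thread that handles a given shard.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def search_node(local_shards, pool_size):
    """Toy model of one node: one task per local shard on a shared pool."""
    # The barrier forces all shard tasks to be in flight at the same time,
    # so the example fails loudly if the shards were processed one by one.
    barrier = threading.Barrier(len(local_shards))

    def shard_query(shard_id):
        barrier.wait(timeout=5)
        # The "script" runs right here, in the thread handling this shard.
        return shard_id, threading.current_thread().name

    with ThreadPoolExecutor(max_workers=pool_size,
                            thread_name_prefix="search") as pool:
        return dict(pool.map(shard_query, local_shards))


# Hypothetical node holding shards 1 and 2, with 4 worker threads.
threads_by_shard = search_node(local_shards=[1, 2], pool_size=4)
print(threads_by_shard)
```

In this sketch the two shards end up on different worker threads, so two shards on the same node can keep two cores busy, while the work for a single shard stays on one thread.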


(Christophe Bismuth) #3

Alright, thank you Ryan.

Do you have any idea why we can't get our CPU cores to work at 100%?

Disk I/O, network I/O and Java heap size don't seem to be bottlenecks.

We don't understand why CPU resources aren't fully consumed.

Thank you,
Christophe


(Christophe Bismuth) #4

Another question.

As you can see in the graphs below, nodes 6, 7, 8 and 9 always consume 15% of CPU without any user activity. Node 5 (the indexing node) doesn't.

Is there any reason why?

Thanks again,
Christophe


(Christophe Bismuth) #5

We've enabled profiling from the Java DSL.

I don't understand why a ConstantScoreQuery has create_weight and build_scorer phases. Is there a reason?
I thought a ConstantScoreQuery was a way to completely disable scoring, wasn't it?

Thank you,
Chris

{
    "type": "ConstantScoreQuery",
    "description": "ConstantScore(+catalogId:{1166 1220 1232 1234 1235 1237 1238 1239 1240 1241 1242 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1260 1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1274 1275 1278 1279 1280 1289 1291 1404 1405 1406 1408 1409 1410 1411 1412 1413 1414 1416 1417 1418 1419 1420 1477 1478 1479 1480 1482 1541 1546 1576 1645 1750 1751 1752 1753 2454 2529 2530 2531 2532 2533 2578 2677 5293 5303 8172 8228 8229 8230 8231 8232 8233 8235 8236 8237 8238 8239 8240 8241 8242} +ConstantScore(DocValuesFieldExistsQuery [field=price]))",
    "timings": {
        "score": 0,
        "build_scorer_count": 3,
        "match_count": 0,
        "create_weight": 4868,
        "next_doc": 12362443,
        "match": 0,
        "create_weight_count": 1,
        "next_doc_count": 138653,
        "score_count": 0,
        "build_scorer": 24261,
        "advance": 1224,
        "advance_count": 1
    }
}
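A quick way to read the breakdown above is to convert each phase to milliseconds and to a share of the total. The sketch below assumes the values are wall-clock nanoseconds, as the Profile API documentation describes for its breakdown; the variable names are illustrative only.

```python
# Timings copied from the profile output above (assumed to be nanoseconds).
timings = {
    "score": 0,
    "build_scorer_count": 3,
    "match_count": 0,
    "create_weight": 4868,
    "next_doc": 12362443,
    "match": 0,
    "create_weight_count": 1,
    "next_doc_count": 138653,
    "score_count": 0,
    "build_scorer": 24261,
    "advance": 1224,
    "advance_count": 1,
}

# Keep the time values only; the *_count entries are call counters.
phases = {k: v for k, v in timings.items() if not k.endswith("_count")}
total_ns = sum(phases.values())

# Print phases from most to least expensive.
for name, ns in sorted(phases.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14}{ns / 1e6:>10.3f} ms{100 * ns / total_ns:>7.1f} %")
```

Under that assumption, next_doc (iterating over matching documents) accounts for almost all of the roughly 12.4 ms recorded, create_weight and build_scorer together take well under 0.1 ms, and score is 0.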

(Christophe Bismuth) #6

Hmm... it's probably something in our infrastructure: the total aggregation time is 4062 ms, while the maximum shard time is only 796 ms.

Investigating ...
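The gap between those two numbers can be worked out as follows. This is a rough sketch that assumes the 4062 ms is the end-to-end wall-clock time and that the shards run in parallel, so the slowest shard bounds the time spent inside shards; the variable names are illustrative.

```python
total_ms = 4062          # total aggregation time reported above
slowest_shard_ms = 796   # maximum shard time reported above

# Time not explained by shard execution (network, coordination, etc.).
outside_ms = total_ms - slowest_shard_ms
share = outside_ms / total_ms

print(f"{outside_ms} ms outside shard execution ({share:.0%} of the total)")
```

Under those assumptions, about four fifths of the request time is spent outside shard-level execution.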


(Christophe Bismuth) #7

It seems to be related to network bandwidth limitations.

