Does update_by_query always reindex the entire document?

I'm trying to figure out whether the Elasticsearch _update_by_query endpoint reindexes entire documents. I ran a batch process that generated and ran thousands of update_by_query requests. CPU usage climbed after a while, so I stopped running them. That was about a week ago, and CPU usage is still abnormally high.

When I check the nodes in my cluster, one of them has unusually high CPU usage. I checked the hot threads on that node, and it appears to still be processing update tasks, even though I stopped running the updates over a week ago. How could this node still be processing updates? My guess is that it's reindexing documents that were affected by the updates.

Please share any thoughts.

The tasks API should tell you if you are still running the _update_by_query. IIRC one of the funny things about _update_by_query is that it'll perform a noop update of all documents if you give it an empty body. This might be what is going on here.

I don't think these are _update_by_query tasks, although the hot threads suggested otherwise. The tasks currently running on the high-CPU node look like this:

"C_i25yS5SWSdrv3NPEoHEA": {
        "name": "C_i25yS",
        "roles": [
            "data",
            "ingest"
        ],
        "tasks": {
            "C_i25yS5SWSdrv3NPEoHEA:186466469": {
                "node": "C_i25yS5SWSdrv3NPEoHEA",
                "id": 186466469,
                "type": "transport",
                "action": "cluster:monitor/tasks/lists",
                "start_time_in_millis": 1548856970331,
                "running_time_in_nanos": 2978262,
                "cancellable": false,
                "headers": {}
            },
            "C_i25yS5SWSdrv3NPEoHEA:186466471": {
                "node": "C_i25yS5SWSdrv3NPEoHEA",
                "id": 186466471,
                "type": "direct",
                "action": "cluster:monitor/tasks/lists[n]",
                "start_time_in_millis": 1548856970333,
                "running_time_in_nanos": 101917,
                "cancellable": false,
                "parent_task_id": "C_i25yS5SWSdrv3NPEoHEA:186466469",
                "headers": {}
            },
            "C_i25yS5SWSdrv3NPEoHEA:186466470": {
                "node": "C_i25yS5SWSdrv3NPEoHEA",
                "id": 186466470,
                "type": "netty",
                "action": "internal:discovery/zen/publish/commit",
                "start_time_in_millis": 1548856970332,
                "running_time_in_nanos": 1706067,
                "cancellable": false,
                "headers": {}
            }
        }
    }

If an update by query were still running, the action on the task would be something like indices:data/write/update/byquery.
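For anyone who wants to automate that check, here is a minimal Python sketch that scans a parsed tasks API response for a given action prefix. The function name `find_tasks_by_action` and the trimmed `sample` dict are illustrative assumptions, not part of any Elasticsearch client. The sketch also assumes the response has the usual `nodes` → `tasks` nesting seen in the output posted above.

```python
from typing import Any, Dict, List

# Action name mentioned in this thread for a running update by query.
UPDATE_BY_QUERY_ACTION = "indices:data/write/update/byquery"


def find_tasks_by_action(
    tasks_response: Dict[str, Any],
    action_prefix: str = UPDATE_BY_QUERY_ACTION,
) -> List[Dict[str, Any]]:
    """Return every task whose action starts with action_prefix.

    tasks_response is assumed to be the parsed JSON body of the tasks API,
    shaped like {"nodes": {node_id: {"tasks": {task_id: {...}}}}}.
    """
    matches = []
    for node in tasks_response.get("nodes", {}).values():
        for task in node.get("tasks", {}).values():
            if task.get("action", "").startswith(action_prefix):
                matches.append(task)
    return matches


# Trimmed sample built from the task list posted above (hypothetical wrapper).
sample = {
    "nodes": {
        "C_i25yS5SWSdrv3NPEoHEA": {
            "name": "C_i25yS",
            "tasks": {
                "C_i25yS5SWSdrv3NPEoHEA:186466469": {
                    "action": "cluster:monitor/tasks/lists",
                },
                "C_i25yS5SWSdrv3NPEoHEA:186466471": {
                    "action": "cluster:monitor/tasks/lists[n]",
                },
                "C_i25yS5SWSdrv3NPEoHEA:186466470": {
                    "action": "internal:discovery/zen/publish/commit",
                },
            },
        }
    }
}

running = find_tasks_by_action(sample)
print(f"update-by-query tasks found: {len(running)}")
```

With the tasks posted above, the list comes back empty, which matches the conclusion that no update by query is running on that node.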

@nik9000 Any other thoughts here as to why CPU would still be so high?

Right, update by query isn't running. I'd check the hot_threads API. If the CPU is super high on any one task it'll jump right out.

@nik9000 The hot_threads output isn't particularly helpful to me; I'm not really sure what to take away from it. Here's what it shows for the high-CPU node. Nothing really stands out to me, and the response changes constantly.

"::: {C_i25yS}{C_i25yS5SWSdrv3NPEoHEA}{IlzCv7ROQSaMm6IGV_Z88A}{x.x.x.x}{x.x.x.x:9300}{zone=us-west-2b}\n Hot threads at 2019-01-31T20:38:49.441Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:\n \n 68.3% (341.5ms out of 500ms) cpu usage by thread 'elasticsearch[C_i25yS][clusterApplierService#updateTask][T#1]'\n 2/10 snapshots sharing following 13 elements\n org.elasticsearch.indices.store.IndicesStore$ShardActiveResponseHandler.lambda$allNodesResponded$2(IndicesStore.java:289)\n org.elasticsearch.indices.store.IndicesStore$ShardActiveResponseHandler$$Lambda$1704/1699952741.accept(Unknown Source)\n org.elasticsearch.cluster.service.ClusterApplierService.lambda$runOnApplierThread$0(ClusterApplierService.java:307)\n org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$1706/1089848983.apply(Unknown Source)\n org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.apply(ClusterApplierService.java:156)\n org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:400)\n org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)\n org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:573)\n org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)\n org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)\n java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n java.lang.Thread.run(Thread.java:748)\n 8/10 snapshots sharing following 2 elements\n java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n java.lang.Thread.run(Thread.java:748)\n\n"

For some reason the human and pretty URL params do not format the response at all.
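Since the hot threads text above arrives as one escaped string, a small script can make it easier to scan. The following is a rough Python sketch that pulls out the CPU percentage and thread name from each "cpu usage by thread" line. The function name `summarize_hot_threads` and the regular expression are assumptions based only on the output format shown in this post; other Elasticsearch versions may format these lines differently.

```python
import re
from typing import List, Tuple

# Matches lines such as:
#   68.3% (341.5ms out of 500ms) cpu usage by thread 'elasticsearch[...]'
# The pattern is inferred from the output in this thread, not from a spec.
HOT_THREAD_RE = re.compile(
    r"(\d+(?:\.\d+)?)% \([^)]*\) cpu usage by thread '([^']+)'"
)


def summarize_hot_threads(raw: str) -> List[Tuple[float, str]]:
    """Return (cpu_percent, thread_name) pairs, busiest thread first."""
    # If the text was JSON-encoded, newlines appear as the two characters
    # backslash and "n"; turn them back into real line breaks.
    text = raw.replace("\\n", "\n")
    found = [(float(pct), name) for pct, name in HOT_THREAD_RE.findall(text)]
    return sorted(found, reverse=True)


# Excerpt of the output posted above, kept as a raw string so that the
# escaped newlines stay literal, as they were in the response.
sample_raw = (
    r"Hot threads at 2019-01-31T20:38:49.441Z, interval=500ms, "
    r"busiestThreads=3, ignoreIdleThreads=true:\n \n "
    r"68.3% (341.5ms out of 500ms) cpu usage by thread "
    r"'elasticsearch[C_i25yS][clusterApplierService#updateTask][T#1]'\n "
    r"2/10 snapshots sharing following 13 elements\n"
)

for pct, name in summarize_hot_threads(sample_raw):
    print(f"{pct:5.1f}%  {name}")
```

In this sample the busiest thread is `clusterApplierService#updateTask`. Judging only by the class names in the stack trace, that looks like cluster state processing rather than document updates, but that reading is an inference and is not confirmed anywhere in this thread.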

@nik9000 Do you see anything in the hot threads above that may be useful? Or anything else that may help resolve my issue?
