Running tasks get terminated when cluster is upgraded from 6.6.2 to 7.10.2 using full cluster restart upgrade approach

yabhishek · February 9, 2021, 1:23am

Hi
I tried to upgrade my Elasticsearch cluster from version 6.6.2 to version 7.10.2 using the full cluster restart upgrade approach. Before performing the upgrade, I had some tasks running, as can be seen in the following output:

    "kXxifenTS-eG4airQq7e4g" : {
      "name" : "qaes-testesupdadata1",
      "transport_address" : "10.109.12.217:9300",
      "host" : "10.109.12.217",
      "ip" : "10.109.12.217:9300",
      "roles" : [
        "master",
        "data"
      ],
      "attributes" : {
        "ml.machine_memory" : "4085968896",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true",
        "ml.enabled" : "true"
      },
      "tasks" : {
        "kXxifenTS-eG4airQq7e4g:305558" : {
          "node" : "kXxifenTS-eG4airQq7e4g",
          "id" : 305558,
          "type" : "netty",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1612777918993,
          "running_time_in_nanos" : 65061,
          "cancellable" : false,
          "parent_task_id" : "dhqNUp-ZScu7KFVPGyREfA:326955",
          "headers" : { }
        },
        "kXxifenTS-eG4airQq7e4g:281310" : {
          "node" : "kXxifenTS-eG4airQq7e4g",
          "id" : 281310,
          "type" : "transport",
          "action" : "indices:data/write/update/byquery",
          "start_time_in_millis" : 1612777692354,
          "running_time_in_nanos" : 226639681547,
          "cancellable" : true,
          "headers" : { }
        },

I then shut down the nodes and performed the upgrade. When the upgraded nodes restarted, they joined the cluster, the shards recovered completely and the nodes were in a healthy state, but these tasks had been terminated. Is there a way to work around this, or should I prefer a rolling upgrade? My application will still have tasks running even after I stop indexing before the upgrade. Please help!

warkolm · February 9, 2021, 2:14am

Welcome to our community!

Are you asking how to restart the indices:data/write/update/byquery task after the nodes have shut down? If so, there is no way to do this without re-running the original request, which looks like an update-by-query.
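As a rough sketch of what re-running the request could look like, the snippet below only builds an asynchronous update-by-query request with the standard library. The base URL, the index name `my-index`, the `tags` field and the script are all hypothetical placeholders, since the original request is not shown in this thread. `wait_for_completion=false` and `conflicts=proceed` are existing update-by-query parameters. The query is written to skip documents that already carry the tag, on the assumption that a re-run should only touch what the interrupted run missed.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_update_by_query(base_url, index, body):
    """Build (but do not send) an asynchronous update-by-query request."""
    params = urlencode({
        "wait_for_completion": "false",  # return a task ID instead of blocking
        "conflicts": "proceed",          # count version conflicts instead of aborting
    })
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_update_by_query?{params}"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical request body: tag only the documents that are not tagged yet,
# so a second run after the restart does not redo finished work.
request = build_update_by_query(
    "http://localhost:9200",
    "my-index",
    {
        "query": {"bool": {"must_not": {"term": {"tags": "reviewed"}}}},
        "script": {
            "lang": "painless",
            "source": (
                "if (ctx._source.tags == null) { ctx._source.tags = [] } "
                "ctx._source.tags.add(params.tag)"
            ),
            "params": {"tag": "reviewed"},
        },
    },
)

print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would submit it; the response then carries
# the ID of the new task, which can be looked up through the tasks API.
```

Whether a re-run is safe depends on the script being idempotent or the query excluding already-updated documents, which is why the example filters on the tag.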

yabhishek · February 9, 2021, 2:45am

Thanks @warkolm. Would a rolling upgrade ensure that such tasks on the node being upgraded are not terminated but instead moved to another running node?

warkolm · February 9, 2021, 2:46am

I guess that depends on what the job was doing, but in theory that should work.

yabhishek · February 9, 2021, 2:48am

I will have many indexing, add_tags and query operations running while I am upgrading a node. Would those operations be affected by the rolling upgrade, even though the documentation says they shouldn't be?

warkolm · February 9, 2021, 2:49am

They shouldn't be, as long as you upgrade one node at a time and then wait for the cluster to be green before moving to the next node.
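The "one node at a time, wait for green" loop can be sketched roughly as follows. This is a minimal illustration, not the full documented procedure (which has further steps, such as disabling shard allocation around each restart). `fetch_health` and `upgrade_node` are hypothetical callables to be supplied by the caller: the first is assumed to return the parsed JSON of `GET _cluster/health`, the second to stop, upgrade and restart a single node.

```python
import time


def wait_for_green(fetch_health, attempts=30, delay_s=10.0, sleep=time.sleep):
    """Poll cluster health until it reports green; return True on success.

    fetch_health is any callable returning the parsed JSON of
    GET _cluster/health, for example {"status": "yellow", ...}.
    """
    for attempt in range(attempts):
        if fetch_health().get("status") == "green":
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # back off before asking again
    return False


def rolling_upgrade(nodes, upgrade_node, fetch_health, **wait_kwargs):
    """Upgrade nodes one by one, aborting if the cluster does not turn green."""
    upgraded = []
    for node in nodes:
        upgrade_node(node)  # hypothetical: stop, upgrade and restart this node
        if not wait_for_green(fetch_health, **wait_kwargs):
            raise RuntimeError(f"cluster not green after upgrading {node}")
        upgraded.append(node)
    return upgraded
```

Passing the health check and the sleep function in as arguments keeps the loop easy to dry-run on a test cluster before pointing it at production.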

yabhishek · February 9, 2021, 2:50am

Thanks @warkolm, that helps. I will try that on a test cluster before upgrading the production environment and see what the behaviour is. Thanks!

yabhishek · February 10, 2021, 9:26am

@warkolm We thought of going the rolling upgrade way, but first wanted to test whether shutting down a node would terminate the running merge/add_tag operations, and found that the operations were terminated. So a rolling upgrade doesn't seem to solve the problem. Is there a way to force tasks to move from one node to another before shutting down the node?
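The task management API covers listing, fetching and cancelling tasks, not moving them, so one possible workaround is to check a node for in-flight work and only stop it once nothing relevant is left. The sketch below is a minimal illustration of that check, assuming the response of `GET _tasks` has the shape shown in the first post wrapped in a top-level `nodes` object. The function name and the default `byquery` marker are hypothetical choices, and application-specific operations such as add_tag may show up under different action names.

```python
def blocking_tasks(tasks_response, node_name, action_markers=("byquery",)):
    """Return tasks on node_name whose action contains any of action_markers.

    Each entry is (task_id, action, running_time_in_seconds).
    """
    found = []
    for node in tasks_response.get("nodes", {}).values():
        if node.get("name") != node_name:
            continue
        for task_id, task in node.get("tasks", {}).items():
            if any(marker in task["action"] for marker in action_markers):
                seconds = task["running_time_in_nanos"] / 1e9
                found.append((task_id, task["action"], seconds))
    return found


# Trimmed copy of the task listing from the first post.
sample = {
    "nodes": {
        "kXxifenTS-eG4airQq7e4g": {
            "name": "qaes-testesupdadata1",
            "tasks": {
                "kXxifenTS-eG4airQq7e4g:305558": {
                    "action": "cluster:monitor/tasks/lists[n]",
                    "running_time_in_nanos": 65061,
                },
                "kXxifenTS-eG4airQq7e4g:281310": {
                    "action": "indices:data/write/update/byquery",
                    "running_time_in_nanos": 226639681547,
                },
            },
        }
    }
}

for task_id, action, seconds in blocking_tasks(sample, "qaes-testesupdadata1"):
    print(f"{task_id} {action} running for {seconds:.0f}s")
```

An empty result for a node would suggest it can be stopped without interrupting a matching task; a non-empty one means either waiting or accepting that the request has to be re-run afterwards.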
