[.kibana_task_manager] Action failed with 'Request timed out'

Hello,

I upgraded an Elasticsearch cluster from 7.10 to 7.17.9. The ES upgrade went fine, with all nodes up. However, when upgrading Kibana, I get the following error while it attempts to re-index the saved objects indices.

{"type":"log","@timestamp":"2023-08-08T08:21:18+02:00","tags":["info","savedobjects-service"],"pid":70955,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> REINDEX_SOURCE_TO_TEMP_INDEX_BULK. took: 120373ms."}
{"type":"log","@timestamp":"2023-08-08T08:23:19+02:00","tags":["error","savedobjects-service"],"pid":70955,"message":"[.kibana_task_manager] Action failed with 'Request timed out'. Retrying attempt 2 in 4 seconds."}

This retry loop runs for about 20 minutes, then starts over.
Here is another log excerpt:

{"type":"log","@timestamp":"2023-08-08T08:16:48+02:00","tags":["info","savedobjects-service"],"pid":69308,"message":"[.kibana_task_manager] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> FATAL. took: 184011ms."}
{"0":"{","1":"\"","2":"s","3":"u","4":"c","5":"c","6":"e","7":"e","8":"d","9":"e","10":"d","11":"\"","12":":","13":"f","14":"a","15":"l","16":"s","17":"e","18":",","19":"\"","20":"n","21":"u","22":"m","23":"_","24":"f","25":"r","26":"e","27":"e","28":"d","29":"\"","30":":","31":"0","32":"}","type":"log","@timestamp":"2023-08-08T08:16:48+02:00","tags":["warning","savedobjects-service"],"pid":69308,"message":"Failed to cleanup after migrations:"}
{"type":"log","@timestamp":"2023-08-08T08:16:48+02:00","tags":["fatal","root"],"pid":69308,"message":"Error: Unable to complete saved object migrations for the [.kibana_task_manager] index: Unable to complete the REINDEX_SOURCE_TO_TEMP_INDEX_BULK step after 15 attempts, terminating.\n    at migrationStateActionMachine (/usr/share/kibana/src/core/server/saved_objects/migrationsv2/migrations_state_action_machine.js:144:29)\n    at processTicksAndRejections (node:internal/process/task_queues:96:5)\n    at async Promise.all (index 1)\n    at SavedObjectsService.start (/usr/share/kibana/src/core/server/saved_objects/saved_objects_service.js:181:9)\n    at Server.start (/usr/share/kibana/src/core/server/server.js:330:31)\n    at Root.start (/usr/share/kibana/src/core/server/root/index.js:69:14)\n    at bootstrap (/usr/share/kibana/src/core/server/bootstrap.js:120:5)\n    at Command.<anonymous> (/usr/share/kibana/src/cli/serve/serve.js:229:5)"}
{"type":"log","@timestamp":"2023-08-08T08:16:48+02:00","tags":["info","plugins-system","standard"],"pid":69308,"message":"Stopping all plugins."}

I stopped and restarted Kibana. This time the error changed:

{"type":"log","@timestamp":"2023-08-08T08:49:23+02:00","tags":["error","savedobjects-service"],"pid":71887,"message":"[.kibana] Action failed with 'no_shard_available_action_exception: [no_shard_available_action_exception] Reason: No shard available for [get [.tasks][task][FNZgMx55SGSsn4a9rs0WKQ:7782680]: routing [null]]'. Retrying attempt 5 in 32 seconds."}

Here is the cluster state (GET _cluster/health and GET _cat/nodes?v):

{
  "cluster_name" : "mycluster",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 4826,
  "active_shards" : 5582,
  "relocating_shards" : 0,
  "initializing_shards" : 8,
  "unassigned_shards" : 2718,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 8,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 1585,
  "active_shards_percent_as_number" : 67.18825228695233
}


ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
[redacted]               65          99   9    8.75    7.47     7.67 cdfhilmrstw -      node1
[redacted]               54          91   7    1.00    0.85     1.53 cdfhilmrstw -      node2
[redacted]               32          99  14    9.58   10.55    10.53 cdfhilmrstw *      node3
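To make the numbers above concrete, here is a small sketch over the health document pasted above (copied verbatim from the GET _cluster/health output). A red status means at least one primary shard is unassigned, and a GET routed to such a shard fails exactly like the .tasks lookup above, with no_shard_available_action_exception:

```python
# Sanity-check the cluster health output pasted above.
# The "health" dict is copied verbatim from GET _cluster/health.
health = {
    "cluster_name": "mycluster",
    "status": "red",
    "timed_out": False,
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "active_primary_shards": 4826,
    "active_shards": 5582,
    "relocating_shards": 0,
    "initializing_shards": 8,
    "unassigned_shards": 2718,
    "delayed_unassigned_shards": 0,
    "number_of_pending_tasks": 8,
    "number_of_in_flight_fetch": 0,
    "task_max_waiting_in_queue_millis": 1585,
    "active_shards_percent_as_number": 67.18825228695233,
}

# Total shards = active + initializing + unassigned.
total = (health["active_shards"]
         + health["initializing_shards"]
         + health["unassigned_shards"])
active_pct = 100 * health["active_shards"] / total

print(f"status={health['status']}, total shards={total}")
print(f"active={active_pct:.2f}% (reported "
      f"{health['active_shards_percent_as_number']:.2f}%)")
print(f"unassigned={health['unassigned_shards']} "
      f"across {health['number_of_data_nodes']} data nodes")

# Red status means at least one *primary* is unassigned; reads against
# such indices fail with no_shard_available_action_exception, so the
# Kibana migrator cannot make progress until the cluster is recovered.
migration_can_proceed = health["status"] != "red"
print(f"migration_can_proceed={migration_can_proceed}")
```

So roughly a third of the shards are unassigned, spread over only three data nodes, which lines up with both the timeouts and the no_shard_available_action_exception.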
