Migration timeout when upgrading Kibana from 7.15.1 to 7.15.2 or 7.16.0

Hi,

My Kibana installation fails to upgrade from 7.15.1 to 7.15.2 or 7.16.0 during the CLONE_TEMP_TO_TARGET -> CLONE_TEMP_TO_TARGET step. I also had the exact same problem going from 7.13.1 to 7.13.2, but I can't remember how I fixed it or whether it simply worked when upgrading from 7.13.1 to 7.14.x.

kibana.log contains the following, showing the migration timeout:

{"type":"log","@timestamp":"2021-12-09T14:32:38+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] Starting to process 3402 documents."}
{"type":"log","@timestamp":"2021-12-09T14:32:38+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_READ -> REINDEX_SOURCE_TO_TEMP_TRANSFORM. took: 271ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:38+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana_task_manager] SET_TEMP_WRITE_BLOCK -> CLONE_TEMP_TO_TARGET. took: 168ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:39+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_TRANSFORM -> REINDEX_SOURCE_TO_TEMP_INDEX_BULK. took: 135ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> REINDEX_SOURCE_TO_TEMP_READ. took: 2254ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] Processed 1000 documents out of 3402."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_READ -> REINDEX_SOURCE_TO_TEMP_TRANSFORM. took: 109ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_TRANSFORM -> REINDEX_SOURCE_TO_TEMP_INDEX_BULK. took: 66ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> REINDEX_SOURCE_TO_TEMP_READ. took: 347ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] Processed 2000 documents out of 3402."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_READ -> REINDEX_SOURCE_TO_TEMP_TRANSFORM. took: 79ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:41+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_TRANSFORM -> REINDEX_SOURCE_TO_TEMP_INDEX_BULK. took: 53ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> REINDEX_SOURCE_TO_TEMP_READ. took: 302ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] Processed 3000 documents out of 3402."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_READ -> REINDEX_SOURCE_TO_TEMP_TRANSFORM. took: 26ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_TRANSFORM -> REINDEX_SOURCE_TO_TEMP_INDEX_BULK. took: 15ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_INDEX_BULK -> REINDEX_SOURCE_TO_TEMP_READ. took: 176ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] Processed 3402 documents out of 3402."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_READ -> REINDEX_SOURCE_TO_TEMP_CLOSE_PIT. took: 4ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] REINDEX_SOURCE_TO_TEMP_CLOSE_PIT -> SET_TEMP_WRITE_BLOCK. took: 3ms."}
{"type":"log","@timestamp":"2021-12-09T14:32:42+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] SET_TEMP_WRITE_BLOCK -> CLONE_TEMP_TO_TARGET. took: 137ms."}
{"type":"log","@timestamp":"2021-12-09T14:33:08+00:00","tags":["error","savedobjects-service"],"pid":342653,"message":"[.kibana_task_manager] Action failed with 'Request timed out'. Retrying attempt 1 in 2 seconds."}
{"type":"log","@timestamp":"2021-12-09T14:33:08+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana_task_manager] CLONE_TEMP_TO_TARGET -> CLONE_TEMP_TO_TARGET. took: 30002ms."}
{"type":"log","@timestamp":"2021-12-09T14:33:12+00:00","tags":["error","savedobjects-service"],"pid":342653,"message":"[.kibana] Action failed with 'Request timed out'. Retrying attempt 1 in 2 seconds."}
{"type":"log","@timestamp":"2021-12-09T14:33:12+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana] CLONE_TEMP_TO_TARGET -> CLONE_TEMP_TO_TARGET. took: 30001ms."}
{"type":"log","@timestamp":"2021-12-09T14:35:10+00:00","tags":["error","savedobjects-service"],"pid":342653,"message":"[.kibana_task_manager] Action failed with 'Request timed out'. Retrying attempt 2 in 4 seconds."}
{"type":"log","@timestamp":"2021-12-09T14:35:10+00:00","tags":["info","savedobjects-service"],"pid":342653,"message":"[.kibana_task_manager] CLONE_TEMP_TO_TARGET -> CLONE_TEMP_TO_TARGET. took: 122010ms."}

And it goes on like this over and over.
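
While it is stuck in this retry loop, Kibana appears to be waiting for the freshly cloned target indices to become healthy, so a check along these lines can show whether their primaries ever become active (a sketch; the index names match this particular upgrade attempt):

$ curl -Ss 'localhost:9201/_cat/indices/.kibana*7.16.0*?v&h=health,status,index,pri,rep,docs.count&expand_wildcards=all'
$ curl -Ss 'localhost:9201/_cluster/health/.kibana_task_manager_7.16.0_001?level=shards&pretty'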

My Elasticsearch master node logs contain the following:

[2021-12-09T14:32:37,714][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] adding block write to indices [[.kibana_7.15.1_001/fxlqME6zQw6xRVDK9CvfTA]]
[2021-12-09T14:32:37,814][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] adding block write to indices [[.kibana_task_manager_7.15.1_001/-oonGAVTRq26yn9UPJBIAA]]
[2021-12-09T14:32:37,916][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] completed adding block write to indices [.kibana_7.15.1_001]
[2021-12-09T14:32:38,072][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] completed adding block write to indices [.kibana_task_manager_7.15.1_001]
[2021-12-09T14:32:38,160][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] [.kibana_7.16.0_reindex_temp] creating index, cause [api], templates [], shards [1]/[1]
[2021-12-09T14:32:38,279][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] [.kibana_task_manager_7.16.0_reindex_temp] creating index, cause [api], templates [], shards [1]/[1]
[2021-12-09T14:32:38,616][INFO ][o.e.c.r.a.AllocationService] [elom04] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.kibana_task_manager_7.16.0_reindex_temp][0], [.kibana_7.16.0_reindex_temp][0]]]).
[2021-12-09T14:32:38,739][INFO ][o.e.c.m.MetadataMappingService] [elom04] [.kibana_task_manager_7.16.0_reindex_temp/7hnRd9EtR-2yg-MAsWS8WA] update_mapping [_doc]
[2021-12-09T14:32:38,788][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] adding block write to indices [[.kibana_task_manager_7.16.0_reindex_temp/7hnRd9EtR-2yg-MAsWS8WA]]
[2021-12-09T14:32:38,898][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] completed adding block write to indices [.kibana_task_manager_7.16.0_reindex_temp]
[2021-12-09T14:32:38,957][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] applying create index request using existing index [.kibana_task_manager_7.16.0_reindex_temp] metadata
[2021-12-09T14:32:38,959][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] [.kibana_task_manager_7.16.0_001] creating index, cause [clone_index], templates [], shards [1]/[1]
[2021-12-09T14:32:38,998][INFO ][o.e.c.r.a.AllocationService] [elom04] Cluster health status changed from [YELLOW] to [RED] (reason: [index [.kibana_task_manager_7.16.0_001] created]).
[2021-12-09T14:32:39,289][INFO ][o.e.c.m.MetadataMappingService] [elom04] [.kibana_7.16.0_reindex_temp/ZWi9vuYxQL2HeYVA0OIiUw] update_mapping [_doc]
[2021-12-09T14:32:39,355][INFO ][o.e.c.m.MetadataMappingService] [elom04] [.kibana_7.16.0_reindex_temp/ZWi9vuYxQL2HeYVA0OIiUw] update_mapping [_doc]
# redacted to stay within the forum body limit, but repeated a few times
[2021-12-09T14:32:42,523][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] adding block write to indices [[.kibana_7.16.0_reindex_temp/ZWi9vuYxQL2HeYVA0OIiUw]]
[2021-12-09T14:32:42,597][INFO ][o.e.c.m.MetadataIndexStateService] [elom04] completed adding block write to indices [.kibana_7.16.0_reindex_temp]
[2021-12-09T14:32:42,660][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] applying create index request using existing index [.kibana_7.16.0_reindex_temp] metadata
[2021-12-09T14:32:42,663][INFO ][o.e.c.m.MetadataCreateIndexService] [elom04] [.kibana_7.16.0_001] creating index, cause [clone_index], templates [], shards [1]/[1]
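
The [YELLOW] to [RED] transition right after the clone request means at least one primary shard of the new index is unassigned. A shard-level check like this shows which one and why (a sketch; the pattern covers both the temp and target 7.16.0 indices):

$ curl -Ss 'localhost:9201/_cat/shards/.kibana*7.16.0*?v&h=index,shard,prirep,state,unassigned.reason,node'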

Here are a few key pieces of information I collected about my Kibana indices in a working setup on 7.15.1. localhost:9201 refers to one ES master node.

$ curl localhost:9201/
{
  "name" : "mymasternode",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "yepmyUUID",
  "version" : {
    "number" : "7.16.0",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "6fc81662312141fe7691d7c1c91b8658ac17aa0d",
    "build_date" : "2021-12-02T15:46:35.697268109Z",
    "build_snapshot" : false,
    "lucene_version" : "8.10.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
$ curl -Ss localhost:9201/_cat/aliases | grep '7.15'
.kibana-event-log-7.15.1      .kibana-event-log-7.15.1-000001           - - - false
.kibana_task_manager_7.15.0   .kibana_task_manager_7.15.0_001           - - - -
.kibana-event-log-7.15.0      .kibana-event-log-7.15.0-000003           - - - true
.kibana-event-log-7.15.0      .kibana-event-log-7.15.0-000001           - - - false
.kibana-event-log-7.15.0      .kibana-event-log-7.15.0-000002           - - - false
.kibana-event-log-7.15.1      .kibana-event-log-7.15.1-000002           - - - true
.kibana                       .kibana_7.15.1_001                        - - - -
.kibana_7.15.1                .kibana_7.15.1_001                        - - - -
.kibana_7.15.0                .kibana_7.15.0_001                        - - - -
.kibana_task_manager          .kibana_task_manager_7.15.1_001           - - - -
.kibana_task_manager_7.15.1   .kibana_task_manager_7.15.1_001           - - - -
$ curl -Ss localhost:9201/_cat/aliases | grep '7.16'
$ curl -Ss localhost:9201/_cat/indices | grep '7.15'
green open  .kibana_task_manager_7.15.0_001                 uFLSDQioThO4SwYHf9vnjA 1 1         17      24  129.2kb   64.6kb
green open  .kibana_task_manager_7.15.1_001                 -oonGAVTRq26yn9UPJBIAA 1 1         17     245    4.5mb  218.8kb
green open  .kibana-event-log-7.15.0-000002                 BXY7AuA9SpKdWvx0shMfCQ 1 1          0       0     434b     226b
green open  .kibana-event-log-7.15.0-000001                 84L-4xOXQgaaGyNhWpw2Qg 1 1          8       0     50kb     25kb
green open  .kibana-event-log-7.15.0-000003                 MN46ykbzSPqMG9rpy85BSw 1 1          0       0     452b     226b
green open  .kibana-event-log-7.15.1-000002                 bI4vG1GiQcCxEzzwRuTFoQ 1 1         16       0   99.5kb   49.7kb
green open  .kibana-event-log-7.15.1-000001                 Ax-kG6KtRrGtLhaees1QvQ 1 1         10       0   39.8kb   19.9kb
green open  .kibana_7.15.0_001                              ua-CFt8RRemIRL4awv4Ncw 1 1       4177     356   24.4mb   12.2mb
green open  .kibana_7.15.1_001                              fxlqME6zQw6xRVDK9CvfTA 1 1       4248    4322   75.4mb   37.6mb
$ curl -Ss localhost:9201/_cat/indices | grep '7.16'
... nothing relevant to kibana

I made sure no Kibana instances were running while trying to upgrade one Kibana node. I also verified every item from Upgrade migrations | Kibana Guide [7.16] | Elastic, and no index templates or lifecycle policies appear to target the .kibana indices.
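
For reference, the template and lifecycle checks were along these lines (a rough sketch; the grep filter is only illustrative):

$ curl -Ss 'localhost:9201/_index_template?pretty' | grep -i kibana
$ curl -Ss 'localhost:9201/_template/*kibana*?pretty'
$ curl -Ss 'localhost:9201/.kibana_7.15.1_001/_ilm/explain?pretty'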

What am I missing here? The behaviour seems to be the same when upgrading to any Kibana version (I tried 7.15.2 and 7.16.0, since I'm running 7.15.1 right now), so I guess the root cause resides in the index definitions?
Please let me know if I can provide any extra information.

Thanks in advance.

I just tried the following in a Docker setup:

  • create an ES cluster version 7.16.2
  • reindex .kibana, .kibana_task_manager, .kibana_7.15.1_001 and .kibana_task_manager_7.15.1_001 from my production cluster (reindex sketch after the log excerpt below)
  • start a kibana 7.16.2 instance
  • migration success and kibana is usable
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] CLONE_TEMP_TO_TARGET -> REFRESH_TARGET. took: 168ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] REFRESH_TARGET -> OUTDATED_DOCUMENTS_SEARCH_OPEN_PIT. took: 4ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] OUTDATED_DOCUMENTS_SEARCH_OPEN_PIT -> OUTDATED_DOCUMENTS_SEARCH_READ. took: 2ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] OUTDATED_DOCUMENTS_SEARCH_READ -> OUTDATED_DOCUMENTS_SEARCH_CLOSE_PIT. took: 8ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] OUTDATED_DOCUMENTS_SEARCH_CLOSE_PIT -> UPDATE_TARGET_MAPPINGS. took: 2ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] UPDATE_TARGET_MAPPINGS -> UPDATE_TARGET_MAPPINGS_WAIT_FOR_TASK. took: 89ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] UPDATE_TARGET_MAPPINGS_WAIT_FOR_TASK -> MARK_VERSION_INDEX_READY. took: 316ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] MARK_VERSION_INDEX_READY -> DONE. took: 53ms."}
{"type":"log","@timestamp":"2021-12-21T14:54:27+00:00","tags":["info","savedobjects-service"],"pid":1219,"message":"[.kibana] Migration completed after 2521ms"}

So I guess this leaves me with either an error in an older index, though the logs don't show anything pointing in that direction, or an error in an index template that matches the Kibana index names?

Hi again, I got it working.

GET /_cluster/allocation/explain?pretty showed that the new .kibana_7.16.2 index failed with "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes". Some nodes returned the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=80gb], having less than the minimum required [80gb] free space, actual free: [59.5gb], and other nodes returned source primary is allocated on another node.

I did not find any way to tell Elasticsearch "well then, try to assign the shard to a node that has sufficient space, as you do with every other index".
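
For anyone hitting the same wall, these are the knobs involved: the explicit form of allocation explain for the stuck primary, a per-node disk overview, and the low watermark setting itself. The index name and the 50gb value below are only examples, not a recommendation; lowering the watermark trades away safety margin, and freeing disk space is the cleaner fix:

$ curl -Ss -XGET 'localhost:9201/_cluster/allocation/explain?pretty' -H 'Content-Type: application/json' -d'
{ "index": ".kibana_7.16.2_001", "shard": 0, "primary": true }'
$ curl -Ss 'localhost:9201/_cat/allocation?v&h=node,disk.used,disk.avail,disk.percent'
$ curl -Ss -XPUT 'localhost:9201/_cluster/settings?pretty' -H 'Content-Type: application/json' -d'
{ "persistent": { "cluster.routing.allocation.disk.watermark.low": "50gb" } }'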

I'm not sure which component was at fault here; anyway, I successfully upgraded to Kibana 7.16.2.

Hope it helps other lost souls.

Thank you for posting.

Cheers,
Bhavya