Kibana Upgrade from 7.12 to 7.13.3 using migrations.enableV2: false

Hi, we upgraded our Kibana from 7.12 to 7.13.3 with migrations.enableV2:
false set in the configuration. The Kibana instance upgraded successfully,
but now we are getting the following error when trying to save searches or
rules:

index [.kibana_2] blocked by: [FORBIDDEN/8/index write (api)];: cluster_block_exception

In the kibana.log file we are getting the error message:

{"type":"log","@timestamp":"2021-09-22T10:54:33+05:00","tags":["error","plugins","taskManager"],"pid":139807,"message":"Failed to poll for work: ResponseError: Response Error"}.

One more thing: after the Kibana upgrade, new indices were created named
".kibana_task_manager_7.13.3_reindex_temp" and
".kibana_7.13.3_reindex_temp". Furthermore, the aliases ".kibana" and
".kibana_task_manager" point to ".kibana_2" and ".kibana_task_manager_2"
respectively. I am not sure whether the Kibana upgrade completed normally,
as the latest Kibana index names are different and give the impression of
temporary indices. Please guide me on whether manually removing the write
block on .kibana_2 and .kibana_task_manager_2 would work in our case.



Why did you set migrations.enableV2: false? Is this part of an ongoing discussion?

Do you know what caused the cluster block exception?

Hi,

We initially tried the upgrade without setting migrations.enableV2: false and it failed with a timeout error. After googling we found a thread which said that to get rid of the timeout error during a Kibana upgrade from 7.12, we should set migrations.enableV2: false in the configuration. So we killed the Kibana instance, set migrations.enableV2: false in the configuration, and tried again. This time the upgrade succeeded and we got the Kibana UI, where the health dashboard shows Kibana version 7.13.3. But unlike a normal upgrade, where the index is created with the Kibana version in its name, this time the newly created index names end with reindex_temp, and the .kibana and .kibana_task_manager aliases do not point to these newly created indices.
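For reference, this is the change we made; a minimal sketch of the kibana.yml fragment (the setting name is exactly as documented for 7.x, everything else here is just comment):

```yaml
# kibana.yml -- opt out of the v2 saved-object migration algorithm
# introduced in 7.11, falling back to the legacy v1 migrations
migrations.enableV2: false
```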
The cause of the cluster block exception is that the indices behind .kibana and .kibana_task_manager are write-blocked. When we manually remove this write block, the reported error is fixed.
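For anyone hitting the same error: the write block can be inspected and removed with the index settings API. A sketch in Kibana Dev Tools Console syntax, assuming the blocked indices are .kibana_2 and .kibana_task_manager_2 as in our case (adjust the names to your deployment, and note that v2 migrations place this block deliberately, so only remove it once no migration is in progress):

```
GET .kibana_2/_settings/index.blocks*

PUT .kibana_2/_settings
{
  "index.blocks.write": false
}

PUT .kibana_task_manager_2/_settings
{
  "index.blocks.write": false
}
```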

Right now we have one running instance of Kibana and are not able to upgrade and add other instances to the cluster, as the upgrade with migrations.enableV2: false set does not work. And if we try the upgrade with this setting, then as soon as we start the new Kibana instance it puts a write block on .kibana and .kibana_task_manager and the upgrade fails with a timeout. We need expert advice on the way forward.

Right now we have Elasticsearch 7.13.3 and Kibana 7.13.3, and we have the following indices:
.kibana_7.13.3_reindex_temp
.kibana_task_manager_7.13.3_reindex_temp

but the .kibana and .kibana_task_manager aliases point to ".kibana_2" and ".kibana_task_manager_2"
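You can confirm which concrete indices the aliases resolve to with the cat aliases API, e.g. in Dev Tools Console:

```
GET _cat/aliases/.kibana,.kibana_task_manager?v
```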

Did you find the reason for the timeouts in the initial upgrade? It could be that the migrations were taking longer than expected, hence the timeout errors. Increasing the timeout, memory, or performance might allow the migrations to complete.

So what is the way forward now?
Just to update: we were able to upgrade two more instances of Kibana with migrations.enableV2: false set. What we observe is that each new instance of Kibana creates a new index with the number incremented by one (for example, with a single instance the .kibana alias was pointing to the .kibana_2 index; when a new Kibana instance started, it created a .kibana_3 index and made the .kibana alias point to .kibana_3, and the same is the case for the .kibana_task_manager alias). What worries us is that we are not sure whether this upgrade procedure is correct and whether we will face problems in future upgrades.
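The incrementing indices described above can be watched with the cat indices API, e.g. in Dev Tools Console:

```
GET _cat/indices/.kibana*?v&h=index,docs.count,store.size
```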

I ran into similar problems when upgrading from 7.12.x to 7.13.x due to my fleet agents.

https://github.com/elastic/kibana/issues/95321

Failed migrations seem to create the additional Kibana indices, which in my case happened because of the timeouts. When we provisioned a higher-spec node with the same configuration and data and did the upgrade again, the timeout did not occur, the migration succeeded, and the upgrade completed.

A debug view of the Kibana log might indicate the root cause of the timeouts failing the migration.

What you're describing is definitely not the expected behaviour; when two or more Kibana instances are started on the same version, they should use the same index.

Although we don't have enough data to identify the root cause of the migration failing in the first place, the problems you experience after setting migrations.enableV2: false seem to stem from the fact that some of your Kibana instances had this configuration setting set, while others didn't. So it sounds like one Kibana instance was using v2 migrations and the other v1.

You have an unusually large number of documents in your .kibana and .kibana_task_manager indices, so it sounds like you're running into [7.14] SO migration takes too long on pre-7.13 upgrades with heavy alerting usage · Issue #106308 · elastic/kibana · GitHub

If it's possible, you could upgrade to 7.14.0 which fixes this issue, but otherwise:

  1. Shut down all Kibana instances
  2. Follow the workarounds given in [7.14] SO migration takes too long on pre-7.13 upgrades with heavy alerting usage · Issue #106308 · elastic/kibana · GitHub
  3. Remove migrations.enableV2: false from all your Kibana instances
  4. Start Kibana
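For step 3, a sketch of the kibana.yml change to make on every instance (removing or commenting out the override means Kibana falls back to its default of v2 migrations on the next start):

```yaml
# kibana.yml -- remove or comment out this line on every instance
# so the default (v2 migrations enabled) applies at the next start:
#migrations.enableV2: false
```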

If you experience any further problems, please share the complete Kibana logs to help us identify the root cause.

Great, so upgrading to 7.14 will work for us. Can we jump directly to 7.15, or do we first need to upgrade to 7.14 to resolve the Kibana issue?

Thanks for your help

You can go directly to 7.15

In general, you can go from any 6.x or 7.x to any higher 7.x