General Guidance on Updating ELK with minimum Fuss

I might be missing something, but ELK updates (even minor ones like 7.6.2 to 7.7) can be a real pain, with a lot of downtime. Is there a guide that explains how to upgrade?

We generally download the latest version, copy data and config directories across and start the services.

Today we updated Elasticsearch and Kibana from 7.6.2 to 7.7 and ended up with all our indices red.

ES is apparently doing something to the indices, as its log is filling up and the list of red indices is shrinking.

We are seeing allocation throttling, even though we only have one node and every index has 1 shard and no replicas. Kibana is running, but its logs indicate it is hitting a not-quite-ready Elasticsearch.

Was there some update process we missed which would save us downtime?

There is NOTHING in the elasticsearch log about why the node is red.

[2020-05-15T14:34:09,182][WARN ][r.suppressed             ] [ecom-repository01] path: /.kibana/_doc/space%3Adefault, params: {index=.kibana, id=space:default}
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:189) ~[elasticsearch-7.7.0.jar:7.7.0]
[2020-05-15T15:04:11,241][WARN ][o.e.x.m.e.l.LocalExporter] [ecom-repository01] unexpected error while indexing monitoring document
org.elasticsearch.xpack.monitoring.exporter.ExportException: UnavailableShardsException[[.monitoring-es-7-2020.05.15][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[.monitoring-es-7-2020.05.15][0]] containing [414] requests]]
	at org.elasticsearch.xpack.monitoring.exporter.local.LocalBulk.lambda$throwExportException$2(LocalBulk.java:125) ~[x-pack-monitoring-7.7.0.jar:7.7.0]
Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException: Search rejected due to missing shards [[.kibana_task_manager_2][0]]. Consider using `allow_partial_search_results` setting to bypass this error.
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.run(AbstractSearchAsyncAction.java:196) ~[elasticsearch-7.7.0.jar:7.7.0]
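A shrinking list of red indices together with throttling messages is what a normal restart looks like on a node holding many indices: primaries are recovered a few at a time, limited by `cluster.routing.allocation.node_initial_primaries_recoveries` (4 per node by default). A minimal Python sketch for watching that progress through `GET /_cluster/health`, assuming an unsecured node on `http://localhost:9200` (the URL and the helper names are illustrative, not from the thread):

```python
import json
import urllib.request

# Assumption: a single unsecured node on the default HTTP port.
BASE_URL = "http://localhost:9200"


def fetch_health(base_url=BASE_URL):
    """Return the parsed response of GET /_cluster/health."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health") as resp:
        return json.load(resp)


def describe_health(health):
    """Condense a cluster health document into one progress line."""
    return (
        f"status={health['status']} "
        f"initializing={health['initializing_shards']} "
        f"unassigned={health['unassigned_shards']} "
        f"active={health['active_shards_percent_as_number']:.1f}%"
    )


def is_recovered(health):
    """Yellow or green means every primary shard has been assigned."""
    return health["status"] in ("yellow", "green")
```

Calling `describe_health(fetch_health())` every few seconds until `is_recovered` returns true shows whether recovery is still moving. If a shard stays unassigned for good, `GET /_cluster/allocation/explain` reports the reason that the log does not.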

Hi @cawoodm,

I run ES on Debian Linux and the configuration is managed by Puppet. The ES service is not restarted on config change.

I can upgrade the ES package and do a rolling restart of the whole cluster with no downtime.

I also have shard allocation awareness configured, which lets me restart groups of nodes instead of one node at a time when doing a rolling restart.
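As a rough illustration of restarting by group, the sketch below buckets nodes by one awareness attribute, using the rows returned by `GET /_cat/nodeattrs?format=json`. The attribute name `rack_id` and the function name are assumptions made for the example; restarting a whole group at once is only safe when every shard has a copy outside that group.

```python
from collections import defaultdict

# Assumption: the awareness attribute is called "rack_id"; use whatever
# cluster.routing.allocation.awareness.attributes names in your cluster.
AWARENESS_ATTR = "rack_id"


def restart_groups(nodeattrs, attr=AWARENESS_ATTR):
    """Group node names by the value of one awareness attribute.

    `nodeattrs` is the parsed output of GET /_cat/nodeattrs?format=json,
    a list of rows with "node", "attr" and "value" keys.
    """
    groups = defaultdict(list)
    for row in nodeattrs:
        if row["attr"] == attr:
            groups[row["value"]].append(row["node"])
    # Sort for a stable, predictable restart order.
    return {value: sorted(nodes) for value, nodes in sorted(groups.items())}
```

Each key of the returned dictionary is one group that can be restarted together before moving on to the next.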

I never move the data.

P.S. I only now read that you only have one node... That will probably cause at least a brief RED status during upgrades...

Yes, a step-by-step guide through the proper procedure for a zero-downtime upgrade is in the reference documentation.

That sounds like a mistake: you don't need to move or copy the data directory to upgrade.

Oh, in which case you cannot avoid downtime. The minimum-fuss method requires at least three nodes.

Here's a blog post that might help:


OK, so there is no official guide on how to upgrade if we only have one node?

Our strategy so far was to:

  • Stop the service running in directory `elasticsearch.v.old`
  • Copy config and data across to directory `elasticsearch.v.new`
  • Fire up `elasticsearch.v.new` and see what happens
  • In case of problems, simply go back to running `elasticsearch.v.old`

Yes, there's a documented procedure for upgrading by restarting the whole cluster too. It involves downtime; however, there's no way around that if you only have one node.
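The API calls around such a full-cluster restart can be sketched roughly as follows. This is a simplified outline, not the complete official procedure: the base URL, the 60s timeout and the helper names are assumptions, and stopping the service, installing the new version and starting it again happen between the two lists.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: unsecured local node


def allocation_setting(value):
    """Body for PUT /_cluster/settings; None resets the setting to its default."""
    return {"persistent": {"cluster.routing.allocation.enable": value}}


# Requests to send while the old version is still running.
BEFORE_SHUTDOWN = [
    ("PUT", "/_cluster/settings", allocation_setting("primaries")),
    ("POST", "/_flush", None),
]

# Requests to send once the upgraded node answers on HTTP again.
AFTER_STARTUP = [
    ("GET", "/_cluster/health?wait_for_status=yellow&timeout=60s", None),
    ("PUT", "/_cluster/settings", allocation_setting(None)),
]


def send(method, path, body=None, base_url=BASE_URL):
    """Send one request and return the parsed JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Running `send(*step)` for each entry of `BEFORE_SHUTDOWN`, upgrading in place, and then doing the same for `AFTER_STARTUP` covers the API side. The data directory stays where it is throughout.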

That is one painful upgrade process you got there, son.
