I've created a snapshot following the advice of z0z0, which I found to be simpler to follow and more informative than the official documentation.
Having then had a look inside /etc/elasticsearch/backup, I'm assuming that the snapshot contains both the index definitions and the data they hold. Is that correct?
When I upgrade to 6.x, I would then have to create new indexes — how do I migrate the data to the new indexes, or do I have to re-index?
In the development environment, I'm running Elasticsearch under Homebrew, which performed a silent upgrade from 5.6 to 6.2 without intervention from me — what's the difference between that and the rolling upgrade process?
The linked article was written a long time ago, prior to the release of Elasticsearch 5.0.0, and is out of date:
"This is a very easy method if you want to migrate your current elasticsearch cluster to a new version, which cannot be performed on major upgrades, and you don't want to lose any data."
This is no longer true: you can upgrade from 5.6.9 to 6.2.3 via a rolling upgrade without needing to perform any snapshot or restore as the article describes.
Ok, I think the recommendation here is to take a backup just in case something goes wrong: you can't roll back once you've started to upgrade. As long as you follow the rest of the steps (particularly resolving all deprecation warnings) then your data will carry across.
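To make that concrete, here is a rough sketch of taking such a just-in-case snapshot with the Python client. The repository and snapshot names are ones I've made up, and I'm assuming a shared filesystem repository at the /etc/elasticsearch/backup path you mentioned, which has to be whitelisted under path.repo in elasticsearch.yml:

```python
# Sketch only: take a one-off snapshot of every index before starting the upgrade.
# The repository name, snapshot name and path are assumptions, not from your setup.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Register a shared-filesystem repository.
es.snapshot.create_repository(
    repository="pre_upgrade_backup",
    body={"type": "fs", "settings": {"location": "/etc/elasticsearch/backup"}},
)

# Snapshot all indices and block until the snapshot has completed.
es.snapshot.create(
    repository="pre_upgrade_backup",
    snapshot="before-6-x-upgrade",
    wait_for_completion=True,
)
```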
Hi @DavidTurner, when Homebrew upgraded from 5.6 to 6, the application broke due to the use of the type field, which I had to fix by revising the indexes.
So, assuming the data remains, I would need to add the revised indexes. How would that affect the existing data?
I think there might be a terminology problem here: in Elasticsearch, an index is the thing that contains your data. This is different from typical SQL database terminology in which tables contain data and indices are auxiliary data structures that are just there to make queries more efficient. By "your data will carry across" I meant that any indices created in 5.x will continue to work in 6.x.
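A tiny sketch may help make that concrete; the index name, type name and field below are purely illustrative:

```python
# Illustrative only: the index itself is where documents are stored and searched.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Put a document into the "customers" index...
es.index(index="customers", doc_type="doc", id=1, body={"name": "Alice"})

# ...and query that same index to get it back: there is no separate "table".
result = es.search(index="customers", body={"query": {"match": {"name": "alice"}}})
print(result["hits"]["total"])  # 1 once the document is searchable
```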
There are a number of breaking changes in 6.0 and you need to make sure that your application is compatible with the new version before upgrading your production system.
Hi @DavidTurner, let me rephrase the question, because I feel as though this is going around in circles.
Elasticsearch was upgraded to 6 in the background in development, and afterwards it stopped working because the schema contained type fields, which, as explained in the breaking changes, are deprecated. I've since fixed this with the new schema and a re-indexing.
What I need to know is this: if I do a scheduled upgrade of an index that has the same structure and schema as the development version had on 5.6, would it also break (requiring a re-index), or is there some process for moving it to the new schema?
I too am thinking we're missing something in this conversation, but I'm not yet sure what.
The way we expect an upgrade from 5.6.9 to 6.2.4 to happen is:

1. Upgrade all client applications not to use any features that were deprecated in 5.6.9.
2. Upgrade Elasticsearch via a rolling upgrade (a rough sketch of the node-by-node procedure is included below).
These steps do not need to happen simultaneously: by design, it should be possible to modify client applications to be compatible with 6.2.4 while they are still running against 5.6.9.
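For the rolling upgrade in step 2, the node-by-node part looks roughly like the sketch below. This is only an outline using the Python client against an assumed localhost node; the rolling upgrade documentation remains the authoritative list of steps:

```python
# Rough outline of the cluster-level calls around restarting ONE node during a
# rolling upgrade; the whole sequence is repeated for each node in turn.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# 1. Stop shards being reallocated while the node is offline.
es.cluster.put_settings(
    body={"transient": {"cluster.routing.allocation.enable": "none"}}
)

# 2. Perform a synced flush so shard recovery is faster when the node returns.
es.indices.flush_synced(ignore=[409])  # a 409 just means some shards were busy

# ... stop the node, install the new Elasticsearch version, start it again ...

# 3. Re-enable allocation and wait for the cluster to return to green
#    before moving on to the next node.
es.cluster.put_settings(
    body={"transient": {"cluster.routing.allocation.enable": "all"}}
)
es.cluster.health(wait_for_status="green", request_timeout=300)
```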
If your application is trying to create new indices with multiple types then this is a feature that was deprecated in 5.x, and you need to upgrade the application to only create single-type indices before upgrading Elasticsearch. You can do this while still running on 5.x, because of course 5.x supports single-type indices too as a special case of multiple-type indices. Once the application is so upgraded, any existing indices will remain as they are through the upgrade, and you can still index into them and search them, without needing to reindex them into the 6.x format.
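As an illustration, creating a single-type index while still on 5.6 could look like the sketch below. The index name, the single type name doc and the fields are assumptions of mine, not something your application must use:

```python
# Sketch: create an index with exactly one mapping type while still on 5.x.
# Index name, type name and fields are illustrative only.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

es.indices.create(
    index="orders-v1",
    body={
        "mappings": {
            "doc": {  # the one and only type in this index
                "properties": {
                    "customer": {"type": "keyword"},
                    "total": {"type": "double"},
                }
            }
        }
    },
)

# Every document goes into that single type; an index created like this
# continues to work after the upgrade without reindexing.
es.index(index="orders-v1", doc_type="doc", body={"customer": "alice", "total": 42.0})
```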
It's wise to fully rehearse this upgrade in your test environment before performing it in production. This means starting with the same versions of your application and Elasticsearch as you have in production, and running through the upgrade procedure, verifying that everything still works afterwards. If it doesn't, the best thing to do is to reset the test environment to match production and try again with a new procedure. You mentioned using Homebrew to upgrade one of your environments. If this is not how you will upgrade your production environment then it's not a good test of your upgrade procedure.
We're unlikely ever to be able to say "yes, this upgrade definitely won't break your production environment" because there are far too many variables outside of our control. We may be able to offer help with your upgrade procedure if you cannot find a way to do the upgrade without reindexing, but this may require some iteration: in particular, you'll need to be able to reset your test environment to match production and try the upgrade again.