We recently investigated the pros and cons of upgrading Elasticsearch from 2.4 to 5.x on CentOS 6.9. Here are some questions I’d like to clarify. Most of them concern compatibility between the versions, for example, anything that might be deprecated in 5.x.
1. For bulk indexing from a JSON-like format file, is there any change from 2.4 to 5.x? Are the formats compatible with each other?
2. Are there any data type differences between versions, for example, date/timestamp handling in 2.4 vs. 5.x?
3. Is there any difference in query syntax, for example, when using filters? I am especially concerned about any functions or syntax that are deprecated in 5.x.
4. Same question as question 3, but in the context of elasticsearch-py: are there any compatibility issues going from 2.4 to 5.x?
5. Can indices created in 2.4 be read and written in 5.x?
6. What are the major benefits of upgrading to 5.x? For example, is indexing or querying faster in 5.x than in 2.4?
7. Is it possible to install two versions on the same machine, so that I can first run some tests after the migration?
You're generally going to want to have a read over the breaking changes documentation and run the elasticsearch-migration plugin on your 2.4 cluster before proceeding. Many of the questions you've asked are in that documentation and will be flagged by the migration checker as well. For example:
You can run multiple versions on the same machine, but not (at least not easily) if you're installing from deb/rpm. You can run them from unpacked zip/tar packages instead; just make sure you set up different ports for them and don't try to form a cluster between the two. Also, FYI, we've started releasing Docker images, if that's more your thing.
There's a deprecation log that you can turn on to find queries you're using that are deprecated and may be removed in 5.x. The upgrade assistant can also help you turn this on.
You may want to watch our webinar on upgrading to 5, which includes a demo of the migration assistant.
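To make the side-by-side point above a bit more concrete, here is a minimal sketch of how you might generate separate `elasticsearch.yml` files for the two tarball installs and sanity-check that they can't collide. The cluster names, ports, and data paths are made-up placeholders, and `NodeSpec`, `render_yml`, and `conflicts` are hypothetical helpers, not anything shipped with Elasticsearch. The setting names (`cluster.name`, `http.port`, `transport.tcp.port`, `path.data`) are the ones used in both 2.4 and 5.x.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Settings for one standalone test node (all values are illustrative)."""
    cluster_name: str
    http_port: int
    transport_port: int
    data_dir: str


def render_yml(spec: NodeSpec) -> str:
    """Render a minimal elasticsearch.yml for a single node."""
    return (
        f"cluster.name: {spec.cluster_name}\n"
        f"http.port: {spec.http_port}\n"
        f"transport.tcp.port: {spec.transport_port}\n"
        f"path.data: {spec.data_dir}\n"
    )


def conflicts(a: NodeSpec, b: NodeSpec) -> list:
    """Return the settings that would let two nodes step on each other."""
    problems = []
    if a.cluster_name == b.cluster_name:
        problems.append("cluster.name")  # same name: nodes may try to join
    if {a.http_port, a.transport_port} & {b.http_port, b.transport_port}:
        problems.append("ports")  # any shared port is a bind conflict
    if a.data_dir == b.data_dir:
        problems.append("path.data")  # never share a data directory
    return problems


# Hypothetical layout: 2.4 keeps the default ports, 5.x moves up by one.
OLD = NodeSpec("es24-test", 9200, 9300, "/tmp/es24/data")
NEW = NodeSpec("es5-test", 9201, 9301, "/tmp/es5/data")

if __name__ == "__main__":
    assert not conflicts(OLD, NEW)
    print(render_yml(NEW))
```

Giving each install its own `cluster.name` is the simplest way to make sure the two nodes never try to form a cluster, even if one of them can reach the other's transport port.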
As to the question "why upgrade to 5.x," I answered a similar question back when we released 5.0. Since I wrote that, we have released 5.1 through 5.6, which also include a huge number of improvements: cross-cluster search, field collapsing, various optimizations in query execution, new "range" field types that store an entire range of values (IP ranges, date ranges, numeric ranges), and more.
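As a rough illustration of the range field types mentioned above, the sketch below builds a 5.x-style mapping and a matching `range` query as plain Python dicts, which you could then pass to whatever client you use. The index type name `my_type`, the field name `time_frame`, and the helper functions are assumptions for the example; only the numeric and date range types are listed, since those are the ones I'm sure of.

```python
# Range types assumed available in later 5.x releases (not in 2.4).
RANGE_TYPES = {
    "integer_range", "float_range", "long_range", "double_range", "date_range",
}
# How the query range must relate to the stored range.
RELATIONS = {"intersects", "within", "contains"}


def range_mapping(doc_type: str, field: str, range_type: str = "date_range") -> dict:
    """Build a create-index body that maps `field` as a range type."""
    if range_type not in RANGE_TYPES:
        raise ValueError(f"unsupported range type: {range_type}")
    # 5.x mappings are still keyed by a document type name.
    return {"mappings": {doc_type: {"properties": {field: {"type": range_type}}}}}


def range_query(field: str, gte, lte, relation: str = "intersects") -> dict:
    """Build a range query against a range field."""
    if relation not in RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    return {"query": {"range": {field: {"gte": gte, "lte": lte, "relation": relation}}}}


if __name__ == "__main__":
    print(range_mapping("my_type", "time_frame"))
    print(range_query("time_frame", "2015-10-31", "2015-11-01", relation="within"))
```

A document would then store the range itself as an object, for example `{"time_frame": {"gte": "2015-10-31", "lte": "2015-11-01"}}`, rather than as two separate start and end fields.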