I'm upgrading an application to use Elasticsearch 5.6.2 rather than 2.4.1, and I'm trying to change our design a bit, since it moved from PoC to production a few years ago and never changed.
So essentially, in the existing system we have a one-node cluster. The index is built in the target folder of a Java app using NodeClient. We then stop Elasticsearch on the test/UAT/live server and copy the index over to refresh the data. This has some advantages (atomic, and separate until it has passed verification) and some disadvantages (manual, requires taking the service down).
My plan is to rewrite the Java app using the Elasticsearch Java REST Client and create the new index on a separate cluster (this may evolve into using an alias instead, I think, but baby steps...). This way I'm hoping to slowly evolve the application towards a more automated deployment (they will always want to view the new index before deployment, so I don't think we'll ever evolve to a live-update situation).
I would be interested if anyone has any reassurance or negative comments about this as a strategy, but my main question is, sadly, a lot more basic. In the old version each cluster's data was stored in its own folder in the data directory, so something like
E:\ES_HOME\data\appdatadev on dev,
E:\ES_HOME\data\appdatauat on UAT,
E:\ES_HOME\data\appdata on live,
etc.
The first thing I had to do when upgrading to 5.6.2 was to remove this named cluster folder so that the nodes folder is in data ->
E:\ES_HOME\data\nodes
(as opposed to E:\ES_HOME\data\appdatadev\nodes for example).
I'm now struggling to find a reference on how to separate clusters running on the same instance. I'm sure I read about this a couple of days ago when I set it up, but I can't find it now.
As far as I know, nodes join clusters based on the cluster name. So, as long as you name the cluster differently in the Elasticsearch config file, you should be fine.
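A minimal sketch of what that could look like, assuming two clusters share one machine. Each gets its own `cluster.name`, and because 5.x keeps the nodes folder directly under the data path, each also gets its own `path.data` and ports so they cannot collide. The `appdatadev-build` name, its folder and the 9201/9301 ports are made up for illustration; `cluster.name`, `path.data`, `http.port` and `transport.tcp.port` are the standard `elasticsearch.yml` setting names. The Java below only renders the two config fragments, it does not start anything:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClusterConfigSketch {

    // Renders the few elasticsearch.yml settings that keep one cluster
    // apart from another on the same machine.
    public static String renderConfig(String clusterName, String dataPath,
                                      int httpPort, int transportPort) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("cluster.name", clusterName);        // nodes only join a cluster with the same name
        settings.put("path.data", dataPath);              // separate data folder per cluster
        settings.put("http.port", String.valueOf(httpPort));
        settings.put("transport.tcp.port", String.valueOf(transportPort));

        StringBuilder yaml = new StringBuilder();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            yaml.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return yaml.toString();
    }

    public static void main(String[] args) {
        // Existing cluster, using the folder name from the question.
        System.out.print(renderConfig("appdatadev", "E:\\ES_HOME\\data\\appdatadev", 9200, 9300));
        System.out.println("---");
        // Hypothetical second cluster used for building the new index.
        System.out.print(renderConfig("appdatadev-build", "E:\\ES_HOME\\data\\appdatadev-build", 9201, 9301));
    }
}
```

Each fragment would go into the config file of its own Elasticsearch installation (or config directory), so the two nodes never see each other's name, data or ports.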
Thanks both. I've not done too much with this yet, as I've realised I need to move from the TransportClient to the high- and low-level REST clients. I think the param answer is better suited: once I work out how to send the param (I'm currently getting "unknown setting", which is a bit irritating!), I think it will be the more easily configurable option.
Apologies, I've wandered into a bit of a mess: I thought I was tidying up some Java code and have slowly moved to the idea of a complete rewrite, as I thought I'd move from 2.4 to 5.6 while I was there. You've given me a massive amount of help on a different thread, and as a result I think I was getting a bit twisted up between old and new approaches (current working code with XML config, initial attempt at code config with TransportClient, now the low- and high-level REST clients).
FYI, I was doing this in various forms (with a file object, and with relative and absolute paths, I think).
To be honest, I was taking this path when I thought I was making minor mods to the Elasticsearch part of the code (just moving it from XML config to code config), but now that I know it is such a major reworking, I think I will take a step back and try to choose the best way forward for building a replacement 'index-set'/cluster first (aliases, new indexes or all indexes under a new cluster) before pushing forward. Probably out of the scope of this question!
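For the alias option mentioned above, here is a rough sketch of the request body that would switch an alias from the old index to a freshly built one. It targets the `_aliases` endpoint, which applies the `remove` and `add` actions in one request. The alias and index names (`appdata`, `appdata_v1`, `appdata_v2`) are hypothetical, and the code only builds the JSON string; sending it with the REST client is left out.

```java
public class AliasSwapSketch {

    // Builds the JSON body for POST /_aliases that moves an alias from the
    // old index to the new one in a single request.
    // Assumes the names are plain lowercase identifiers, so no JSON escaping is done.
    public static String aliasSwapBody(String alias, String oldIndex, String newIndex) {
        return "{\"actions\":["
            + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
            + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}"
            + "]}";
    }

    public static void main(String[] args) {
        // Hypothetical names: the application always queries "appdata".
        System.out.println(aliasSwapBody("appdata", "appdata_v1", "appdata_v2"));
    }
}
```

With this approach the new index can be built and checked under its own name, and the application only sees it once the alias is moved.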