This is working perfectly! I’m in the middle of a rolling restart of a test cluster with no issues:
elasticsearch- "number" : "1.4.0",
elasticsearch- "number" : "1.4.0",
elasticsearch- "number" : "1.3.4",
elasticsearch- "number" : "1.3.4",
elasticsearch- "number" : "1.3.4",
_cluster/health:
"cluster_name" : "elasticsearch",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 5,
"number_of_data_nodes" : 5,
"active_primary_shards" : 1730,
"active_shards" : 3460,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0
I anticipate no other problems finishing this rolling upgrade. Thanks a ton everyone!
From: elasticsearch@googlegroups.com [mailto:elasticsearch@googlegroups.com] On Behalf Of David Pilato
Sent: Monday, November 24, 2014 8:31 AM
To: elasticsearch@googlegroups.com
Subject: Re: 1.4.0 data node can't join existing 1.3.4 cluster
Heya,
We will release AWS plugin 2.4.1 in a few minutes.
It fixes this rolling upgrade issue.
Note that some WARN messages could appear in the old nodes’ logs until the full rolling upgrade is done.
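On each node, the plugin upgrade itself might look roughly like this; a sketch assuming the 1.x bin/plugin script and a default layout (exact flags and paths can differ per install):
# replace the old cloud-aws plugin with the fixed release, then restart the node
bin/plugin --remove cloud-aws
bin/plugin --install elasticsearch/elasticsearch-cloud-aws/2.4.1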
Thank you all for reporting this!
On Sunday, November 23, 2014 at 03:10:42 UTC+1, Ivan Brusic wrote:
Great work everyone. Feel better about upgrading now.
On Nov 22, 2014 4:42 PM, "Boaz Leskes" <b.leskes@gmail.com> wrote:
Hi Christian, Daniel,
I believe I found the issue - it has to do with the cloud plugins (both AWS and GCE) and the way they create the node list for unicast-based discovery. Effectively they mislead it into thinking that all nodes in the cluster are version 1.4.0, which is not correct.
I opened issues for this so it will be corrected soon: https://github.com/elasticsearch/elasticsearch-cloud-aws/issues/143 , https://github.com/elasticsearch/elasticsearch-cloud-gce/issues/41
Cheers,
Boaz
On Saturday, November 22, 2014 7:04:33 PM UTC+2, Jörg Prante wrote:
As said, the change is due to the unicast action, which was split in 1.4.0 into an old and a new action; see this commit:
https://github.com/elasticsearch/elasticsearch/commit/e5de47d928582694c7729d199390086983779e6e
I am not sure if this is a bug. It seems like a feature to prevent multiple masters by accident.
The strategy as described above by Christian Hedegaard should work, though it is still to be considered a work-around:
- setting up all new 1.4 nodes as not master eligible ("data only")
- joining them to a 1.3.x cluster while the master is still on a 1.3 node should work
- then, shutting down all 1.3 nodes (except the master) should relocate the shards
- bringing down the final 1.3 master should "stall" master election (I would also configure a large timeout for master election). This is critical: no index/mapping creations/deletions or other cluster-state-modifying actions should be executed at this point.
- adding a 1.4 master-eligible node should now take over the cluster (I would start it with the data folder from the final 1.3 master, where the last cluster state is persisted), and the critical phase is over.
- from then on, more 1.4 master-eligible nodes can be added
- finally, the minimum_master_nodes setting should be configured (see the config sketch after this list)
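A minimal elasticsearch.yml sketch of the settings those steps refer to; the values, and the assumption of three master-eligible nodes for the quorum, are illustrative rather than taken from the thread:
# step 1: new 1.4 nodes join as data-only, not master eligible
node.master: false
node.data: true
# later: the first 1.4 node that is allowed to take over as master
node.master: true
# finally: with e.g. 3 master-eligible nodes, require a quorum of 2
discovery.zen.minimum_master_nodes: 2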
Jörg
On Fri, Nov 21, 2014 at 1:56 AM, Christian Hedegaard <chedegaard@red5studios.com> wrote:
FYI, I have found a solution that works (at least for me).
I’ve got a small cluster for testing with only 4 v1.3.5 nodes. What I’ve done is bring up 4 new v1.4.0 nodes as data-only machines. In the yaml I added a line to point the new nodes, via unicast, explicitly at the current master:
discovery.zen.ping.unicast.hosts: ["10.210.9.224:9300"]
When I restarted elasticsearch with that setting, with cloud-aws installed and configured on version 2.4.0, the new nodes found the cluster and properly joined it.
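Put together, the workaround amounts to an elasticsearch.yml fragment roughly like this on each new 1.4.0 node; the master/data flags follow the description above, and disabling multicast is an assumption rather than something quoted from the thread:
node.master: false                                        # data-only for now
node.data: true
discovery.zen.ping.multicast.enabled: false               # assumption: rely on unicast only
discovery.zen.ping.unicast.hosts: ["10.210.9.224:9300"]   # current 1.3.x master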
I will now start nuking the old v1.3.5 nodes to migrate the data off of them. Before the final 1.3.5 node is nuked, I will change the config on one of the v1.4.0 nodes to allow it to become master and restart it.
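A gentler alternative to simply killing each old node is to drain it first with allocation filtering and wait for relocation to finish; a sketch, where 10.210.9.225 is a hypothetical address standing in for the old node:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient" : { "cluster.routing.allocation.exclude._ip" : "10.210.9.225" }
}'
# safe to stop the node once status is green and relocating_shards is 0
curl -s 'localhost:9200/_cluster/health?pretty'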
I’m not sure if the master step is needed or not, but I was very afraid of a split-brain problem. I have another 4-node testing cluster on which I will be able to try this upgrade again in a more controlled manner.
I’m NOT looking forward to upgrading our current production cluster this way (15 data-only nodes, 3 master-only nodes).
So it would appear that the problem is somewhere in the unicast discovery code. The question is: who’s to blame, Elasticsearch or the cloud-aws plugin?
From: Boaz Leskes [mailto:b.leskes@gmail.com]
Sent: Wednesday, November 19, 2014 2:27 PM
To: elasticsearch@googlegroups.com
Cc: Christian Hedegaard
Subject: Re: 1.4.0 data node can't join existing 1.3.4 cluster
Hi Christian,
I'm not sure which thread you are referring to exactly, but this shouldn't happen. Can you describe the problem you're having in some more detail? Anything in the logs of the nodes (both the 1.4 node and the master)?
Cheers,
Boaz
On Wednesday, November 19, 2014 2:39:57 AM UTC+1, Christian Hedegaard wrote:
I found this thread while trying to research the same issue, and it looks like there is currently no resolution. We like to keep up with our elasticsearch upgrades as often as possible and do rolling upgrades to keep our clusters up. When testing, I’m having the same issue: I cannot add a 1.4.0 box to the existing 1.3.4 cluster.
Is there a fix for this anticipated?