Cluster troubles, Azure related?

We are prepping to launch our app into production and seem to be having
some stability issues. We have a cluster of 4 VMs on Azure that all use the
Azure plugin for discovery. Most of the time it works as expected, but
sometimes it loses its mind. This morning, for example, I made adjustments
to the memory allocated to the JVM of all nodes. I rebooted all of the
nodes, one at a time, waiting for a green status before rebooting the next
node. When I rebooted the fourth node, the cluster status turned red (as
per node #1). Node 1 only reported that nodes 1 and 2 were in the cluster.
I waited and nothing changed. I eventually checked the node status on node
3 and found that nodes 3 and 4 had formed their own cluster. I ended up in
a state where nodes 1 and 2 were in a cluster, with 2 being the master,
while 3 and 4 were in a separate cluster, with 3 being the master. I
stopped the elasticsearch service on 3 and 4 and then started the services
up again. They correctly found the cluster of nodes 1 and 2 and all is well
again. Why would this happen, and how can I prevent it from happening? On
node three I found some interesting log reports that I have copied to
https://gist.github.com/theikell/9948b1d318cdc4cd0ecf

Thanks.

Tim

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d2c6462d-8789-4b9f-9776-ea368f7f5661%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

It looks like you did not configure minimum_master_nodes.

Jörg

Thanks for the reply, Jörg. I have discovery.zen.minimum_master_nodes=2.
Should it be something different?

On Tuesday, September 16, 2014 11:21:16 AM UTC-7, Jörg Prante wrote:

It looks like you did not configure minimum_master_nodes

Jörg

Ah, I just found the n/2+1 recommendation, so I expect I need to set it to
3.
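
To illustrate the n/2+1 rule, here is a minimal Python sketch of the arithmetic. It assumes all four nodes are master-eligible, which the thread does not state, and the function names are made up for this example. It is not Elasticsearch code. It only shows why a value of 2 lets both halves of a 2/2 split elect a master, while 3 does not.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def sides_that_can_elect(partition_sizes, minimum_master_nodes):
    """Return the partition sizes that still satisfy minimum_master_nodes."""
    return [size for size in partition_sizes if size >= minimum_master_nodes]


if __name__ == "__main__":
    nodes = 4        # assumption: every node is master-eligible
    split = [2, 2]   # nodes 1+2 on one side, nodes 3+4 on the other

    # With the setting at 2, both halves qualify, so two masters can exist.
    print(sides_that_can_elect(split, 2))              # [2, 2]

    # The n/2+1 recommendation for 4 nodes.
    print(quorum(nodes))                               # 3

    # With the setting at 3, neither half of a 2/2 split can elect a master.
    print(sides_that_can_elect(split, quorum(nodes)))  # []
```

With the value at 3, restarting one node at a time still leaves three nodes up, which is enough to keep or elect a master. A 2/2 split would leave both sides without a master instead of producing two separate clusters.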

On Tuesday, September 16, 2014 11:30:38 AM UTC-7, Tim Heikell wrote:

Thanks for the reply Jörg. I have discovery.zen.minimum_master_nodes=2.
Should it be something different?
