Transient Network Outage and Cluster Health

Kenneth_Loafman_2 · November 5, 2010, 2:00pm

Hi,

We have two clusters that both go to yellow when there is a transient
network outage. This happens mostly overnight, perhaps due to some form of
maintenance on the cloud provider's part. The cluster never recovers the
connection and both require a restart of their secondary nodes. Is there a
setting I need to change in order to keep this from happening? The relevant
part of the config file is:

cloud:
    aws:
        access_key: munged
        secret_key: munged
gateway:
    type: s3
    s3:
        bucket: munged
    recover_after_nodes: 2

network:
    host: host0

discovery.zen.ping.multicast:
    enabled: false

discovery.zen.ping.unicast:
    hosts: ["host0:9300","host1:9300"]

...Thanks,
...Ken

kimchy · November 5, 2010, 4:38pm

Currently, when a node gets disconnected from a cluster, it does not rejoin
the cluster automatically; it requires a restart in order to rejoin. I am
working on improving that.

For now, maybe just increase the default fault detection timeouts? Check
this:
http://www.elasticsearch.com/docs/elasticsearch/modules/discovery/zen/#Fault_Detection.
What is the message that you get in the log when it gets disconnected?

-shay.banon
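
As a rough illustration of the suggestion above, the zen fault detection
timeouts could be raised in the same config file along these lines. The
setting names are the discovery.zen.fd ones described in the zen discovery
documentation; the values are assumptions chosen only for illustration, not
recommendations, and both names and defaults should be checked against the
installed version:

```yaml
# Sketch only: more tolerant fault detection for short network blips.
# Values are illustrative assumptions, not tested recommendations.
discovery.zen.fd:
    ping_interval: 5s    # how often a node is pinged
    ping_timeout: 60s    # how long to wait for each ping response
    ping_retries: 6      # failed or timed-out pings before the node is dropped
```

Longer timeouts mean a node that is really down takes longer to be noticed,
so this trades slower failure detection for fewer disconnects during brief
outages.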

