Hello guys, I need some advice.
Is it possible to configure an Elasticsearch 2.3.4 cluster so that nodes located on different servers restore their connection automatically after network trouble?
I have the following configuration:
server1 - node1:
cluster.name: name
node.name: node-1
node.master: true
node.data: false
index.number_of_shards: 2
index.number_of_replicas: 1
index.refresh_interval: 15s
threadpool.search.queue_size: 10000
path.logs: /data/logs/elasticsearch/node1
bootstrap.mlockall: true
network.publish_host: server1
network.bind_host: 0
discovery.zen.ping.unicast.hosts: ["server1","server2"]
server1 - node2:
cluster.name: eventhandler-main-db
node.name: node-2
node.master: false
node.data: true
index.number_of_shards: 2
index.number_of_replicas: 1
index.refresh_interval: 15s
threadpool.search.queue_size: 10000
path.logs: /data/logs/elasticsearch/node2
bootstrap.mlockall: true
network.publish_host: server1
network.bind_host: 0
discovery.zen.ping.unicast.hosts: ["server1", "server2"]
The second server, server2, has a similar configuration.
After some network trouble I got a NodeDisconnectedException:
[2017-09-12 18:02:25,111][INFO ][cluster.service ] [node-2] removed {{node-3}{8W1PsU7zR72UhnKTY_h2gQ}{server2}{server2:9300}{data=false, master=true},}, reason: zen-disco-master_failed ({node-3}{8W1PsU7zR72UhnKTY_h2gQ}{server2}{server2:9300}{data=false, master=true})
[2017-09-12 18:02:25,127][DEBUG][action.admin.cluster.health] [node-2] connection exception while trying to forward request with action name [cluster:monitor/health] to master node [{node-3}{8W1PsU7zR72UhnKTY_h2gQ}{server2}{server2:9300}{data=false, master=true}], scheduling a retry. Error: [org.elasticsearch.transport.NodeDisconnectedException: [node-3][server2:9300][cluster:monitor/health] disconnected]
[2017-09-12 18:02:25,127][DEBUG][action.admin.cluster.health] [node-2] connection exception while trying to forward request with action name [cluster:monitor/health] to master node [{node-3}{8W1PsU7zR72UhnKTY_h2gQ}{server2}{server2:9300}{data=false, master=true}], scheduling a retry. Error: [org.elasticsearch.transport.NodeDisconnectedException: [node-3][server2:9300][cluster:monitor/health] disconnected]
[2017-09-12 18:02:25,128][DEBUG][action.admin.cluster.state] [node-2] connection exception while trying to forward request with action name [cluster:monitor/state] to master node [{node-3}{8W1PsU7zR72UhnKTY_h2gQ}{server2}{server2:9300}{data=false, master=true}], scheduling a retry. Error: [org.elasticsearch.transport.NodeDisconnectedException: [node-3][server2:9300][cluster:monitor/state] disconnected]
[2017-09-12 18:02:25,130][DEBUG][action.admin.cluster.health] [node-2] timed out while retrying [cluster:monitor/health] after failure (timeout [30s])
NodeDisconnectedException[[node-3][server2:9300][cluster:monitor/health] disconnected]
[2017-09-12 18:02:29,524][INFO ][cluster.service ] [node-2] detected_master {node-1}{F1CvVKftTnmRezujp7Werw}{server1}{server1:9300}{data=false, master=true}, added {{node-1}{F1CvVKftTnmRezujp7Werw}{server1}{server1:9300}{data=false, master=true},}, reason: zen-disco-receive(from master [{node-1}{F1CvVKftTnmRezujp7Werw}{server1}{server1:9300}{data=false, master=true}])
[2017-09-12 18:03:29,485][WARN ][transport ] [node-2] Received response for a request that has timed out, sent [58961ms] ago, timed out [28961ms] ago, action [internal:discovery/zen/fd/master_ping], node [{node-1}{F1CvVKftTnmRezujp7Werw}{server1}{server1:9300}{data=false, master=true}], id [2706]
[2017-09-12 18:03:59,575][INFO ][cluster.service ] [node-2] removed {{node-4}{TthpIEPcSBCE3Irzzhblvw}{server2}{server2:9301}{master=false},}, reason: zen-disco-receive(from master [{node-1}{F1CvVKftTnmRezujp7Werw}{server1}{server1:9300}{data=false, master=true}])
[2017-09-12 18:04:00,123][DEBUG][action.search ] [node-2] Node [TthpIEPcSBCE3Irzzhblvw] not available for scroll request [cXVlcnlUaGVuRmV0Y2g7MjsyMjpUdGhwSUVQY1NCQ0UzSXJ6emhibHZ3OzIzOlR0aHBJRVBjU0JDRTNJcnp6aGJsdnc7MDs=]
[2017-09-12 18:04:00,123][DEBUG][action.search ] [node-2] Node [TthpIEPcSBCE3Irzzhblvw] not available for scroll request [cXVlcnlUaGVuRmV0Y2g7MjsyMjpUdGhwSUVQY1NCQ0UzSXJ6emhibHZ3OzIzOlR0aHBJRVBjU0JDRTNJcnp6aGJsdnc7MDs=]
[2017-09-12 18:04:01,385][DEBUG][action.admin.cluster.node.info] [node-2] failed to execute on node [TthpIEPcSBCE3Irzzhblvw]
NodeDisconnectedException[[node-4][server2:9301][cluster:monitor/nodes/info[n]] disconnected]
[2017-09-12 18:04:01,387][WARN ][action.index ] [node-2] [events-2017.09.12][1] failed to perform indices:data/write/index[r] on node {node-4}{TthpIEPcSBCE3Irzzhblvw}{server2}{server2:9301}{master=false}
NodeDisconnectedException[[node-4][server2:9301][indices:data/write/index[r]] disconnected]
[2017-09-12 18:04:01,410][WARN ][action.index ] [node-2] [events-2017.09.12][1] failed to perform indices:data/write/index[r] on node {node-4}{TthpIEPcSBCE3Irzzhblvw}{server2}{server2:9301}{master=false}
NodeDisconnectedException[[node-4][server2:9301][indices:data/write/index[r]] disconnected]
[2017-09-12 18:04:01,387][WARN ][action.index ] [node-2] [events-2017.09.12][0] failed to perform indices:data/write/index[r] on node {node-4}{TthpIEPcSBCE3Irzzhblvw}{server2}{server2:9301}{master=false}
The workaround is to restart the master node on one of the two servers, after which the cluster is restored.
Is it possible to automate this recovery? Are there any ways to do it?
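
For context, none of the zen discovery settings for master election and fault detection are set in the configs above, so the 2.x defaults apply (the 30s master_ping timeout in the log matches the default). A sketch of the settings that may be relevant, which would go into elasticsearch.yml on every node. The values are untested assumptions, not a confirmed fix:

```yaml
# Sketch only: values are assumptions and have not been tested on this cluster.
# Two nodes are master-eligible here (node-1 and node-3), so a quorum is 2.
discovery.zen.minimum_master_nodes: 2
# Fault detection between the master and the other nodes.
# The values below are the Elasticsearch 2.x defaults, listed as a starting point for tuning.
discovery.zen.fd.ping_interval: 1s
discovery.zen.fd.ping_timeout: 30s
discovery.zen.fd.ping_retries: 3
```

Note that with only two master-eligible nodes and a quorum of 2, the cluster has no master whenever either of them is unreachable, so a third master-eligible node is commonly recommended.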