My cluster consists of 3 instances, which I refer to by the last part of their IPs: 15, 16 and 17 (10.32.240.15 to 10.32.240.17).
This morning, instance 17 left the cluster.
In the elasticsearch-head plugin on instance 15, instance 17's status is "Unassigned" and instance 16 cannot be found.
What happened?
Could somebody please help me?
- Log messages from instance 17 are below:
[2014-04-20 03:29:28,539][INFO ][discovery.zen ] [10.32.240.17] master_left [[10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2014-04-20 03:29:28,540][INFO ][cluster.service ] [10.32.240.17] master {new [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]], previous [10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]}, removed {[10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]],}, reason: zen-disco-master_failed ([10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]])
[2014-04-20 03:30:01,320][DEBUG][action.admin.cluster.node.stats] [10.32.240.17] failed to execute on node [a0qNnjLvQSauGEddNxKmNw]
org.elasticsearch.index.engine.EngineClosedException: [jp_listened_calcu_log][0] CurrentState[CLOSED]
-
- Log messages from instance 15:
[2014-04-20 03:27:18,747][INFO ][discovery.zen ] [10.32.240.15] master_left [[10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2014-04-20 03:27:18,757][INFO ][cluster.service ] [10.32.240.15] master {new [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]], previous [10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]}, removed {[10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]],}, reason: zen-disco-master_failed ([10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]])
[2014-04-20 03:28:28,544][WARN ][transport ] [10.32.240.15] Received response for a request that has timed out, sent [68787ms] ago, timed out [38787ms] ago, action [discovery/zen/fd/masterPing], node [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]], id [10310608]
[2014-04-20 03:28:28,544][WARN ][transport ] [10.32.240.15] Received response for a request that has timed out, sent [38787ms] ago, timed out [8787ms] ago, action [discovery/zen/fd/masterPing], node [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]], id [10310609]
[2014-04-20 03:28:28,552][INFO ][discovery.zen ] [10.32.240.15] master_left [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]], reason [no longer master]
[2014-04-20 03:28:28,557][INFO ][cluster.service ] [10.32.240.15] master {new [10.32.240.15][dE_q8O-dT-SeUlTBuM-yiQ][inet[/10.32.240.15:21001]], previous [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]}, removed {[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]],}, reason: zen-disco-master_failed ([10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]])
[2014-04-20 03:29:28,546][WARN ][discovery.zen ] [10.32.240.15] received cluster state from [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]] which is also master but with an older cluster_state, telling [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]] to rejoin the cluster
[2014-04-20 03:29:28,548][WARN ][discovery.zen ] [10.32.240.15] failed to send rejoin request to [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]]
org.elasticsearch.transport.SendRequestTransportException: [10.32.240.17][inet[/10.32.240.17:21001]][discovery/zen/rejoin]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.discovery.zen.ZenDiscovery$7.execute(ZenDiscovery.java:541)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:298)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [10.32.240.17][inet[/10.32.240.17:21001]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:834)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:532)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 7 more
[2014-04-20 03:29:28,603][WARN ][discovery.zen ] [10.32.240.15] received cluster state from [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]] which is also master but with an older cluster_state, telling [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]] to rejoin the cluster
[2014-04-20 03:29:28,604][WARN ][discovery.zen ] [10.32.240.15] failed to send rejoin request to [[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]]
org.elasticsearch.transport.SendRequestTransportException: [10.32.240.17][inet[/10.32.240.17:21001]][discovery/zen/rejoin]
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:202)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:173)
at org.elasticsearch.discovery.zen.ZenDiscovery$7.execute(ZenDiscovery.java:541)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:298)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:135)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.elasticsearch.transport.NodeNotConnectedException: [10.32.240.17][inet[/10.32.240.17:21001]] Node not connected
at org.elasticsearch.transport.netty.NettyTransport.nodeChannel(NettyTransport.java:834)
at org.elasticsearch.transport.netty.NettyTransport.sendRequest(NettyTransport.java:532)
at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:189)
... 7 more
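To make the order of events easier to follow, here is a rough Python sketch that pulls the master changes out of `cluster.service` lines like the ones above and sorts them by timestamp. The regular expression and the names `MASTER_CHANGE` and `master_changes` are illustrative assumptions based only on the excerpts in this post; they are not part of Elasticsearch. The sample lines are trimmed after the previous-master field, and the ordering is only meaningful if the clocks on the instances are in sync.

```python
import re

# Pattern for "master {new [X]..., previous [Y]..." lines from cluster.service.
# This is a guess derived from the excerpts in this post, not an official format.
MASTER_CHANGE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[INFO\s*\]\[cluster\.service\s*\] "
    r"\[(?P<node>[^\]]+)\] master \{new \[(?P<new>[^\]]+)\]"
    r".*?previous \[(?P<prev>[^\]]+)\]"
)


def master_changes(lines):
    """Return (timestamp, reporting node, previous master, new master) tuples, sorted by timestamp."""
    events = []
    for line in lines:
        match = MASTER_CHANGE.match(line)
        if match:
            events.append(
                (match.group("ts"), match.group("node"),
                 match.group("prev"), match.group("new"))
            )
    return sorted(events)


# Lines taken from the logs in this post, cut off after the previous-master field.
SAMPLE = [
    "[2014-04-20 03:29:28,540][INFO ][cluster.service ] [10.32.240.17] master "
    "{new [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]], "
    "previous [10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]}",
    "[2014-04-20 03:27:18,757][INFO ][cluster.service ] [10.32.240.15] master "
    "{new [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]], "
    "previous [10.32.240.16][YL2_5dVaTQ-_3Rvm1yKzoA][inet[/10.32.240.16:21001]]}",
    "[2014-04-20 03:28:28,557][INFO ][cluster.service ] [10.32.240.15] master "
    "{new [10.32.240.15][dE_q8O-dT-SeUlTBuM-yiQ][inet[/10.32.240.15:21001]], "
    "previous [10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]}",
    # A discovery.zen line, included to show that non-matching lines are skipped.
    "[2014-04-20 03:28:28,552][INFO ][discovery.zen ] [10.32.240.15] master_left "
    "[[10.32.240.17][a0qNnjLvQSauGEddNxKmNw][inet[/10.32.240.17:21001]]], "
    "reason [no longer master]",
]

events = master_changes(SAMPLE)
for ts, node, prev, new in events:
    print(f"{ts} node {node}: master {prev} -> {new}")
```

Read this way, the excerpts show instance 15 electing itself master at 03:28:28 while instance 17 elects itself at 03:29:28, so both instances consider themselves master at the same time.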
- instance log message
-
The elasticsearch process on instance 17 is still alive:
/usr/bin/java -Xms2G -Xmx2G -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -Delasticsearch -Des.path.home=/home/irteam/apps/elasticsearch-0.90.7 -cp :/home/irteam/apps/elasticsearch-0.90.7/lib/elasticsearch-0.90.7.jar:/home/irteam/apps/elasticsearch-0.90.7/lib/:/home/irteam/apps/elasticsearch-0.90.7/lib/sigar/ org.elasticsearch.bootstrap.ElasticSearch
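Since the process is still running, one way to check whether the instances disagree is to ask each one which master it currently sees. The sketch below is only an illustration: it assumes that `GET /_cluster/state` on the HTTP port (21200 in the configuration) returns a `master_node` field and accepts `local=true` in this version, which should be verified against the 0.90 documentation. The helper names `fetch_master` and `looks_split` are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_PORT = 21200  # http.port from the configuration in this post


def fetch_master(host, port=HTTP_PORT, timeout=5):
    """Return the master node id that `host` reports, or None if it cannot be reached.

    Assumption: the cluster state response contains a "master_node" key and the
    endpoint accepts local=true so that each node answers from its own state.
    """
    url = f"http://{host}:{port}/_cluster/state?local=true"
    try:
        with urlopen(url, timeout=timeout) as resp:
            return json.load(resp).get("master_node")
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts; ValueError covers bad JSON.
        return None


def looks_split(views):
    """True if the reachable nodes report more than one distinct master id."""
    masters = {master for master in views.values() if master is not None}
    return len(masters) > 1


# Offline example using the node ids from the logs: 15 and 17 each see
# themselves as master, and 16 does not answer.
example_views = {
    "10.32.240.15": "dE_q8O-dT-SeUlTBuM-yiQ",
    "10.32.240.16": None,
    "10.32.240.17": "a0qNnjLvQSauGEddNxKmNw",
}
print("split suspected:", looks_split(example_views))
```

Against the live instances, the views could be collected with `{host: fetch_master(host) for host in ["10.32.240.15", "10.32.240.16", "10.32.240.17"]}` and passed to `looks_split`.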
-
Configuration (instance 15):
cluster.name: music-es-beta
node.name: 10.32.240.15
http.port: 21200
transport.tcp.port: 21001
multicast.enabled: false
index.number_of_shards: 3
index.number_of_replicas: 1
index.mapper.dynamic: false
action.auto_create_index: false
bootstrap.mlockall: true
discovery.zen.ping.timeout: 10s
index.cache.field.type: soft
discovery.zen.ping.unicast.hosts: ["10.32.240.15", "10.32.240.16", "10.32.240.17"]
-
How should I configure the ES cluster so that it handles fail-over and fail-back correctly?
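For reference, the configuration above does not set `discovery.zen.minimum_master_nodes`. The usual guidance for that setting is a majority of the master-eligible nodes, so that a node cut off from the others cannot elect itself master. A minimal sketch of the arithmetic, assuming all three instances are master-eligible (the configuration does not say otherwise); the function name is made up for illustration:

```python
def minimum_master_nodes(master_eligible):
    """Smallest majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


# Hosts from discovery.zen.ping.unicast.hosts in the configuration above.
hosts = ["10.32.240.15", "10.32.240.16", "10.32.240.17"]
quorum = minimum_master_nodes(len(hosts))
print(f"discovery.zen.minimum_master_nodes: {quorum}")
```

For three instances this gives 2, meaning a single isolated instance would wait instead of becoming master on its own. Whether this alone explains the behaviour in the logs is not certain.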