Hi,
I am getting ping timeouts from time to time. Recently the master (192.168.5.4)
did not reply fast enough (the timeout is 1 min), another master (192.168.5.2)
was elected, and both stayed active.
A status request on the old master gave a NullPointerException; a
status request on the new master gave a green status with 39 nodes
(the old master was missing and did not rejoin the cluster).
A _cluster/state request on the old master still reported the old node
as master, so it never correctly lost its master status. The cluster
as a whole still seemed to be running, though: all indices were green
and the rivers were indexing correctly.
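For reference, this is roughly how I query each node directly for its health
and for the master it currently believes in; a minimal sketch with the 0.17.x
Java API (host is passed in, client settings are examples only):

import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.action.admin.cluster.state.ClusterStateResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

public class MasterCheck {
    public static void main(String[] args) {
        // Connect to one specific node (e.g. 192.168.5.4 or 192.168.5.2) and ask
        // only that node, so we see its own view of the cluster.
        TransportClient client = new TransportClient(ImmutableSettings.settingsBuilder()
                .put("cluster.name", "trendictionsearch")
                .put("client.transport.sniff", false) // do not fan out to other nodes
                .build());
        client.addTransportAddress(new InetSocketTransportAddress(args[0], 9300));

        ClusterHealthResponse health = client.admin().cluster().prepareHealth()
                .execute().actionGet();
        ClusterStateResponse state = client.admin().cluster().prepareState()
                .execute().actionGet();

        System.out.println("status: " + health.status()
                + ", nodes: " + health.numberOfNodes()
                + ", master: " + state.state().nodes().masterNode());
        client.close();
    }
}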
After programmatically closing about 10 indices, the metadata got
messed up. It seems that two master nodes were working against each
other: the old master updated its stale version of the state and
propagated the changes, resulting in an incorrect state with outdated
information about shards that were no longer on the indicated nodes.
The old and new master appear to have been in a race condition over
updating the index status, because afterwards some indices were
closed while others remained open. The other nodes were monitoring
two masters and accepted state from both, which resulted in repeated
restarts of the fault detection.
Afterwards the nodes started deleting indices that were not assigned
to them at the point where the old master lost its master status.
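The closing was done roughly like this; a simplified sketch assuming the Java
admin client was used (index names are illustrative):

import org.elasticsearch.client.Client;

public class CloseIndices {
    // Simplified sketch, assuming the indices were closed through the Java
    // admin client; index names are illustrative.
    public static void closeAll(Client client, String... indices) {
        for (String index : indices) {
            // Each close is a cluster state update that goes through the elected
            // master; with two active masters the two copies of the state can
            // diverge at exactly this point.
            client.admin().indices().prepareClose(index).execute().actionGet();
        }
    }
}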
My current configuration is the following (a settings sketch follows the list):
- es 0.17.10
- 40 nodes, 5 master nodes
- minimum_master_nodes set to 3
- 300 indices and 30 rivers
- shards per index: 1, replicas: 1
- unicast discovery and local gateway
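For completeness, a minimal sketch of the settings described above, written
with the 0.17.x Java settings builder (the same keys go into elasticsearch.yml);
the unicast host list is a placeholder for the five master-eligible nodes:

import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;

public class DiscoverySettingsSketch {
    public static Settings build() {
        return ImmutableSettings.settingsBuilder()
                .put("gateway.type", "local")                        // local gateway
                .put("discovery.zen.ping.multicast.enabled", false)  // unicast discovery only
                .put("discovery.zen.ping.unicast.hosts",
                     "master-1:9300,master-2:9300,master-3:9300,master-4:9300,master-5:9300") // placeholder hosts
                .put("discovery.zen.minimum_master_nodes", 3)
                .put("discovery.zen.fd.ping_interval", "1s")         // values as seen in the logs below
                .put("discovery.zen.fd.ping_timeout", "1m")
                .put("discovery.zen.fd.ping_retries", 3)
                .put("index.number_of_shards", 1)
                .put("index.number_of_replicas", 1)
                .build();
    }
}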
Log extract from the new master node (192.168.5.2) when the timeout came:
[2012-01-27 18:08:47,921][DEBUG][zen.fd][main] [Mondo] [master] uses
ping_interval [1s], ping_timeout [1m], ping_retries [3]
[2012-01-27 18:08:47,922][DEBUG][zen.fd][main] [Mondo] [node ] uses
ping_interval [1s], ping_timeout [1m], ping_retries [3]
[2012-01-27 18:10:08,823][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-1]
[Mondo] [master] starting fault detection against master [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [initial_join]
[2012-01-28 05:00:33,862][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-1047]
[Mondo] [master] failed to ping [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
tried [3] times, each with maximum [1m] timeout
[2012-01-28 05:00:33,863][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-1047]
[Mondo] [master] stopping fault detection against master [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [master failure, failed to ping, tried [3] times, each with
maximum [1m] timeout]
[2012-01-28 05:00:33,863][INFO
][discovery.zen][elasticsearch[cached]-pool-11-thread-1044] [Mondo]
master_left [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [failed to ping, tried [3] times, each with maximum [1m]
timeout]
[2012-01-28 05:00:33,864][INFO
][cluster.service][elasticsearch[Mondo]clusterService#updateTask-pool-21-thread-1]
[Mondo] master {new
[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true},
previous [Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}},
removed {[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true},},
reason: zen-disco-master_failed ([Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true})
While closing the indices, when the state got mixed up (192.168.5.2):
[2012-01-30 10:38:08,378][WARN ][discovery.zen][New I/O server worker
#1-7] [Mondo] master should not receive new cluster state from [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}]
[2012-01-30 10:38:08,570][WARN ][discovery.zen][New I/O server worker
#1-7] [Mondo] master should not receive new cluster state from [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}]
[2012-01-30 10:38:08,801][WARN ][discovery.zen][New I/O server worker
#1-7] [Mondo] master should not receive new cluster state from [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}]
On the other nodes it seems that there are two master nodes:
[2012-01-27 18:17:40,855][DEBUG][zen.fd][main] [Ev Teel Urizen]
[master] uses ping_interval [1s], ping_timeout [1m], ping_retries [3]
[2012-01-27 18:17:40,857][DEBUG][zen.fd][main] [Ev Teel Urizen] [node
] uses ping_interval [1s], ping_timeout [1m], ping_retries [3]
[2012-01-27 18:18:01,728][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-1]
[Ev Teel Urizen] [master] starting fault detection against master
[[Abner Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [initial_join]
[2012-01-27 18:18:07,766][INFO ][discovery][main] [Ev Teel Urizen]
trendictionsearch/a4rRT6_BSNeceIpHf9AVkw
[2012-01-28 05:06:52,403][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-832]
[Ev Teel Urizen] [master] failed to ping [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
tried [3] times, each with maximum [1m] timeout
[2012-01-28 05:06:52,404][DEBUG][zen.fd][elasticsearch[cached]-pool-11-thread-832]
[Ev Teel Urizen] [master] stopping fault detection against master
[[Abner Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [master failure, failed to ping, tried [3] times, each with
maximum [1m] timeout]
[2012-01-28 05:06:52,406][DEBUG][zen.fd][elasticsearch[Ev Teel
Urizen]clusterService#updateTask-pool-21-thread-1] [Ev Teel Urizen]
[master] restarting fault detection against master
[[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true}],
reason [possible elected master since master left (reason = failed to
ping, tried [3] times, each with maximum [1m] timeout)]
[2012-01-28 09:26:56,244][DEBUG][zen.fd][elasticsearch[Ev Teel
Urizen]clusterService#updateTask-pool-21-thread-1] [Ev Teel Urizen]
[master] restarting fault detection against master [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [new cluster stare received and we monitor the wrong master
[[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true}]]
[2012-01-28 09:26:59,252][DEBUG][zen.fd][elasticsearch[Ev Teel
Urizen]clusterService#updateTask-pool-21-thread-1] [Ev Teel Urizen]
[master] restarting fault detection against master
[[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true}],
reason [new cluster stare received and we monitor the wrong master
[[Abner Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}]]
[2012-01-28 09:26:59,258][DEBUG][zen.fd][elasticsearch[Ev Teel
Urizen]clusterService#updateTask-pool-21-thread-1] [Ev Teel Urizen]
[master] restarting fault detection against master [[Abner
Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}],
reason [new cluster stare received and we monitor the wrong master
[[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true}]]
[2012-01-28 09:26:59,260][DEBUG][zen.fd][elasticsearch[Ev Teel
Urizen]clusterService#updateTask-pool-21-thread-1] [Ev Teel Urizen]
[master] restarting fault detection against master
[[Mondo][BTiSBXfRQWqgTI1R19WGSA][inet[/192.168.5.2:9300]]{master=true}],
reason [new cluster stare received and we monitor the wrong master
[[Abner Little][-kmiCv3gTpqGqBmsgOBe0A][inet[/192.168.5.4:9300]]{master=true}]]
Afterwards, on the data nodes themselves, data is being deleted
(taking the wrong state into account):
[2012-01-30 10:50:56,201][DEBUG][indices.store][elasticsearch[Gorilla
Girl]clusterService#updateTask-pool-21-thread-1] [Gorilla Girl]
[search_index_356] deleting index that is no longer in the cluster
meta_date
I want to update to es 0.18 and am wondering whether the same
behaviour is still present in 0.18.
My questions are:
- Why does the old master not rejoin the cluster correctly? My best
bet is that it still thinks it is part of a cluster of which it is
still the master, which would also explain why it updates the state.
- Why does the old master not lose its master status when the other
one is elected? Is the fault detection behaving correctly when a node
only temporarily goes down under load?
- I thought that a node could only see one elected master node, and
that setting minimum_master_nodes to 3 with 5 master nodes would make
it impossible for two master nodes to be active at the same time.
- Why are the ping timeouts arriving so regularly? Could it be some
threading issue, like the one with the starting of the rivers after a
full cluster restart, since we are using many indices, many nodes and
many rivers?
Also, is it possible to not delete the indices locally on the nodes,
so that I might have a chance to recover the deleted indices by
copying them over to the node where they are expected to be found?
Thanks,
Michel