Hi,
I'm having some trouble configuring ES in the cloud. Most of the time
everything works, but sometimes discovery fails and I end up with two
masters using the same cluster name.
This happens on roughly 1 out of 10 startups.
I'm using 0.17.4 embedded; the configuration looks like this:
------------------------------------------------------------
cluster:
  name: default-cluster-name
index:
  number_of_shards: 2
  number_of_replicas: 1
discovery:
  type: ec2
  zen:
    minimum_master_nodes: 1
cloud:
  aws:
    access_key: XXXXXXXXXX
    secret_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
------------------------------------------------------------
Trace logs can be found here: https://gist.github.com/1134288. Any
ideas what I am missing?
Thanks in advance,
Pavel
It seems like the two nodes ended up not seeing each other properly, so
each elected itself as the master. If you increase the ping timeout (it
defaults to 3s), the problem should go away: set
discovery.zen.ping.timeout to something like 10s or 20s.
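For reference, a minimal sketch of how that suggestion might be merged
into the elasticsearch.yml from the original post. The exact key
spelling (ping.timeout vs. ping_timeout) has varied across versions, so
check the docs for yours:
------------------------------------------------------------
discovery:
  type: ec2
  zen:
    ping:
      timeout: 15s   # default is 3s; EC2 discovery pings often need longer
------------------------------------------------------------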
If two nodes do end up in a network partition (even on a local
network), and each therefore promotes itself to master, what happens
when they see each other again?
Nothing, they will remain partitioned, and you will need to decide which
one to restart. The minimum_master_nodes setting is there to help reduce
the chances of this happening. (On a 2-node cluster, though, this
setting does not mean much: a quorum of 2 leaves no failover, and a
value of 1 still allows the split.)
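To make the quorum idea concrete, here is a sketch for a three-node
cluster using the usual (n / 2) + 1 rule; the value below is an
illustration, not something from this thread:
------------------------------------------------------------
discovery:
  zen:
    minimum_master_nodes: 2   # quorum for 3 master-eligible nodes: (3 / 2) + 1 = 2
------------------------------------------------------------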
Sorry to bump an old thread; for completeness I just want to confirm
that setting discovery.zen.ping_timeout to 15s works like a charm in my
case.
Many thanks for the quick response,
Pavel
Hi Pavel, that timeout value will often need to increase based on the
number of non-cluster nodes you have under EC2 management. At least that
has been my experience.
A trick to keep it working well is to make sure that all the nodes in
your Elasticsearch cluster are part of the same EC2 security group, then
use the ES groups setting
(http://www.elasticsearch.org/guide/reference/modules/discovery/ec2.html)
to limit the nodes that ES pings to establish membership.
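A sketch of what that could look like, assuming a security group named
es-cluster (the name is hypothetical; discovery.ec2.groups filters which
instances ES will ping):
------------------------------------------------------------
discovery:
  type: ec2
  ec2:
    groups: es-cluster   # hypothetical security group; only its members are pinged
------------------------------------------------------------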
Two notes on that: you can use EC2 tags as well to filter down the list
of instances that need to be pinged, and, in 0.17, the unicast discovery
is considerably more lightweight compared to previous versions.
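For example, a sketch of the tag filter; the tag name and value here are
made up for illustration:
------------------------------------------------------------
discovery:
  type: ec2
  ec2:
    tag:
      stage: production   # hypothetical tag; only instances tagged stage=production are pinged
------------------------------------------------------------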
Thanks James, we'll make use of that setting. Indeed, our production EC2
environment is quite heterogeneous.
Pavel
Hi James (or anybody else with similar experience),
Given that getting ES to work well under EC2 seems to present a bit of a
challenge, how would you feel about writing a tutorial for elasticsearch.org?