EC2 discovery leads to two masters

Pavel_Penchev · August 9, 2011, 3:19pm

Hi,

I'm having some trouble configuring ES in the cloud. Most of the time
everything works, but sometimes discovery fails and I end up with two
masters using the same cluster name.
This happens on roughly 1 out of 10 startups.

I'm using 0.17.4 embedded; the configuration looks like this:

cluster:
    name: default-cluster-name

index:
    number_of_shards: 2
    number_of_replicas: 1

discovery:
    type: ec2
    zen:
        minimum_master_nodes: 1

cloud:
    aws:
        access_key: XXXXXXXXXX
        secret_key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Trace logs can be found here: https://gist.github.com/1134288. Any ideas
what I am missing?

Thanks in advance,
Pavel

kimchy · August 9, 2011, 6:22pm

It seems like the two nodes ended up not seeing each other properly, thus
each elected itself as the master. If you increase the ping_timeout (it
defaults to 3s) then it should go away. Set discovery.zen.ping.timeout to
something like 10s or 20s.

jjasinek · August 9, 2011, 8:06pm

Shay,

If two nodes end up on opposite sides of a network partition (even on a
local network), and each promotes itself to master, what happens when
they see each other again?

Jason

kimchy · August 9, 2011, 8:34pm

Nothing, they will remain partitioned, and you will need to decide which one
to restart. The minimum_master_nodes setting is there to help reduce the
chances of it happening (on a 2 node cluster, though, this setting does not
mean much).
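As a rough sketch of how minimum_master_nodes is commonly sized: it is usually set to a majority of the master-eligible nodes, that is (N / 2) + 1 with integer division. The three-node cluster below is an assumption for illustration only; the cluster discussed in this thread has two nodes.

```yaml
# Sketch only: assumes 3 master-eligible nodes (hypothetical).
# Majority quorum = (3 / 2) + 1 = 2, using integer division.
discovery:
    type: ec2
    zen:
        # A node will not elect a master unless it can see at least
        # this many master-eligible nodes (itself included).
        minimum_master_nodes: 2
```

With only two nodes the majority is also 2, so neither node could elect a master while the other is down, which is why the setting is of limited use on a 2 node cluster.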

Pavel_Penchev · August 16, 2011, 9:28am

Hi,

Sorry to bump an old thread. For completeness, I just want to confirm
that setting discovery.zen.ping_timeout to 15s works like a charm in my
case.
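
In the nested YAML layout of the original configuration, the change would look roughly like this. It is a sketch that assumes the key nests under discovery.zen the same way minimum_master_nodes does; the other sections of the file stay unchanged.

```yaml
discovery:
    type: ec2
    zen:
        minimum_master_nodes: 1
        # Raised from the 3s default so that slow pings at startup do not
        # lead each node to elect itself master.
        ping_timeout: 15s
```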

Many thanks for the quick response,
Pavel

James_Cook · August 16, 2011, 7:17pm

Hi Pavel, the timeout you need will often grow with the number of
non-cluster nodes you have under EC2 management. At least that has been my
experience.

A trick to keep it working well is to make sure that all the nodes that are
in your Elasticsearch cluster are part of the same EC2 group. Then use the ES
groups setting
(http://www.elasticsearch.org/guide/reference/modules/discovery/ec2.html)
to limit those nodes that ES looks for to establish membership.
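
A minimal sketch of the groups filter, assuming the discovery.ec2.groups setting described on the page linked above. The security group name es-cluster is hypothetical; substitute the group your Elasticsearch instances actually belong to.

```yaml
discovery:
    type: ec2
    ec2:
        # Hypothetical security group name. Only instances in this
        # group are considered when looking for cluster members.
        groups: es-cluster
```

With fewer instances to ping, discovery has less work to do within the ping timeout.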

kimchy · August 16, 2011, 11:42pm

Two notes on that: you can use ec2 tags as well to filter down the list of
instances that need to be pinged, and, in 0.17, the unicast discovery is
considerably more lightweight compared to previous versions.
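
A sketch of tag-based filtering, assuming the discovery.ec2.tag.<name> form of the setting from the EC2 discovery documentation. The tag name role and the value es-node are hypothetical, and it is worth checking that your version supports tag filtering.

```yaml
discovery:
    type: ec2
    ec2:
        tag:
            # Hypothetical tag name and value. Only instances carrying
            # this tag are pinged during discovery.
            role: es-node
```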

Pavel_Penchev · August 18, 2011, 7:50am

Thanks James, we'll make use of the setting. Indeed the production EC2
environment is quite heterogeneous.

Pavel

Clinton_Gormley · August 19, 2011, 8:56am

Hi James (or anybody else with similar experience)

Given that getting ES to work well under EC2 seems to present a bit of a
challenge, how would you feel about writing a tutorial for
elasticsearch.org?

It would be an invaluable resource.

clint

James_Cook · August 19, 2011, 12:59pm

I think that would be useful as well. I'll try to carve out some time to get
something started.

James_Cook · August 19, 2011, 1:01pm

And Clinton, a cookbook of search recipes would be awesome to see on a web
page.

You have solved many gotchas for people over the past months.

Clinton_Gormley · August 19, 2011, 1:04pm

touché
