Cloud AWS plugin intermittent problem ES 1.7 AWS plugin 2.7.1

bvoros · August 1, 2016, 1:08pm

Hello All,

I am using the AWS plugin (ver 2.7.1) with ES 1.7.5 in Amazon for cluster discovery.
All is well most of the time.

Sometimes however it simply does not work and and there is no indication why.

Scenario 1.
New cluster is launched using a script, the instances get the relevant role assigned.
When the cluster stands up, you get 503. Then these instances are terminated, and the cluster is launched again using the same script. Second/Third time the cluster comes up and all is well. No changes made. Sometimes this succeeds the first time.

Scenario 2.
A change was made to elasticsearch.yml on each node of a running cluster. The cluster was launched with the script mentioned above. After having restarted each node one by one while keeping an eye on cluster status.
One node is just unable to discover the rest of the cluster using the AWS plugin. Again, this is a node that was joined into this cluster using the plugin before.

Has anyone come across this before?

Thanks in advance,

Extarct from the log:
[2016-08-01 12:27:52,888][INFO ][node ] [ip-10-20-xxx-xxx] initializing ...
[2016-08-01 12:27:53,055][INFO ][plugins ] [ip-10-20-xxx-xxx] loaded [cloud-aws], sites []
[2016-08-01 12:27:53,107][INFO ][env ] [ip-10-20-xxx-xxx] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [119.1gb], net total_space [125.8gb], types [ext4]
[2016-08-01 12:27:56,716][INFO ][node ] [ip-10-20-xxx-xxx] initialized
[2016-08-01 12:27:56,716][INFO ][node ] [ip-10-20-xxx-xxx] starting ...
[2016-08-01 12:27:56,842][INFO ][transport ] [ip-10-20-xxx-xxx] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.20.xxx.xxx:9300]}
[2016-08-01 12:27:56,899][INFO ][discovery ] [ip-10-20-xxx-xxx] thisEsCluster/HOO-XTFrSduT5uirb-xBGA
[2016-08-01 12:28:26,899][WARN ][discovery ] [ip-10-20-xxx-xxx] waited for 30s and no initial state was set by the discovery
[2016-08-01 12:28:26,904][INFO ][http ] [ip-10-20-xxx-xxx] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.20.xxx.xxx:9200]}
[2016-08-01 12:28:26,904][INFO ][node ] [ip-10-20-xxx-xxx] started
[2016-08-01 12:39:53,738][DEBUG][action.admin.cluster.health] [ip-10-20-xxx-xxx] no known master node, scheduling a retry
[2016-08-01 12:40:23,740][DEBUG][action.admin.cluster.health] [ip-10-20-xxx-xxx] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
[2016-08-01 12:40:23,934][DEBUG][action.admin.indices.get ] [ip-10-20-xxx-xxx] no known master node, scheduling a retry
[2016-08-01 12:40:53,934][DEBUG][action.admin.indices.get ] [ip-10-20-xxx-xxx] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]

bvoros · August 2, 2016, 12:04pm

Found the cause of the problem.
Typo, as always.
The discovery statements in my elasticsearch.yml were missing the ec2 bit.
As a result, discovery was running with default values which worked in AWS EU but not in Oregon.

Incorrect yml:
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
cluster.name: thisEsCluster
node.name: ${HOSTNAME}
cloud.aws.region: eu-west
discovery.type: ec2
discovery.groups: elasticsearchCluster
discovery.ping_timeout: 60s
cloud.node.auto_attributes: true
discovery.zen.minimum_master_nodes: 2

Correct yml:
bootstrap.mlockall: true
discovery.zen.ping.multicast.enabled: false
cluster.name: thisEsCluster
node.name: ${HOSTNAME}
cloud.aws.region: eu-west
discovery.ec2.type: ec2
discovery.ec2.groups: elasticsearchCluster
discovery.ec2.ping_timeout: 60s
cloud.node.auto_attributes: true
discovery.zen.minimum_master_nodes: 2

Topic		Replies	Views
ES and AWS Cloud plugin autodiscovery not working Elasticsearch	3	408	July 6, 2017
AWS nodes don't discover each other Elasticsearch	10	456	July 6, 2017
Cloud-aws plugin not able to join cluster Elasticsearch	13	3770	July 5, 2017
Major cloud-aws plugin issues Elasticsearch	1	363	July 6, 2017
Can't get Nodes to join AWS cluster Elasticsearch	3	579	July 6, 2017

Cloud AWS plugin intermittent problem ES 1.7 AWS plugin 2.7.1

Related topics