Nodes disconnect without apparent reason

Hi,

I have a 12 node cluster (3 master, 4 data, 5 client) running on AWS.
I am using the cloud-aws plugin.

Currently I am running 1.5.0, but this has been happening on 1.4.3 as well
(not at all before 1.4.3)

Every once in a while, a node (data node most of the time) appears to get
disconnected ("node-xxx left") then re-connects ("node-xxx joined"), mostly
within 1-3 minutes.
This entails the cluster status being YELLOW for the duration of the disconnect
plus the time it takes to validate all the shards.
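
To put numbers on the "1-3 minutes" figure, the left/joined events can be paired into outage windows. The following is a minimal sketch, assuming the events have already been extracted from the master log into `(timestamp, node_name, kind)` tuples; the function name `outage_windows`, the tuple layout, and the sample data are all made up for illustration and are not taken from the linked gists.

```python
from datetime import datetime, timedelta


def outage_windows(events):
    """Pair each 'left' event with the next 'joined' event for the same node.

    `events` is an iterable of (timestamp, node_name, kind) tuples, where
    kind is "left" or "joined" (a hypothetical, pre-parsed format).
    Returns a list of (node_name, left_at, duration) tuples.
    """
    pending = {}   # node name -> time of its earliest unmatched "left"
    windows = []
    for ts, node, kind in sorted(events):
        if kind == "left":
            pending.setdefault(node, ts)
        elif kind == "joined" and node in pending:
            left_at = pending.pop(node)
            windows.append((node, left_at, ts - left_at))
    return windows


# Made-up sample events, only to show the shape of the input.
events = [
    (datetime(2015, 4, 5, 10, 0, 0), "node-a", "left"),
    (datetime(2015, 4, 5, 10, 2, 30), "node-a", "joined"),
    (datetime(2015, 4, 5, 14, 15, 0), "node-b", "left"),
    (datetime(2015, 4, 5, 14, 16, 10), "node-b", "joined"),
]

for node, left_at, duration in outage_windows(events):
    print(node, left_at.isoformat(), duration.total_seconds())
```

A "joined" event with no preceding "left" is ignored, and a node that leaves and never returns produces no window.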

Has anyone else been experiencing this? Does anyone have an idea how to
resolve this issue?

Thanks
Noni.

Cluster Health: https://gist.github.com/noniperi/76c61a054063c460c1d2
Cluster Logs: https://gist.github.com/7ba2fafe13676a1d027d

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/47b552b6-a298-4cbc-99a8-7ff2560b18a2%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Are you running across regions?

On 5 April 2015 at 23:27, Noni Peri noniperi@gmail.com wrote:

Hi,

I have a 12 node cluster (3 master, 4 data, 5 client) running on AWS.
I am using the cloud-aws plugin.

Currently I am running 1.5.0, but this has been happening on 1.4.3 as well
(not at all before 1.4.3)

Every once in a while, a node (data node most of the time) appears to get
disconnected ("node-xxx left") then re-connects ("node-xxx joined"), mostly
within 1-3 minutes.
This entails the cluster status being YELLOW for the duration of the disconnect
plus the time it takes to validate all the shards.

Has anyone else been experiencing this? Does anyone have an idea how to
resolve this issue?

Thanks
Noni.

Cluster Health: https://gist.github.com/noniperi/76c61a054063c460c1d2
Cluster Logs https://gist.github.com/7ba2fafe13676a1d027d

I am not running across regions, but I am running on 2 availability zones
in the same region.
I have 2 data nodes in each availability zone, and each shard is replicated
exactly twice, once in each zone.
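
One way to double-check that layout is to confirm that every shard really has copies in both availability zones. This is a rough sketch, assuming the shard-to-node assignments and a node-to-zone mapping have been collected by hand or from the cluster; `shards_in_single_zone`, the node names, and the zone names are hypothetical and only illustrate the check.

```python
from collections import defaultdict


def shards_in_single_zone(shard_copies, node_zone):
    """Return the (index, shard) keys whose copies all sit in one zone.

    `shard_copies` is an iterable of (index, shard_number, node_name) tuples,
    one per shard copy (primary or replica). `node_zone` maps node name to
    availability zone. Both inputs are assumed to be gathered elsewhere.
    """
    zones = defaultdict(set)
    for index, shard, node in shard_copies:
        zones[(index, shard)].add(node_zone[node])
    return sorted(key for key, seen in zones.items() if len(seen) < 2)


# Made-up layout: 2 data nodes per zone, as described in the post.
node_zone = {
    "data-1": "zone-a",
    "data-2": "zone-a",
    "data-3": "zone-b",
    "data-4": "zone-b",
}
shard_copies = [
    ("logs", 0, "data-1"),
    ("logs", 0, "data-3"),   # shard 0 spans both zones
    ("logs", 1, "data-1"),
    ("logs", 1, "data-2"),   # shard 1 has both copies in zone-a
]

print(shards_in_single_zone(shard_copies, node_zone))
```

A shard that currently has only one assigned copy (for example while the cluster is YELLOW) is also reported, since it cannot span two zones.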

On Monday, April 6, 2015 at 3:19:25 AM UTC+3, Mark Walkom wrote:

Are you running across regions?

Can you take a look at https://github.com/elastic/elasticsearch/issues/10447
and see if you can replicate it the next time this happens?
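
To catch the next occurrence as it happens, a small watcher can poll the cluster health endpoint and print a timestamp whenever the node count or status changes. This is only a sketch under assumptions: it assumes a node is reachable over HTTP at a base URL such as `http://localhost:9200`, and the helper names (`health_changes`, `fetch_health`, `watch`) and the polling interval are invented for illustration. The fields it compares (`status`, `number_of_nodes`, `number_of_data_nodes`, `unassigned_shards`) are part of the `_cluster/health` response.

```python
import json
import time
import urllib.request

# Fields of the _cluster/health response worth watching for this problem.
WATCHED = ("status", "number_of_nodes", "number_of_data_nodes", "unassigned_shards")


def health_changes(previous, current, keys=WATCHED):
    """Return {field: (old, new)} for every watched field that changed."""
    return {
        key: (previous.get(key), current.get(key))
        for key in keys
        if previous.get(key) != current.get(key)
    }


def fetch_health(base_url):
    """Fetch and decode the cluster health JSON from one node."""
    url = base_url.rstrip("/") + "/_cluster/health"
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def watch(base_url, interval=5.0):
    """Poll forever and print a timestamped line on every change.

    Network errors are not handled here; they propagate to the caller.
    """
    previous = fetch_health(base_url)
    while True:
        time.sleep(interval)
        current = fetch_health(base_url)
        changes = health_changes(previous, current)
        if changes:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), changes)
        previous = current
```

Calling `watch("http://localhost:9200")` (the URL is an assumption) from a machine that can reach a client node gives a timeline that can be lined up against the master logs.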

On 6 April 2015 at 16:09, Noni Peri noniperi@gmail.com wrote:

I am not running across regions, but I am running on 2 availability zones
in the same region.
I have 2 data nodes in each availability zone, and each shard is replicated
exactly twice, once in each zone.
