ElasticSearch EC2 autodiscovery apparently flaky

I have set up an Elasticsearch cluster using cloud-aws, and it appears that
nodes randomly fail to join the cluster. I don't see any error messages,
just the message that nothing was received in ~3 seconds, and the current
node is elected as master. I have the AWS security group set to allow
9200-9400 TCP between all Elasticsearch nodes.

Invariably, if I just restart the Java process on the node, it comes back
immediately.

Finally, in the cases where I have observed this, it has been on a newly
launched instance; is it possible that there is some startup lag on
security groups within AWS?

Is there additional information that I could provide that would be useful?
Has anyone else seen this before?
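
One way to test the security-group lag idea would be to probe the transport port of the other nodes right after boot, before Elasticsearch starts. The following is only a minimal sketch: the peer list is hypothetical and would have to come from your own inventory, and 9300 is assumed because it is the default transport port inside the 9200-9400 range mentioned above.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def unreachable_peers(peers, port=9300, timeout=3.0):
    """Return the peers whose port did not accept a TCP connection."""
    return [host for host in peers if not port_reachable(host, port, timeout)]
```

If `unreachable_peers([...])` lists hosts just after launch but returns an empty list a minute later, that would point towards a network or security-group delay rather than the EC2 API.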

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I saw that recently as well. On startup, the ES node takes a few seconds to get an answer back from the EC2 API. After a restart, the answer was immediate.
I suppose that's the same cause here.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 12 August 2013 at 19:53, David Cross dcross@whosay.com wrote:

I have set up an Elasticsearch cluster using cloud-aws, and it appears that nodes randomly fail to join the cluster. I don't see any error messages, just the message that nothing was received in ~3 seconds, and the current node is elected as master. I have the AWS security group set to allow 9200-9400 TCP between all Elasticsearch nodes.

Invariably, if I just restart the Java process on the node, it comes back immediately.

Finally, in the cases where I have observed this, it has been on a newly launched instance; is it possible that there is some startup lag on security groups within AWS?

Is there additional information that I could provide that would be useful? Has anyone else seen this before?

OK.

In your case, did the nodes ever converge? It's one thing (not great, but
perhaps livable) if the nodes eventually see each other and start talking
after a couple of minutes. But it's totally unacceptable if they are
orphaned forever.

Did you come up with a workaround? (automated?)
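
For what it's worth, an automated workaround could be a small watchdog that polls the cluster health API after startup and reports whether the node ever sees enough peers. This is a rough sketch under assumptions, not something from the plugin: `expected_nodes`, the deadline, and the poll interval are hypothetical values you would pick, and the base URL assumes HTTP on the default port 9200 of the local node.

```python
import json
import time
import urllib.request


def node_count(base_url="http://localhost:9200", timeout=5.0):
    """Return number_of_nodes from /_cluster/health, or None if unavailable."""
    try:
        with urllib.request.urlopen(base_url + "/_cluster/health",
                                    timeout=timeout) as resp:
            return json.load(resp).get("number_of_nodes")
    except (OSError, ValueError):
        # Node not up yet, connection refused, or an unparsable response.
        return None


def wait_for_cluster(fetch_count, expected_nodes, deadline_s=120.0, poll_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until fetch_count() reports at least expected_nodes.

    Returns True once enough nodes are visible, False if the deadline passes.
    """
    stop_at = clock() + deadline_s
    while True:
        count = fetch_count()
        if count is not None and count >= expected_nodes:
            return True
        if clock() >= stop_at:
            return False
        sleep(poll_s)
```

If `wait_for_cluster(node_count, expected_nodes=2)` returns False, the caller would restart the Elasticsearch process by whatever means the host uses, since a restart is what fixes it by hand.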

On Monday, August 12, 2013 3:13:27 PM UTC-4, David Pilato wrote:

I saw that recently as well. On startup, the ES node takes a few seconds to
get an answer back from the EC2 API. After a restart, the answer was immediate.
I suppose that's the same cause here.

I was just running some tests for the AWS discovery plugin. It was not in production.
I just noticed that strange delay, which I had never seen before.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 12 August 2013 at 22:03, David Cross dcross@whosay.com wrote:

OK.

In your case, did the nodes ever converge? It's one thing (not great, but perhaps livable) if the nodes eventually see each other and start talking after a couple of minutes. But it's totally unacceptable if they are orphaned forever.

Did you come up with a workaround? (automated?)
