Shay,
Thanks! I certainly didn't expect it, much appreciated!
Matt
On Apr 13, 6:29 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
Heya, great that you improved that! I just pushed improvements to unicast discovery (and the ec2 discovery) to do the connection attempts in parallel (when needed), so it will be a bit speedier even in the case you mentioned.
On Wednesday, April 13, 2011 at 5:34 PM, Matt Paul wrote:
Shay,
After following your advice and looking through the trace, I found what
the issue was, thanks! Apparently, when the list of ec2 nodes in the
security group is retrieved, it then attempts to connect (in a blocking
fashion, possibly?) to each one in turn. It turns out that the
Elasticsearch nodes I'm using are the first and the last nodes in
a list of about 20 instances, so it took quite a while for them to
find each other. Once I added specific tags to just the Elasticsearch
ec2 instances and changed the ES config to search just for those
tags, it comes up in seconds now. Thanks for all the help!
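[The tag filtering Matt describes would be done via the cloud-aws plugin's EC2 discovery settings. A sketch of what that elasticsearch.yml might look like; the tag name `role` and value `elasticsearch` are placeholders, not from the thread:]

```yaml
# Restrict EC2 discovery to instances carrying a specific tag,
# so discovery no longer probes every instance in the security group.
discovery:
  type: ec2
  ec2:
    # Hypothetical tag key/value -- use whatever tag you attached
    # to the Elasticsearch instances in the AWS console.
    tag:
      role: elasticsearch
```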
Matt
On Apr 12, 5:38 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
Mmm, that should not take this long... Maybe it takes time for the describe instances API call on ec2 (which would be very strange). Which version are you running?
Can you set discovery to TRACE logging in the logging.yml file (similar to how action is set there) and gist the logs for both nodes? We can try and derive the timings from it.
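[The TRACE setting Shay suggests goes in logging.yml, alongside the existing `action` entry. A sketch based on the default logging.yml of that era; only the `discovery: TRACE` line is the addition:]

```yaml
# logging.yml -- bump the discovery module to TRACE
# so connection attempts between nodes show up in the logs.
rootLogger: INFO, console, file

logger:
  # already present in the default config
  action: DEBUG
  # added: trace-level output for the discovery module
  discovery: TRACE
```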
Also, the indices status API should give you a report of the time it took for the primary shards to recover from s3 (and if data was reused from local file system), and the time it took to recover for a replica shard to sync its state with the primary shard.
Last, I just noticed in your config that you do not set the recover_after_nodes setting. Can you just set it to 2 and see? The expected_nodes setting does not affect things without recover_after_nodes being set (and possibly recover_after_time). This will make sure the recovery scenario with the best reuse of local node data is taken.
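[For a two-node cluster, the gateway settings Shay refers to might look like this in elasticsearch.yml (a sketch; the recover_after_time value is illustrative, not from the thread):]

```yaml
# Delay recovery until enough nodes have joined, so shards are
# recovered from local data instead of being pulled from the gateway.
gateway:
  # start recovery only once 2 nodes are in the cluster
  recover_after_nodes: 2
  # recover immediately once this many nodes have joined
  expected_nodes: 2
  # illustrative: fall back to recovering anyway after this timeout
  recover_after_time: 5m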
-shay.banon
On Wednesday, April 13, 2011 at 1:26 AM, Matt Paul wrote:
It takes about 5 1/2 - 6 minutes (pretty consistently) for one of the
nodes to become the master. Once that happens, checking the status on
that node shows yellow. Until the other node joins the cluster, I just
get the "MasterNotDiscovered" error when trying to do any status, etc.
on it.
On Apr 12, 5:12 pm, Shay Banon shay.ba...@elasticsearch.com wrote:
And how quick do you get to yellow status? Basically, a primary shard and its replica might drift in the index files (but not in content), and might require resync.
On Wednesday, April 13, 2011 at 1:11 AM, Matt Paul wrote:
Shay,
I'm restarting it on the same instances. All I am doing is issuing a
_shutdown with curl (curl -XPOST "http://localhost:9200/_shutdown"),
waiting for it all to stop (basically instant), then starting the
elasticsearch script again on the same instance. From that point
until I get a green status has been as long as 20 minutes.