After more testing, it seems that restarting the service is sufficient; you
just need to be patient and wait. As long as the cluster (the master
specifically?) isn't so busy that the join times out (the timeout is
configurable in elasticsearch.yml), the node eventually joins.
In light of this, I'm going to set the timeout to a value much, much longer
than the default unless someone can describe a downside.
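For anyone who would rather script the "be patient and wait" part than watch ES-head, here is a minimal Python sketch. It assumes a node that is already in the cluster answers HTTP on localhost:9200 and that the `number_of_nodes` field of the cluster health response is a good enough signal; the function names, the URL, and the 10 minute / 5 second defaults are my own placeholders, not anything ES prescribes.

```python
import json
import time
import urllib.request


def fetch_node_count(base_url="http://localhost:9200"):
    """Return number_of_nodes as reported by the cluster health API.

    base_url is an assumption; point it at any node already in the cluster.
    """
    with urllib.request.urlopen(base_url + "/_cluster/health", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))["number_of_nodes"]


def wait_for_nodes(expected, fetch=fetch_node_count, timeout_s=600.0,
                   poll_s=5.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until at least `expected` nodes are reported or timeout_s elapses.

    Returns True if the node count was reached, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            if fetch() >= expected:
                return True
        except (OSError, ValueError, KeyError):
            # Node unreachable or response unusable; keep waiting.
            pass
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Calling `wait_for_nodes(3)` after restarting the service on the third machine would block until the cluster reports three nodes or ten minutes pass, whichever comes first.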
And, I am curious whether the new candidate node needs to connect
specifically to a Master instead of just any node in the cluster... The
docs and descriptions I've read so far only describe contacting the cluster
generally.
I'm also curious (short of packet sniffing) whether, while joining, the
candidate node sends join requests repeatedly and, if so, at what interval
(is it close to a broadcast storm, very pedestrian, or maybe only once?)
Tony
On Tuesday, February 4, 2014 4:22:58 PM UTC-8, Tony Su wrote:
Hi Mark,
I've done all that to no effect.
FYI if it makes a diff,
I'm running on a distro that uses systemd, so in theory when the Service
is started, it's supposed to create a cgroup in which the new process is
run, and if there are any processes that are spawned (including but not
limited to new ES processes), they're all supposed to be managed by that
cgroup. This generally means that, compared to SysV init, when the cgroup is
shut down it reliably shuts down all child processes, so no orphaned
processes continue to run.
So, when I stop the ES service, it really should be shut down.
But when I start it up again, I've waited over 5 minutes on a small but
active cluster accepting new data, and the node never joins.
Yet after rebooting the orphaned node and starting the ES service, it
rarely takes more than about 15 seconds to join (according to ES-head).
Tony
On Tuesday, February 4, 2014 2:10:14 PM UTC-8, Mark Walkom wrote:
If you give the service a restart, it's a stop and then a start
(obviously).
This will/should reread the config and attempt to rejoin the cluster named
in the config.
Can you try an explicit stop, then sleep for 5, then start? It could be
the process isn't properly closing when requested.
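A rough Python sketch of that stop / sleep / start sequence, assuming a systemd host (as described elsewhere in this thread) and a unit called elasticsearch.service; the unit name and the helper name are assumptions, so adjust them for your distro. It also checks that the unit is really inactive before starting it again.

```python
import subprocess
import time


def restart_with_pause(unit="elasticsearch.service", pause_s=5.0,
                       run=subprocess.call, sleep=time.sleep):
    """Stop the unit, wait, confirm it is no longer active, then start it.

    Returns the exit code of the final `systemctl start`.
    The unit name is an assumption; it varies between packages.
    """
    run(["systemctl", "stop", unit])
    sleep(pause_s)
    # `systemctl is-active --quiet` exits 0 only while the unit is active.
    if run(["systemctl", "is-active", "--quiet", unit]) == 0:
        raise RuntimeError(unit + " is still active after stop")
    return run(["systemctl", "start", unit])
```

If the RuntimeError fires, the process is not closing when asked, which is exactly the case this test is meant to catch.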
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 5 February 2014 04:22, Tony Su tony...@gmail.com wrote:
Unless I'm missing something in the docs or these forums,
I've been surprised to find that if a node fails to join the cluster, it's
not sufficient to simply restart ES on the machine. I would have thought
that restarting ES, and thereby re-reading its config files, should be
sufficient to announce its intention to join the cluster.
But I haven't found that to be the case; every time, I've had to reboot
the entire machine to get it to join the cluster.
Is there a config I'm missing?
Thx,
Tony
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/02c4b578-f430-44ba-a98c-7337b684125d%40googlegroups.com
.
For more options, visit https://groups.google.com/groups/opt_out.