What is the best practice for stopping and starting a running cluster?
Our cluster:
- 2 master-only nodes, each on its own box
- 52 data-only nodes, spread across 6 boxes with 12, 12, 12, 2, 7, and 7
nodes respectively
Each node runs under supervision (supervisord) so that it is restarted
automatically if it crashes.
Our current restart procedure, in order:
- turn off rivers (no more ingest)
- turn off shard allocation
- stop and restart the nodes on each box using supervisorctl:
  $>supervisorctl stop elasticsearch-1
  $>supervisorctl start elasticsearch-1
- wait for the "initializing_shards" count to reach 0
- turn on shard allocation
- wait for the "unassigned_shards" count to reach 0
- turn on rivers
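For reference, the allocation toggle and the two "wait for ... to reach 0" steps could be scripted roughly as below. This is only a sketch in Python, not what we actually run: ES_URL and all function names are made up for illustration, and the setting name assumes Elasticsearch 1.x (cluster.routing.allocation.enable); I believe 0.90 used cluster.routing.allocation.disable_allocation instead. The counts are read from the /_cluster/health response, and the wait gives up after a timeout rather than blocking forever.

```python
import json
import time
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: HTTP address of any reachable node


def allocation_body(enabled):
    """Build the transient settings body that turns shard allocation on or off.

    Assumes the 1.x setting name; older releases used a different one.
    """
    value = "all" if enabled else "none"
    return {"transient": {"cluster.routing.allocation.enable": value}}


def put_settings(body, base_url=ES_URL):
    """PUT a settings body to /_cluster/settings and return the parsed reply."""
    req = urllib.request.Request(
        base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def get_health(base_url=ES_URL):
    """Fetch /_cluster/health as a dict."""
    with urllib.request.urlopen(base_url + "/_cluster/health", timeout=30) as resp:
        return json.load(resp)


def wait_for_zero(field, fetch=get_health, timeout_s=1800, poll_s=5, sleep=time.sleep):
    """Poll until health[field] == 0. Return True on success, False on timeout."""
    waited = 0
    while True:
        if fetch()[field] == 0:
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
```

Usage would be along the lines of put_settings(allocation_body(False)), restart the nodes with supervisorctl, wait_for_zero("initializing_shards"), put_settings(allocation_body(True)), then wait_for_zero("unassigned_shards"). A False return is the "stuck forever" case and should be investigated by hand.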
We almost always end up with one or a combination of several issues:
- nodes pegged on heap and unresponsive (the cluster can't communicate with
them, and they are not reachable via the API)
- nodes stuck initializing shards forever
- nodes stuck allocating shards forever
- "ghost" nodes: a second copy of a node in the cluster state (NOT a process
actually running) with the same name but a different id. This doesn't
affect ES performance much, but it makes es-head and other tools break due
to the duplicate node/key name.
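To make the "ghost" node symptom concrete, here is a small Python sketch of how the duplicates could be spotted. It assumes the nodes info response (GET /_nodes) has the shape {"nodes": {node_id: {"name": ...}}}; the function name and the sample ids are hypothetical.

```python
from collections import defaultdict


def find_duplicate_node_names(nodes_response):
    """Return {name: [node ids]} for every node name held by more than one id."""
    ids_by_name = defaultdict(list)
    for node_id, info in nodes_response.get("nodes", {}).items():
        ids_by_name[info.get("name")].append(node_id)
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


# Made-up sample: two ids share the name "elasticsearch-1".
sample = {
    "nodes": {
        "aaa111": {"name": "elasticsearch-1"},
        "bbb222": {"name": "elasticsearch-1"},
        "ccc333": {"name": "elasticsearch-2"},
    }
}
duplicates = find_duplicate_node_names(sample)
```

An empty result means every node name maps to exactly one id; anything else lists the names that tools such as es-head would trip over.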
Sometimes, repeatedly closing and reopening an index will get its shards to
allocate and initialize. Sometimes not.