What steps must be performed to recover a failed node in a cluster?
You should just be able to restart it and it'll rejoin.
Though that may depend on the failure mode.
What if
unassigned_shards: > 0
OR
initializing_shards: > 0
OR, in general, the cluster health is red or yellow?
If it's initialising then just wait for everything to finish.
If they're unassigned, then it depends.
What version are you on?
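As a rough illustration of the rule of thumb above, here is a minimal Python sketch that reads the cluster health and decides whether to wait or dig further. It assumes the node answers on `http://localhost:9200` (a placeholder, not something from this thread); `fetch_health` and `triage` are hypothetical helper names, and only the `_cluster/health` endpoint and its field names come from Elasticsearch itself.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """GET /_cluster/health and return the parsed JSON body.

    base_url is an assumed placeholder; point it at one of your nodes.
    """
    with urlopen(base_url + "/_cluster/health") as resp:
        return json.load(resp)


def triage(health):
    """Apply the advice from the thread to a health response (a dict)."""
    if health.get("initializing_shards", 0) > 0:
        # Shards are still recovering: just wait for them to finish.
        return "wait"
    if health.get("unassigned_shards", 0) > 0:
        # Nothing is recovering but shards are unassigned: "it depends",
        # so this needs a closer look.
        return "investigate"
    return "ok"


# Example with a hand-written response instead of a live cluster.
print(triage({"status": "yellow", "initializing_shards": 12, "unassigned_shards": 40}))
```

Against a live cluster you would call `triage(fetch_health())` instead of passing a hand-written dict.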
We are still on an old version, 1.7.5, though we are planning to upgrade.
This is our current status:
cluster_name: "abc",
status: "red",
timed_out: false,
number_of_nodes: 2,
number_of_data_nodes: 2,
active_primary_shards: 4226,
active_shards: 8452,
relocating_shards: 0,
initializing_shards: 0,
unassigned_shards: 3781,
delayed_unassigned_shards: 0,
number_of_pending_tasks: 0,
number_of_in_flight_fetch: 0
That's faaaaaaar too many shards for two nodes (over 12,000 counting the unassigned ones), and it's likely to be causing your issues.
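To put a number on that, here is a small Python sketch that totals the shard counts from the status posted above. The figures are copied from that status; `shard_summary` is just a hypothetical helper name for this example.

```python
# Values copied from the cluster health output posted in this thread.
health = {
    "number_of_data_nodes": 2,
    "active_primary_shards": 4226,
    "active_shards": 8452,
    "initializing_shards": 0,
    "unassigned_shards": 3781,
}


def shard_summary(health):
    """Return the total shard count and the active shards per data node."""
    total = (
        health["active_shards"]
        + health["initializing_shards"]
        + health["unassigned_shards"]
    )
    per_node = health["active_shards"] / health["number_of_data_nodes"]
    return {"total_shards": total, "active_per_data_node": per_node}


summary = shard_summary(health)
print(summary)  # {'total_shards': 12233, 'active_per_data_node': 4226.0}
```

So each of the two data nodes is already holding 4226 active shards, with another 3781 waiting to be assigned.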
We have 800+ indices. How can this be resolved?
Reindex. You will need to do that to get to 5.X anyway, so it might make sense to start a 5.X cluster, then run a reindex from remote to pull the data from the old cluster - assuming you still want it.
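For reference, a reindex-from-remote request is sent to the `_reindex` endpoint on the new cluster and names the old cluster under `source.remote.host`. Below is a minimal Python sketch of building and sending that request. The host names and index names are made-up placeholders, and `build_reindex_body` and `start_reindex` are hypothetical helpers. It also assumes the old cluster's address has been added to `reindex.remote.whitelist` in the new cluster's `elasticsearch.yml`.

```python
import json
from urllib.request import Request, urlopen


def build_reindex_body(remote_host, source_index, dest_index):
    """Build the JSON body for a reindex-from-remote request."""
    return {
        "source": {
            "remote": {"host": remote_host},  # the old cluster
            "index": source_index,
        },
        "dest": {"index": dest_index},  # index on the new cluster
    }


def start_reindex(new_cluster_url, body):
    """POST the body to /_reindex on the new cluster and return the reply."""
    req = Request(
        new_cluster_url + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


# Placeholder host and index names, for illustration only.
body = build_reindex_body("http://old-cluster.example:9200", "logs-2016", "logs-2016")
print(json.dumps(body, indent=2))
```

Calling `start_reindex("http://new-cluster.example:9200", body)` would kick off the copy. Creating the destination index beforehand with fewer shards is the point at which the shard count can be brought down.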