Way to route primary shards to other nodes in case of a data node failure in the cluster

Hi,

I tried manually moving the shards using the cluster reroute API and it works;
I just wanted to know if there is any setting that can help in the scenario
below.
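
For reference, this is roughly the kind of reroute call I used (a sketch only: the
index name "data" comes from the log further down, the shard number and target
node name are placeholders, and with zero replicas "allow_primary" allocates an
empty primary, so whatever was on that shard is lost):

  # force-allocate one unassigned primary onto a surviving node (1.x syntax)
  curl -XPOST 'http://localhost:9200/_cluster/reroute' -d '{
    "commands" : [
      { "allocate" : { "index" : "data", "shard" : 0, "node" : "node3", "allow_primary" : true } }
    ]
  }'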

Here are my steps, simulating a production node-down scenario due to
maintenance upgrades.

Our setup: Elasticsearch 1.0 on Windows; all indices have 4 shards and zero replicas. The node roles are listed below (a sketch of the configs follows the list).

node1 : master + data

node2 : data

node3 : data
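
The relevant node-role settings in elasticsearch.yml look roughly like this (a
sketch of our config, not the full files):

  # node1 (master-eligible + data)
  node.master: true
  node.data: true

  # node2 and node3 (data only)
  node.master: false
  node.data: true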

Steps:

  • start node1 & index some data

  • start node2 & continue indexing data.

---- At this point the primary shards are split evenly across the 2 nodes.

  • stop node2 & continue indexing data.

---- The cluster goes into red state; all node2 shards become unassigned.

---- Indexing fails with UnavailableShardsException for all node2 shards.

---- The following is logged on node1 for all node2 shards:

  [2014-11-18 15:56:46,482][DEBUG][gateway.local            ] [Lament] [data][0]: not allocating, number_of_allocated_shards_found [0], required_number [1]

  • start node3 & continue indexing data.

---- The shards from node1 get rebalanced and node3 also ends up with half of
them.

---- The cluster remains in red state and we continue to get
UnavailableShardsException for all the node2 shards (see the health/cat commands just below).
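
For checking this, the standard health/cat APIs can be used (host and port
assumed to be the defaults):

  # overall cluster status (red/yellow/green)
  curl -XGET 'http://localhost:9200/_cluster/health?pretty'
  # per-shard view; the former node2 shards show up as UNASSIGNED
  curl -XGET 'http://localhost:9200/_cat/shards?v'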

I tried various settings for "index.recovery.initial_shards" but none
helped.
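
For reference, I set it in elasticsearch.yml roughly like this (example value
only; as far as I know the valid values include quorum, quorum-1, full, full-1
or a number):

  # example value only -- how many on-disk copies of a shard must be found
  # before its primary is allocated during recovery
  index.recovery.initial_shards: 1

As far as I can tell this only controls how many existing copies must be found
before the primary gets allocated, which matches the
"number_of_allocated_shards_found [0], required_number [1]" message above; it
cannot create a copy on a node that never had one.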

My question: if a node goes down abruptly, is there a way to automatically
route the unassigned primary shards to other nodes?

Any pointers on how to get the unassigned shards allocated to other nodes
without manually using the cluster reroute API?

Thanks,

Ram


You can't do this based on your setup because no other node has those
shards.
The only way to be able to do it would be to set replicas = 1.
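
On 1.x that can be done on a live index, e.g. (assuming the index is called
"data" as in your logs):

  # number_of_replicas is a dynamic setting, no restart needed
  curl -XPUT 'http://localhost:9200/data/_settings' -d '{
    "index" : { "number_of_replicas" : 1 }
  }'

Once the replicas have been built on the other nodes, losing a node just
promotes a replica to primary instead of leaving the shard unassigned.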

You should also really allow all 3 nodes to be masters as it provides
better redundancy.
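
Roughly, in elasticsearch.yml on all three nodes (a sketch; with 3
master-eligible nodes, minimum_master_nodes should be 2 to avoid split brain):

  node.master: true
  node.data: true
  discovery.zen.minimum_master_nodes: 2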


Thanks Mark.

With replicas this may not become a priority, but IMO automatically rerouting
such primary shards after a certain duration would be good.
The only side effect is that when the same node comes back, its older shard
copies would need to be brought up to date by merging in the latest changes.

Thanks,
Ram
