I would like to ask a question regarding shard allocation during a rolling restart.
I am trying to figure out why replica shards end up unassigned after following the steps for a rolling restart. I ran the test below.
Environment
- 1 cluster with 2 master-eligible nodes.
- Elasticsearch v6.1.1
Steps
1. Create index
PUT test-indices
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}
2. Check which node each shard is located on. The primary shards are spread across both nodes.
GET _cat/shards/test-indices?v
index        shard prirep state   docs store ip           node
test-indices 1     r      STARTED    0  233b 172.31.27.43 node2
test-indices 1     p      STARTED    0  233b 172.31.27.43 node1
test-indices 0     p      STARTED    0  233b 172.31.27.43 node2
test-indices 0     r      STARTED    0  233b 172.31.27.43 node1
3. Disable shard allocation and do a synced flush.
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "none"
  }
}
POST _flush/synced
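Before shutting down the node, the setting can be double-checked with a request like the one below. This is only a sketch of a verification step: it relies on the standard cluster settings endpoint, and the response should show `cluster.routing.allocation.enable` under `persistent` if the change was applied.

```
GET _cluster/settings
```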
4. Shut down node2
5. Start node2
6. Check shards before re-enabling shard allocation. All replicas are UNASSIGNED, even the replica for shard 0.
GET _cat/shards/test-indices?v
index        shard prirep state      docs store ip           node
test-indices 1     p      STARTED       0  264b 172.31.27.43 node1
test-indices 1     r      UNASSIGNED
test-indices 0     p      STARTED       0  264b 172.31.27.43 node1
test-indices 0     r      UNASSIGNED
7. Also check the cluster state API. The replica for shard 0 is UNASSIGNED there as well.
GET _cluster/state/routing_table/test-indices
{
  "cluster_name": "mycluster",
  "compressed_size_in_bytes": 15865,
  "routing_table": {
    "indices": {
      "test-indices": {
        "shards": {
          "0": [
            {
              "state": "STARTED",
              "primary": true,
              "node": "g-_vH5rwT-inWJuhHyjHcA",
              "relocating_node": null,
              "shard": 0,
              "index": "test-indices",
              "allocation_id": {
                "id": "QR-dWeL6ToGYShiHIROEDg"
              }
            },
            {
              "state": "UNASSIGNED",
              "primary": false,
              "node": null,
              "relocating_node": null,
              "shard": 0,
              "index": "test-indices",
              "recovery_source": {
                "type": "PEER"
              },
              "unassigned_info": {
                "reason": "NODE_LEFT",
                "at": "2018-02-01T10:40:13.350Z",
                "delayed": false,
                "details": "node_left[nB9E_CoCQmuC-Gjx4gvucA]",
                "allocation_status": "no_attempt"
              }
            }
          ],
          "1": [
            {
              "state": "STARTED",
              "primary": true,
              "node": "g-_vH5rwT-inWJuhHyjHcA",
              "relocating_node": null,
              "shard": 1,
              "index": "test-indices",
              "allocation_id": {
                "id": "5ZWVGFTgS0Gsg1R2Sf8iQQ"
              }
            },
            {
              "state": "UNASSIGNED",
              "primary": false,
              "node": null,
              "relocating_node": null,
              "shard": 1,
              "index": "test-indices",
              "recovery_source": {
                "type": "PEER"
              },
              "unassigned_info": {
                "reason": "NODE_LEFT",
                "at": "2018-02-01T10:40:13.350Z",
                "delayed": false,
                "details": "node_left[nB9E_CoCQmuC-Gjx4gvucA]",
                "allocation_status": "no_attempt"
              }
            }
          ]
        }
      }
    }
  }
}
It makes sense for the replica of shard 1 to be UNASSIGNED because it was promoted to primary; however, why is the replica of shard 0 UNASSIGNED if it still exists on disk?
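A request like the following could be used to ask the cluster directly why that replica is not being assigned. This is a sketch, assuming the cluster allocation explain API on 6.1.1 accepts `index`, `shard`, and `primary` in the request body; the values are taken from the test above.

```
GET _cluster/allocation/explain
{
  "index": "test-indices",
  "shard": 0,
  "primary": false
}
```

The response should include a per-node breakdown of the allocation decisions for this shard copy, which may show whether the existing on-disk copy on node2 is being considered at all while allocation is disabled.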