Cluster in Yellow due to replica not assigned

Hi team

I have a .watches replica in UNASSIGNED state, which is why my cluster is showing yellow:

.watches 0 p STARTED
.watches 0 r UNASSIGNED

I restarted the cluster and updated the replica setting, but nothing works. I also tried to delete it, but that does not work either.

Please let me know how to fix this.

Please confirm that the number of replicas is lower than the number of data nodes; otherwise some shards will stay unassigned.
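
As a rough way to check the replica count against the number of data nodes, here is a minimal sketch. It assumes the usual rule that two copies of the same shard never share a node, so one primary plus N replicas needs at least N + 1 data nodes. The function name and the sample numbers are illustrative only and are not part of any Elasticsearch API.

```python
def replicas_can_be_assigned(number_of_replicas: int, data_nodes: int) -> bool:
    """Return True if every copy of a shard can sit on its own data node.

    Assumes a primary and its replicas never share a node, so
    1 primary + N replicas need at least N + 1 data nodes.
    """
    return number_of_replicas + 1 <= data_nodes


# Illustrative numbers: one replica fits on three data nodes,
# while three replicas on three data nodes would leave one copy unassigned.
print(replicas_can_be_assigned(1, 3))
print(replicas_can_be_assigned(3, 3))
```

If this check passes and a replica is still unassigned, the cause is probably something other than the replica count.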

How to find that?

{
"cluster_name": "test",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 4,
"number_of_data_nodes": 3,
"active_primary_shards": 1,
"active_shards": 1,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 1,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 98.7012987012987
}

Use GET _nodes/stats to see how many data nodes you have, and GET _cat/shards to check how many replicas .watches has.

Hi Ethan,

Here is the information:

ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
x.x.201.157 65 57 5 0.12 0.13 0.15 mdi * k01
x.x.72.39 47 89 5 0.08 0.11 0.04 - - b01
x.x.75.23 41 69 39 0.07 0.10 0.09 mdi - b03
x.x.75.24 29 70 59 1.28 0.29 0.09 mdi - b04

.watches

.watches 0 p STARTED 6 83.5kb x.x.75.23 b03
.watches 0 r UNASSIGNED

{
"cluster_name": "test",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 4,
"number_of_data_nodes": 3,
"active_primary_shards": 1,
"active_shards": 1,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 1,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 98.7012987012987
}

GET /_cluster/allocation/explain?pretty to see why the replica shard is unassigned.

{
"index": ".watches",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "REPLICA_ADDED",
"at": "2018-08-24T04:27:24.515Z",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [
{
"node_id": "27PGffnYSWuGtsqrBxS9Uw",
"node_name": "b04",
"transport_address": "x.x.75.24:9300",
"node_attributes": {
"ml.machine_memory": "16725929984",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "filter",
"decision": "NO",
"explanation": """node does not match index setting [index.routing.allocation.include] filters [role:"watcher"]"""
}
]
},
{
"node_id": "jIsi4SI_SQiccMQ698ylUw",
"node_name": "K04",
"transport_address": "x.x.201.157:9300",
"node_attributes": {
"ml.machine_memory": "12491837440",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "filter",
"decision": "NO",
"explanation": """node does not match index setting [index.routing.allocation.include] filters [role:"watcher"]"""
}
]
},
{
"node_id": "vLEamAVtR3CmX-2F0UszZg",
"node_name": "b03",
"transport_address": "x.x.75.23:9300",
"node_attributes": {
"ml.machine_memory": "16725929984",
"ml.max_open_jobs": "20",
"xpack.installed": "true",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [
{
"decider": "filter",
"decision": "NO",
"explanation": """node does not match index setting [index.routing.allocation.include] filters [role:"watcher"]"""
},
{
"decider": "same_shard",
"decision": "NO",
"explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[.watches][0], node[vLEamAVtR3CmX-2F0UszZg], [P], s[STARTED], a[id=hBga4WUvRfKmzroDzTH_rA]]"
}
]
}
]
}
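
For longer responses it can help to pull out only the deciders that said NO for each node. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the same field names as the output above. The helper name and the trimmed sample are hypothetical.

```python
def blocking_deciders(explain: dict) -> dict:
    """Map each node name to the list of deciders that answered NO for it."""
    result = {}
    for node in explain.get("node_allocation_decisions", []):
        blocked = [
            d["decider"]
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        ]
        result[node["node_name"]] = blocked
    return result


# Trimmed sample shaped like an allocation explain response (two nodes only).
sample = {
    "node_allocation_decisions": [
        {
            "node_name": "b04",
            "deciders": [{"decider": "filter", "decision": "NO"}],
        },
        {
            "node_name": "b03",
            "deciders": [
                {"decider": "filter", "decision": "NO"},
                {"decider": "same_shard", "decision": "NO"},
            ],
        },
    ]
}

print(blocking_deciders(sample))
```

A decider that blocks every node, such as filter here, is usually the one to look at first; same_shard on the node that holds the primary is expected.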

The reason is in the explanation: node does not match index setting [index.routing.allocation.include] filters [role:"watcher"]

What should I do, and how?

Run GET .watches/_settings to see whether there are any [index.routing.allocation.include] settings.

{
".watches": {
"settings": {
"index": {
"routing": {
"allocation": {
"include": {
"size": "big,medium",
"role": "watcher"
}
}
},
"number_of_shards": "1",
"auto_expand_replicas": "0-1",
"provided_name": ".watches",
"format": "6",
"creation_date": "1534905070718",
"priority": "800",
"number_of_replicas": "1",
"uuid": "P4L3soP2TcqedB2yEuDtaw",
"version": {
"created": "6030299"
}
}
}
}
}

I updated it with:
PUT .watches/_settings
{
"index.routing.allocation.include.size": "big,medium"
}

It is still: .watches 0 r UNASSIGNED

What is the output of the cat nodeattrs API?

K04 x.x.201.157 x.x.201.157 ml.machine_memory 12491837440
K04 x.x.201.157 x.x.201.157 ml.max_open_jobs 20
K04 x.x.201.157 x.x.201.157 xpack.installed true
K04 x.x.201.157 x.x.201.157 ml.enabled true
b01 x.x.72.39 x.x.72.39 ml.machine_memory 16725929984
b01 x.x.72.39 x.x.72.39 ml.max_open_jobs 20
b01 x.x.72.39 x.x.72.39 xpack.installed true
b01 x.x.72.39 x.x.72.39 ml.enabled true
b03 x.x.75.23 x.x.75.23 ml.machine_memory 16725929984
b03 x.x.75.23 x.x.75.23 ml.max_open_jobs 20
b03 x.x.75.23 x.x.75.23 xpack.installed true
b03 x.x.75.23 x.x.75.23 ml.enabled true
b04 x.x.75.24 x.x.75.24 ml.machine_memory 16725929984
b04 x.x.75.24 x.x.75.24 ml.max_open_jobs 20
b04 x.x.75.24 x.x.75.24 xpack.installed true
b04 x.x.75.24 x.x.75.24 ml.enabled true
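
To see why no node qualifies, the include filters can be compared with the node attributes listed above. This is a minimal sketch, assuming that an include filter is satisfied when at least one of its conditions matches and that a condition matches when the node attribute equals one of the comma-separated values. Wildcards are not handled, and the function name is hypothetical.

```python
def matches_include(node_attrs: dict, include: dict) -> bool:
    """Return True if the node satisfies at least one include condition.

    Assumption: a condition holds when the node's attribute equals one of
    the comma-separated values. Wildcard patterns are not handled here.
    """
    if not include:
        return True  # no include filter means no restriction
    for attr, values in include.items():
        allowed = {v.strip() for v in values.split(",")}
        if node_attrs.get(attr) in allowed:
            return True
    return False


# Include filters as shown in the .watches settings.
include = {"size": "big,medium", "role": "watcher"}

# Attributes shared by the nodes in the nodeattrs output (memory value omitted).
default_attrs = {
    "ml.max_open_jobs": "20",
    "xpack.installed": "true",
    "ml.enabled": "true",
}

print(matches_include(default_attrs, include))        # no role or size attribute
print(matches_include({"role": "watcher"}, include))  # hypothetical tagged node
```

None of the listed nodes carries a role or size attribute, so under these assumptions the filter rejects all of them.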

Delete the routing setting if you don't need it.
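
Deleting the setting here means resetting it, not deleting the index. A minimal sketch follows, assuming that sending null for a setting in the body of PUT .watches/_settings restores its default, as the update index settings API is documented to do, and that nothing else in the cluster relies on these filters. The helper name is hypothetical; it only builds the request body and does not contact the cluster.

```python
import json


def reset_settings_body(*setting_names: str) -> str:
    """Build a JSON body that sets each named index setting to null."""
    return json.dumps({name: None for name in setting_names}, indent=2)


# Body to send with PUT .watches/_settings (assumption: both filters are unwanted).
body = reset_settings_body(
    "index.routing.allocation.include.role",
    "index.routing.allocation.include.size",
)
print(body)
```

Once the filters are cleared, GET /_cluster/allocation/explain should show whether the replica can be allocated.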

I ran it to see whether it fixes the problem.

What's next?

I tried to delete it but got the error below:

{
"error": "This endpoint is not supported for DELETE on .watches index.",
"status": 400
}