Primary and Replica index keep appearing on the same node

I run a 4-node cluster with 3 data nodes; the fourth node is used to load balance Kibana traffic. I keep seeing the primary and replica of an index on the same node and I'm not sure how this is happening. Here is an example from a curl -X GET "10.200.100.100:9200/_cat/shards" command:

filebeat-6.6.2-2019.07.12 2 p STARTED 119997 21.7mb 10.200.100.102 elastic02
filebeat-6.6.2-2019.07.12 2 r STARTED 119997 21.7mb 10.200.100.103 elastic03
filebeat-6.6.2-2019.07.12 0 p STARTED 121126 22mb 10.200.100.101 elastic01
filebeat-6.6.2-2019.07.12 0 r STARTED 121126 21.9mb 10.200.100.102 elastic02
.monitoring-es-7-2019.08.01 0 p STARTED 763926 243.5mb 10.200.100.103 elastic03
.monitoring-es-7-2019.08.01 0 r UNASSIGNED
winlogbeat-6.4.2-2019.07.29 1 p STARTED 187 276.3kb 10.200.100.101 elastic01
winlogbeat-6.4.2-2019.07.29 1 r STARTED 187 252.6kb 10.200.100.103 elastic03
winlogbeat-6.4.2-2019.07.29 2 p STARTED 184 236.5kb 10.200.100.101 elastic01
winlogbeat-6.4.2-2019.07.29 2 r STARTED 184 236.5kb 10.200.100.102 elastic02
winlogbeat-6.4.2-2019.07.29 0 p STARTED 173 248.2kb 10.200.100.101 elastic01

Two questions: how do I fix the data that is already there, and where should I look for the cause?

In the output you shared every primary is on a different node from its replica, which is what we'd expect. Can you clarify what you think is wrong with this output?
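One way to check this mechanically rather than by eye is to group the _cat/shards rows by index and shard number and look for a node that holds more than one copy. The following is a rough sketch, not an official tool: the function name find_colocated_copies is made up for illustration, and it assumes the default _cat/shards column order (index, shard, prirep, state, docs, store, ip, node) seen in the output above.

```python
from collections import defaultdict


def find_colocated_copies(cat_shards_text):
    """Map (index, shard) to the nodes that hold 2+ copies of that shard.

    Hypothetical helper; assumes the default _cat/shards columns:
    index shard prirep state docs store ip node
    """
    nodes_by_shard = defaultdict(list)
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # UNASSIGNED rows stop after the state column, so they have
        # no node to compare and are skipped here.
        if len(fields) < 8:
            continue
        index, shard, node = fields[0], fields[1], fields[7]
        nodes_by_shard[(index, shard)].append(node)

    colocated = {}
    for key, nodes in nodes_by_shard.items():
        dupes = sorted({n for n in nodes if nodes.count(n) > 1})
        if dupes:
            colocated[key] = dupes
    return colocated


# The rows posted earlier in this thread.
SAMPLE = """\
filebeat-6.6.2-2019.07.12 2 p STARTED 119997 21.7mb 10.200.100.102 elastic02
filebeat-6.6.2-2019.07.12 2 r STARTED 119997 21.7mb 10.200.100.103 elastic03
filebeat-6.6.2-2019.07.12 0 p STARTED 121126 22mb 10.200.100.101 elastic01
filebeat-6.6.2-2019.07.12 0 r STARTED 121126 21.9mb 10.200.100.102 elastic02
.monitoring-es-7-2019.08.01 0 p STARTED 763926 243.5mb 10.200.100.103 elastic03
.monitoring-es-7-2019.08.01 0 r UNASSIGNED
winlogbeat-6.4.2-2019.07.29 1 p STARTED 187 276.3kb 10.200.100.101 elastic01
winlogbeat-6.4.2-2019.07.29 1 r STARTED 187 252.6kb 10.200.100.103 elastic03
winlogbeat-6.4.2-2019.07.29 2 p STARTED 184 236.5kb 10.200.100.101 elastic01
winlogbeat-6.4.2-2019.07.29 2 r STARTED 184 236.5kb 10.200.100.102 elastic02
winlogbeat-6.4.2-2019.07.29 0 p STARTED 173 248.2kb 10.200.100.101 elastic01
"""

print(find_colocated_copies(SAMPLE))  # prints {}
```

An empty result means no node in the pasted rows holds two copies of the same shard. To run it against the whole cluster you could save the full _cat/shards output to a file and pass its contents in.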

Yeah, I see your point.

OK so my cluster is in a yellow state with 28 unassigned shards. When I run curl -X GET "10.200.100.100:9200/_cluster/allocation/explain" one of the things I notice is:

{"decider":"same_shard","decision":"NO","explanation":"the shard cannot be allocated to the same node on which a copy of the shard already exists [[winlogbeat-6.7.1-2019.07.29]

If I then run curl -X GET "10.200.100.100:9200/_cat/shards" | grep "winlogbeat-6.7.1-2019.07.29"

I see this:

winlogbeat-6.7.1-2019.07.29 1 p STARTED 59335 65.6mb 10.200.100.102 elastic02
winlogbeat-6.7.1-2019.07.29 1 r UNASSIGNED
winlogbeat-6.7.1-2019.07.29 0 p STARTED 59586 66.5mb 10.200.100.101 elastic01
winlogbeat-6.7.1-2019.07.29 0 r STARTED 59586 66.1mb 10.200.100.102 elastic02

Right, that's saying it can't allocate this replica to that particular node because there's already a copy of that shard there. It's probably more interesting to focus on why it can't allocate a copy of that shard to any of the other nodes instead.
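To look at the other nodes, you can ask the allocation explain API about one specific unassigned replica instead of letting it pick a shard for you, then list which deciders said NO on each node. Here is a minimal sketch using only the Python standard library. The helper names (build_explain_body, fetch_explain, blocking_deciders) are made up for illustration, the host is the one from the curl commands above, and the response field names (node_allocation_decisions, node_name, deciders) are what I'd expect from recent 6.x/7.x versions, so check them against your own output.

```python
import json
import urllib.request

# Host taken from the curl commands earlier in the thread.
ES_URL = "http://10.200.100.100:9200"


def build_explain_body(index, shard, primary):
    """Request body that targets one specific shard copy."""
    return {"index": index, "shard": shard, "primary": primary}


def fetch_explain(body, base_url=ES_URL):
    """POST the body to _cluster/allocation/explain and return parsed JSON."""
    req = urllib.request.Request(
        base_url + "/_cluster/allocation/explain",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def blocking_deciders(explain_response):
    """Map node name to the (decider, explanation) pairs that answered NO.

    Assumes the response has a node_allocation_decisions list.
    """
    blocked = {}
    for node in explain_response.get("node_allocation_decisions", []):
        reasons = [
            (d["decider"], d["explanation"])
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        ]
        if reasons:
            blocked[node.get("node_name")] = reasons
    return blocked


# The unassigned replica from the _cat/shards output above.
body = build_explain_body("winlogbeat-6.7.1-2019.07.29", 1, False)
print(json.dumps(body))
```

Calling blocking_deciders(fetch_explain(body)) against the live cluster should then show, node by node, why that replica was rejected. The same_shard entry for elastic02 is expected; the entries for elastic01 and elastic03 are the ones worth reading.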
