Cluster: 3 coordinator nodes, 3 master nodes, 14 data nodes
Each index: 6 primary shards, 1 replica
All primary shards of some indices keep getting assigned to a single one of the 14 data nodes. For example, data node 12 ends up holding all 6 primary shards of index x, all 6 of index y, and so on, while the other 13 data nodes look normal.
This results in high JVM heap usage on data node 12 and also unassigned shards (reason:
Why would this happen only to data node 12?
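For context, this is roughly how I check the shard placement and the unassigned-shard reason (standard cat/explain APIs, run from Kibana Dev Tools):

```
GET _cat/shards?v&h=index,shard,prirep,state,node
GET _cluster/allocation/explain
```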
What I have tried:
- Find each index with all of its primary shards on data node 12 and set `index.routing.allocation.total_shards_per_node` to 1 or 2 on it
  a. This helps, but all primary shards of some newly created indices still get allocated to data node 12
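For reference, the per-index setting I applied looks like this (index name is a placeholder):

```
PUT /index-x/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
```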
What I have not tried:
- `cluster.routing.allocation.total_shards_per_node` in an index template
  a. Not sure how this would interact with setting `index.routing.allocation.require._name` (I cannot use ILM for some reason, so I wrote the shrink process myself, and I need all primary shards of an index on the same node in order to shrink. See here)
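A sketch of what that template might look like (template name and index pattern are placeholders; note that index templates take index-level settings, so I believe it would have to be `index.routing.allocation.total_shards_per_node` rather than the `cluster.*` setting):

```
PUT /_index_template/my-template
{
  "index_patterns": ["my-indices-*"],
  "template": {
    "settings": {
      "index.routing.allocation.total_shards_per_node": 2
    }
  }
}
```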
EDIT: a coworker found this, and it looks like it is the problem.