We have an Elasticsearch cluster, which consists of:
- 3 dedicated servers (A+B+C)
- 48 threads, 128 GB RAM, 1.8TB SSD
- 1 instance of master+client node, 10 GB heap size
- 1 instance of data-node, 30 GB heap size
- 1 dedicated server (D)
- 64 threads, 128 GB RAM, 1.8TB SSD
- 1 instance of data-node, 30 GB heap size
About 2 weeks ago we started seeing serious performance problems, and the bottleneck appears to be server D: GC there is slow. So we decided to split that server, and there are currently 2 data-nodes on it, each with a 20 GB heap size (sharing the filesystem, because we don't want to reinstall the server). Unfortunately, Elasticsearch now allocates as much data as possible to these 2 nodes, so this server is presumably "totally on fire".
(d228 is server D)
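For reference, running two co-located node instances against a shared data directory generally needs settings along these lines — a sketch, not our exact config, with setting names taken from the Elasticsearch docs:

```yaml
# elasticsearch.yml on server D (both instances share the same data path)
node.max_local_storage_nodes: 2                    # allow 2 node instances on one data directory
cluster.routing.allocation.same_shard.host: true   # keep a primary and its replica off the same physical host
```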
Is there some way to tell the cluster not to allocate so much data to a single node? Or to simply rebalance shards based on server load?
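The closest thing we have found so far is a per-index cap on shards per node, something like the following (the index name is a placeholder) — but this limits shard count, not actual load, so we are not sure it is the right approach:

```json
PUT /my_index/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}
```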