Very bad shard allocation

Hi.
We have an Elasticsearch cluster which consists of:

  • 3 dedicated servers (A+B+C)
    • 48 threads, 128 GB RAM, 1.8TB SSD
    • 1 instance of master+client node, 10 GB heap size
    • 1 instance of data-node, 30 GB heap size
  • 1 dedicated server (D)
    • 64 threads, 128 GB RAM, 1.8TB SSD
    • 1 instance of data-node, 30 GB heap size

About two weeks ago some really big performance troubles started. It looks like the bottleneck is server D, and it looks like GC is slow there, so we decided to split that server: there are currently two data nodes on it, each with a 20 GB heap (shared filesystem, because we don't want to reinstall the server). Unfortunately Elasticsearch decided to allocate as much data as possible to these two nodes, so this server is presumably "totally on fire".

[screenshot of shard allocation per node; d228 is server D]

Is there some way to tell the cluster not to allocate so much data to a single node? Or simply to rebalance shards based on server load?
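
From the docs it looks like the closest knobs are a per-node shard cap and the disk-based watermarks, something along these lines (the index name `myindex` and the values are just placeholders):

```
# Cap how many shards of one index a single node may hold
PUT myindex/_settings
{
  "index.routing.allocation.total_shards_per_node": 2
}

# Disk-based allocation: stop allocating to a node above the low
# watermark, start moving shards away above the high watermark
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "80%",
    "cluster.routing.allocation.disk.watermark.high": "90%"
  }
}
```

But neither of these takes CPU load into account, which is why I'm asking.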

Elasticsearch by default assumes all nodes are equal, so it will allocate an equal portion of the data to each node, which will lead to an imbalance in your case. Why did you decide to run two nodes on that host?
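
You can see how that plays out per node with the cat API, e.g.:

```
# shard count, disk used and disk available for every node
GET _cat/allocation?v
```

With two data nodes on server D, that host ends up holding roughly twice the share of a single node on A-C.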

I think that is a very bad assumption. Nodes A-C were bought at the same time and have the same CPU, etc. Node D is the newest one and has a newer CPU (more cores) - it doesn't make sense to buy a server with an old CPU only because Elasticsearch needs "identical" servers.

There were some spikes in heap usage, and we can't increase the heap because of the 30 GB limit. That instance was the only one that crashed (once) because it ran out of memory. We also want to reduce GC times. It also looks like CPU usage was minimal.
The split "helps" in a way... the server is now fully loaded :slight_smile:

I think that if we also split the remaining 3 servers (so we end up with 8 almost-equal data nodes), it could help balance the whole cluster.
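
If we go that way, I assume we'd also want to make sure a primary and its replica never land on the same physical server. If I read the docs right, that is this setting in each instance's elasticsearch.yml:

```
# never allocate two copies of the same shard to nodes that share a host
cluster.routing.allocation.same_shard.host: true
```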

Is it necessary to have different volumes for these data instances, or can they "live" on the same partition without any problem? It looks like this could also be a problem. For example:
node A decides to move a 50 GB shard to node D-1 (because there is enough free space), and node B decides to move a 50 GB shard to node D-2 (which sits on the same volume as D-1). But when both relocations finish, D-1/D-2 no longer have enough free space, so they have to move some shards to other nodes.
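
If separate volumes are the way to go, I assume it is just a separate path.data (and ideally a separate mount) per instance, something like this (paths and node names made up):

```
# instance D-1 - elasticsearch.yml
node.name: d228-es1
path.data: /mnt/ssd1/elasticsearch

# instance D-2 - elasticsearch.yml
node.name: d228-es2
path.data: /mnt/ssd2/elasticsearch
```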
