While this is difficult to troubleshoot without more information, the first things that occur to me are:
- Are the overloaded nodes running anything besides Elasticsearch? For example, some users run other software alongside Elasticsearch on the same servers, such as Kibana. This could cause high resource usage on those nodes.
- Do all the nodes have the same hardware configuration?
These may seem obvious, but just in case.
As mentioned, that is a very high number of shards - especially if the indexes are configured with replicas. This is likely causing high memory usage, which in turn can drive up CPU usage, since garbage collection has to run more frequently.
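To confirm whether shard count and heap pressure line up with the overloaded nodes, the `_cat` APIs are a quick way to see where shards live and how each node is doing (both endpoints and parameters below are standard):

```
GET _cat/shards?v&h=index,shard,prirep,node,store
GET _cat/nodes?v&h=name,heap.percent,cpu,load_1m
```

If the hot nodes consistently show high `heap.percent` alongside a disproportionate number of shards, that supports the shard-count explanation.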
It looks like you have about 3GB/index from the numbers you give. I recommend using weekly indices instead of daily - this should give you an index size of ~21GB. Further, after you roll over each index and are finished writing data to it, you should likely shrink each index to one shard.
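As a sketch, shrinking a finished weekly index down to a single primary shard could look like this (the index names are placeholders; the shrink API also requires a copy of every shard to be present on one node, which you can force with an allocation filter):

```
# Block writes so the index can be shrunk
PUT /logs-2024-w01/_settings
{
  "index.blocks.write": true,
  "index.routing.allocation.require._name": "some-node-name"
}

# Shrink into a new single-shard index
POST /logs-2024-w01/_shrink/logs-2024-w01-shrunk
{
  "settings": {
    "index.number_of_shards": 1,
    "index.routing.allocation.require._name": null
  }
}
```

Once the shrunken index is green you can delete the original. If you are on a recent enough version, ILM can automate the rollover and shrink steps for you.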
To answer the other part of your question, Elasticsearch already takes disk space into account (although not CPU or memory usage, as far as I am aware) when allocating shards. You can tune the parameters it uses via the disk-based shard allocation settings, e.g. `cluster.routing.allocation.disk.watermark.low` and `cluster.routing.allocation.disk.watermark.high`.
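For reference, adjusting those watermarks looks something like this (the values shown are the defaults, so only change them if you have a reason to):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}
```

Once a node crosses the low watermark, Elasticsearch stops allocating new shards to it; past the high watermark it actively relocates shards away.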