I have four data-only nodes in my cluster, and one of them is significantly over-allocated: it holds ~400GB more than the other nodes in a ~3.5TB cluster. I am not using shard allocation filtering or any of the advanced techniques that force certain shards (or indices) onto certain nodes, and Elasticsearch is auto-generating document IDs.
The data is almost entirely log data, indexed by Logstash into daily indices across ~15 different types of logs (one index per log type per day). Each index has 3 primary shards and 1 replica, and index sizes range from ~5GB to ~100GB. I am on ES 1.6 (working toward upgrading to 2.3).
I'm not sure how to diagnose or fix this and would appreciate any help or insight. Thanks!
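As a starting point for diagnosis, one approach is to pull `GET _cat/shards?bytes=b` from the cluster and aggregate store size per node, which shows whether the extra ~400GB comes from a few large shards or from this node simply holding more shards than the others. A minimal sketch of that aggregation (the sample lines, node names, and sizes below are illustrative, not from a real cluster):

```python
from collections import defaultdict

# Illustrative `_cat/shards?bytes=b` output; columns assumed here are:
# index shard prirep state docs store node
sample = """\
logstash-app-2016.06.01 0 p STARTED 1000 5368709120 node1
logstash-app-2016.06.01 1 p STARTED 1000 5368709120 node2
logstash-app-2016.06.01 2 p STARTED 1000 5368709120 node3
logstash-app-2016.06.01 0 r STARTED 1000 5368709120 node4
logstash-app-2016.06.01 1 r STARTED 1000 5368709120 node4
logstash-app-2016.06.01 2 r STARTED 1000 5368709120 node4
"""

totals = defaultdict(int)   # node -> total store bytes
counts = defaultdict(int)   # node -> shard count
for line in sample.splitlines():
    fields = line.split()
    store_bytes, node = int(fields[5]), fields[6]
    totals[node] += store_bytes
    counts[node] += 1

# Print nodes from heaviest to lightest to expose the imbalance.
for node in sorted(totals, key=totals.get, reverse=True):
    print(f"{node}: {counts[node]} shards, {totals[node] / 2**30:.1f} GiB")
```

If the shard counts per node are roughly equal but the byte totals are not, the imbalance is from shard size variance (plausible given indices ranging ~5GB to ~100GB), since ES 1.x balances by shard count rather than by bytes.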