Data node being overallocated

I have 4 data-only nodes in my cluster and one of them is being significantly over-allocated.

This node is storing roughly 400GB more than the others, out of about 3.5TB in the cluster. I am not using shard allocation filtering or any of the advanced techniques that force certain shards (or indexes) onto certain nodes, and Elasticsearch is auto-generating document IDs.

The data is almost entirely log data, indexed by Logstash into daily indexes across ~15 different types of log data (1 index per type of log per day), with 3 shards and 1 replica per index. Index sizes range from ~5GB to ~100GB each. I am on ES 1.6 (working towards upgrading to 2.3).

I'm not sure how to diagnose and/or fix and would appreciate any help or insight. Thanks!

If you need to move a shard from one data node to another, you can take a look at the page below:

https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html
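
For example, moving shard 0 of an index off one node and onto another looks roughly like this (everything in brackets is a placeholder):

curl -XPOST 'http://[es node]:[es port]/_cluster/reroute' -d '{
  "commands": [
    { "move": { "index": "[index name]", "shard": 0, "from_node": "[from node]", "to_node": "[to node]" } }
  ]
}'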

Thanks @thn!

I have seen this page and will definitely use it to clean things up manually, but I'm more concerned with why the over-allocation is happening and how to correct it, so that I don't need to manually manage shard allocation on a regular basis (which shouldn't be something one needs to do).

If I had to guess: did you start out with 3 data nodes, not 4? Assuming the machines have the same hardware spec, check the configuration for anything abnormal that might be causing this.

I did start with 3 data nodes, then increased to 4.

All machines have identical specs and share the same configuration. Perhaps interestingly, it is the 3rd of 4 (and one of the original 3) that is being over-allocated.

What do you see when you run this from a browser?

http://[es node]:[es port]/_cat/shards/[index name]?v

I had to scrub the output a bit, but the number of docs and store size are all very close to each other on all nodes for the index I sampled.

index shard prirep state docs store ip node
index-2016.04.20 0 p STARTED # size 127.0.0.1 data3
index-2016.04.20 0 r STARTED # size 127.0.0.1 data4
index-2016.04.20 1 p STARTED # size 127.0.0.1 data2
index-2016.04.20 1 r STARTED # size 127.0.0.1 data1
index-2016.04.20 2 r STARTED # size 127.0.0.1 data3
index-2016.04.20 2 p STARTED # size 127.0.0.1 data1

Because each index has 6 shard copies (3 primaries + 3 replicas) spread across 4 nodes, they can never be split perfectly evenly: in the sample above, two nodes hold 2 copies of the index and two hold only 1. Which nodes get the extra copy seems reasonably well distributed across indexes, though.

Is there a way to tell which shards only exist on one node?

It's okay.. the "shard" column tells you the shard number, and "prirep" tells you whether it's a primary or a replica. If you go to the path.data location, you'll see the index name and then the shard number there too.
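
For example (placeholders again; on 1.x the shard directories sit under the cluster name and node ordinal):

ls [path.data]/[cluster name]/nodes/0/indices/[index name]/

That should list one numbered directory per shard copy of that index held on the node.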

So are you concerned about data1 and data3 holding more shards than data2 and data4? With 3 shards and 1 replica in a 4-data-node setup, it's kind of hard for ES to distribute them "evenly" as you said. If you don't want a shard on a specific node, I think you have to move it yourself, unless you re-index the data into a new index with 4 shards and 1 replica.

Sorry, I think I may not have communicated the issue fully.

The issue is that data3 is using ~400GB more data than data1, data2, and data4 (which are all relatively close to equal usage).

We have approximately 15 indexes per day, and a spot check across random indexes shows that their distribution to data nodes seems fine. I'm going to try to analyze this in more detail to see if there is a pattern of indexes that exist only on data3 or that more heavily favor data3.

Thanks for your help

As a follow-up, I've done some analysis by parsing all shards from all indices via the _cat APIs and made a list of the indices that don't have at least 1 shard on every data node, as well as the indices whose shards are all on a single node. There were none in either case.

The overall shard counts are also within 15 of each other on all 4 nodes, yet there are still several hundred more gigabytes of data on "data3" than on data1, data2, or data4.
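
(For what it's worth, a quick way to double-check the per-node totals is to sum the store column from _cat/shards; a rough one-liner like this works as long as node names are single tokens like data1..data4:)

curl -s 'http://[es node]:[es port]/_cat/shards?h=node,store&bytes=b' | awk '{ totals[$1] += $2 } END { for (n in totals) print n, totals[n] }'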

Any other thoughts or insights would be greatly appreciated.

You can check the number of segments per shard and/or try to merge them.
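
For example, something like this (the index name is just a placeholder; on 1.6 the merge endpoint is _optimize, which was renamed to _forcemerge in 2.x):

http://[es node]:[es port]/_cat/segments/[index name]?v

curl -XPOST 'http://[es node]:[es port]/[index name]/_optimize?max_num_segments=1'

Comparing segment counts for the same shard on different nodes may show where the extra space is going.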