Unbalanced cluster - one node running out of space

I have a cluster with 1 dedicated master node, 1 client node, and 5 data nodes.

While running a particularly heavy aggregation, the cluster turned yellow and a few replica shards got stuck in the Initializing state. After the aggregation was run again, a lot of replica shards (~1/6th) became unassigned.

Now I see that one of the data nodes has been assigned the bulk of the primary shards, and it is running out of disk space fast.

All data nodes have 500 GB disks. Four of them have between 100 and 150 GB of disk space free. The 5th one has less than 20 GB.

How can I remedy this? Will moving shards away from this node help?

EDIT: I'm using Elasticsearch version 1.7.1

I would suggest you first check your _routing field.


There is a chance that your ids are not well distributed (by default, the _routing value is the document id), causing routing to crowd certain shards and therefore certain nodes.
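To illustrate what I mean, here is a rough Python sketch of how routing picks a shard: the routing value is hashed and taken modulo the number of primary shards. This is not Elasticsearch's actual code. CRC32 is only a stand-in for its internal hash function, and the names `shard_for` and `distribution`, the 5-shard count, and the sample routing values are all made up for the example.

```python
import zlib
from collections import Counter


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Pick a shard as hash(routing) % number_of_primary_shards.

    Assumption: CRC32 is a stand-in here; Elasticsearch uses its own hash.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def distribution(routing_values, num_primary_shards):
    """Count how many documents land on each shard."""
    counts = Counter(shard_for(r, num_primary_shards) for r in routing_values)
    return [counts.get(shard, 0) for shard in range(num_primary_shards)]


# Hypothetical case 1: every document has its own routing value (like unique ids).
spread = distribution((f"doc-{i}" for i in range(10000)), 5)

# Hypothetical case 2: only two distinct routing values, one of them dominant.
skewed = distribution(
    ("tenant-a" if i % 10 else "tenant-b" for i in range(10000)), 5
)

print("unique routing values:", spread)
print("two routing values:   ", skewed)
```

With only a handful of distinct routing values, all documents sharing a value end up on the same shard, so one or two shards (and the nodes holding them) grow much faster than the rest. With auto-generated ids this kind of skew is unlikely.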

Elasticsearch auto-generates the ids, so I doubt that's the case.

Do you have parent-child relationship in your index?

No. It's a standard Logstash-populated index.

If so, you have probably hit a bug. See if this helps: https://github.com/elastic/elasticsearch/pull/14494

Thanks @Josh_J_Luo, I'll take a look.

The cluster recovered on its own, though. The problematic node went from almost running out of disk space (< 1 GB) to having over 100 GB free.