Elasticsearch unassigned shards

Hi all,

I have an ES cluster made up of 3 nodes, all of them master/ingest/data. The indices are configured with 5 shards and 1 replica, so 5 primary shards and 5 replica shards per index.
After a restart of the cluster I have a lot of unassigned shards; it looks like all of the replica shards are unallocated. The thing is, I don't have space left on the nodes, and as far as I understand the cluster cannot allocate the replicas again because of this.
When the nodes were restarted I didn't have the index.unassigned.node_left.delayed_timeout option set, so the cluster marked the shards as unassigned right away.
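
For reference, I believe the delayed timeout can be set across all indices with an index settings update along these lines (the 5m value is just an example):

   PUT _all/_settings
   {
     "settings": {
       "index.unassigned.node_left.delayed_timeout": "5m"
     }
   }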

At the moment I see that the space used by the indices (disk.indices) is smaller than the disk used (disk.used):

   shards disk.indices disk.used disk.avail disk.total disk.percent host          ip            node
   119      119.1gb   587.1gb     67.3gb    654.5gb           89 10.224.10.85  10.224.10.85  elasticsearch-2
   424      519.8gb   559.7gb     94.7gb    654.5gb           85 10.224.10.84  10.224.10.84  elasticsearch-1
   251      267.9gb   556.8gb     97.6gb    654.5gb           85 10.224.10.167 10.224.10.167 elasticsearch-3
   782                                                                                       UNASSIGNED
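
For context, the output above comes from the cat allocation API; I believe the request was roughly:

   GET _cat/allocation?v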

How can I free up the space so that the cluster is able to assign the replicas again?

Thank you!!

If you are using 1 replica you will need a minimum of 2 data nodes. Are your ingest and/or master nodes also data nodes?

All 3 nodes are master & data & ingest. Thanks for your fast response!
I think I found a similar situation here: Old stale shards in path.data.

The only thing I cannot understand is why the old shards are not being seen by the cluster or deleted, because the space still appears to be in use.

You are over the 85% watermark, which is affecting reallocation. Try freeing up space by deleting data or temporarily reducing the number of replicas for some indices so the cluster can rebalance.
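
If it helps, you can confirm that the disk watermark is what is blocking allocation with the allocation explain API, and dropping replicas for an index is just an index settings update. A rough sketch (logstash-2019.01.01 is only a placeholder index name):

   GET _cluster/allocation/explain

   PUT logstash-2019.01.01/_settings
   {
     "index": {
       "number_of_replicas": 0
     }
   }

Once there is headroom again you can set number_of_replicas back to 1.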

If it gets rebalanced, will it clean up by itself the space occupied by the old shards?
What worries me is the fact that for elasticsearch-2 the disk.used is much bigger than the disk.indices:

   shards disk.indices disk.used disk.avail disk.total disk.percent host          ip            node
   119      119.1gb   587.1gb     67.3gb    654.5gb           89 10.224.10.85  10.224.10.85  elasticsearch-2

I believe so.

Hi,
I have set the low watermark to 90%, deleted some old data and reduced the number of replicas for some indices. The health of the cluster is green now, but I guess I still have duplicate data kept locally. How can I delete unused replica copies from the nodes?
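
For the record, I think the watermark change I applied was a cluster settings update along these lines:

   PUT _cluster/settings
   {
     "transient": {
       "cluster.routing.allocation.disk.watermark.low": "90%"
     }
   }
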
Thanks!

What do you mean by 'unused' replica shards? Elasticsearch should remove them when you reduce the replica count.

Also, why 5 primary shards? Just for sizing, to stay under 50GB per shard? You only have 3 nodes, so I'd think 1-2 shards would be optimal (1 for simplicity unless you have performance issues). Usually you add shards to spread load among nodes or to cap shard size; otherwise having shards + replicas > nodes is not very useful.
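
For new indices that kind of change usually goes in an index template. A minimal sketch, assuming a logstash-* naming pattern and the legacy template API (ES 6.x syntax):

   PUT _template/logstash
   {
     "index_patterns": ["logstash-*"],
     "settings": {
       "number_of_shards": 1,
       "number_of_replicas": 1
     }
   }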

Hi Steve,

Before the restart of the nodes I was able to keep almost 90 days of logs and I was at the limit of the low watermark, which was the default 85%. Now I have almost 80 days of logs, the low watermark is at 90%, and again I am at the limit of disk usage.

I am thinking of the moment when a node was restarted and the cluster started to allocate new replicas on the other nodes. When the node came back, I guess it still had its replica copies on it (so I end up with the primaries, the old replicas, and the new replicas from the reallocation all stored locally), so my total disk usage has grown.
This is the part I cannot understand: when a node fails, the primary shards which were on that node are lost, the existing replicas get promoted to primary, and then the cluster allocates new replicas for those shards on the nodes that are still online. But what about the old replica copies left on the offline node? Should they be deleted when the node comes back?

I guess the configuration is the default one (5 shards and 1 replica) and it wasn't tuned at the time. I will take your suggestions into consideration when I start adding more disks and tuning the cluster.

Thank you!

Yes, Elasticsearch deletes any unused copies of shards when the shard reaches green health.

Having said that, you say your cluster is entirely at green health but there's still a lot of space that's unaccounted for? That is puzzling. Can you look at the files on disk to determine what's taking up the space that shouldn't be there any more? Each shard is stored in $DATA_PATH/nodes/0/indices/$INDEX_UUID/$SHARD_NUMBER.
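
To map those $INDEX_UUID directory names back to index names and on-disk sizes, I think the cat indices API can help, something like:

   GET _cat/indices?v&h=index,uuid,pri,rep,store.size,pri.store.size

Comparing that with the directory sizes under the data path should show whether there are directories for indices the cluster no longer knows about.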

I assume you mean when the 'index' reaches green (not the shard)? So it'll leave all shards, good or not, in place until the index gets to green, and then the master will tell the nodes to delete anything left over or stale?

No, I meant the shard.

Ah, yes, I forgot about that being part of the shard stores; it seems you can query on it, but it's not shown or returned anywhere else (GET /_shard_stores?status=green).

Sorry for assuming it was a mistake, as I hadn't seen it referenced before. So it's good that the cluster clears shard pieces and stale copies promptly as each shard becomes healthy, without waiting for the whole index.
