I just noticed that we have a massive shard imbalance in our 7.1.0 cluster. I'm assuming it happened due to many partial/failed recoveries or similar, and it seems a lot of data has just been left hanging around in path.data.
We have been flirting with the low and sometimes high watermarks for a while, sometimes letting ES resolve it by rebalancing, but sometimes needing to manually intervene by growing one or more of the disks or adding a node.
```
shards disk.indices disk.used disk.avail disk.total disk.percent node
    54        1.7tb     1.8tb    105.6gb      1.9tb           94 es-data4
    47        1.4tb     1.8tb     99.5gb      1.9tb           94 es-data1
    60        1.6tb     1.8tb    113.3gb      1.9tb           94 es-data7
    15        409gb     1.7tb    131.6gb      1.9tb           93 es-data6
    49        1.6tb     1.8tb    116.7gb      1.9tb           94 es-data5
     1         51mb     1.7tb    137.7gb      1.9tb           93 es-data3
    60        1.6tb     1.8tb    111.8gb      1.9tb           94 es-data8
    52        1.6tb     1.8tb    109.4gb      1.9tb           94 es-data2
   113                                                           UNASSIGNED
```
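(For reference, this table is cat allocation output; it can be reproduced with something like the following, assuming the cluster is reachable on localhost:9200.)

```shell
# Per-node shard count and disk usage, sorted by node name:
curl -s 'localhost:9200/_cat/allocation?v&s=node'
```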
The 113 shards (all replicas) are unassigned because all the nodes are above the low watermark, despite two nodes having a huge difference between disk.indices and disk.used.
Our path.data lives on its own volume with nothing else stored there. I have confirmed that the space is being taken by what appear to be shard directories.
My questions are:
- What is the safest way to remedy the current situation with es-data3 and es-data6 (and therefore our yellow, dangerously full cluster state)? Should I manually reroute these 1+15 active shards, stop the nodes, clear the data dirs, then start the nodes back up with empty disks? Could I possibly save a lot of data transfer by NOT clearing out the disks, since they likely have a lot of up-to-date segments within the shards?
- Am I correct in saying that all of the nodes in our cluster are experiencing this problem of orphaned shards in path.data (to a much lesser degree), given that disk.indices and disk.used differ on every node? Is the cluster completely unaware of these shards? If so, is there a relatively easy way to locate which shard directories are orphaned, even on these nodes, and clear them out?
- Is this a deadlock situation in which the replicas are out of date and not syncing because the low watermark is exceeded?
- If this is in fact due to failed shard replication attempts, shouldn't Elasticsearch be cleaning these up?
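On the first question, a gentler alternative to manually rerouting the active shards off es-data3 and es-data6 is to exclude those nodes from allocation and let the cluster drain them itself; once they hold no shards they can be stopped and their data dirs cleared. A sketch, assuming the cluster is reachable on localhost:9200:

```shell
# Tell the allocator to move all shards off es-data3 and es-data6.
curl -s -X PUT 'localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "es-data3,es-data6"
  }
}'

# After the nodes have been emptied, cleaned, and restarted, clear the exclusion:
curl -s -X PUT 'localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' -d '
{
  "transient": {
    "cluster.routing.allocation.exclude._name": null
  }
}'
```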
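On the second question, one rough way to locate orphan candidates is to diff the shard directories physically on disk against the shards the cluster says a node holds. A sketch, assuming the default ES 7.x on-disk layout (`path.data/nodes/0/indices/<index-uuid>/<shard>`) and a path.data of /var/lib/elasticsearch; the file names are placeholders:

```shell
# 1. On the node, collect what is physically there, as "<index-uuid>/<shard>":
#      find /var/lib/elasticsearch/nodes/0/indices -mindepth 2 -maxdepth 2 \
#        -type d -printf '%P\n' > on_disk.txt
#
# 2. From the cluster, collect the "<index-uuid>/<shard>" pairs allocated to
#    that node, e.g. by joining the output of
#      curl -s 'localhost:9200/_cat/shards?h=index,shard,node'
#    with
#      curl -s 'localhost:9200/_cat/indices?h=index,uuid'
#    and writing the result to allocated.txt.

# 3. Anything on disk that the cluster does not claim is an orphan candidate:
find_orphans() {
  # lines present only in the on-disk list ($1), not in the allocated list ($2)
  comm -23 <(sort -u "$1") <(sort -u "$2")
}

# Usage: find_orphans on_disk.txt allocated.txt
```

Candidates found this way are worth double-checking against the cluster state before deleting anything, since a recovery in flight can also leave a directory that looks unclaimed.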
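On why the 113 replicas stay unassigned, the cluster allocation explain API will state the blocking decider explicitly (including whether it is the disk watermark). A sketch, assuming localhost:9200; the index name is a placeholder to be replaced with one of the affected indices:

```shell
# Ask the cluster why a given unassigned replica cannot be allocated:
curl -s 'localhost:9200/_cluster/allocation/explain' \
  -H 'Content-Type: application/json' -d '
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}'
```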