Our Elasticsearch cluster has two data directories. We recently restarted all the nodes in the cluster. After the restart completed successfully, we observed increased disk space usage on a few nodes. When we examined the folders inside the data directories, we found orphaned shards. For example, an orphaned copy of shard "15" exists at data_dir0/cluster_name/nodes/0/indices/index_name/15, while a replica of the same shard "15" exists on the same node inside the other data directory, at data_dir1/cluster_name/nodes/0/indices/index_name/15. The copy in data_dir1 is the one included in the cluster metadata, so we assume the copy in data_dir0 is an orphaned shard that should be deleted by Elasticsearch. But Elasticsearch has not deleted it yet, even 6 days after the last restart.
We found the topic "Old shards on re-joining nodes useful?" which relates to our issue, but it did not help us: in our case, ES did not clean up the orphaned shard.
Any help with deleting the orphaned shards and recovering the disk space would be highly appreciated.
Is the cluster green? Shard data is only deleted if there are enough shard copies in the cluster (i.e. the shard is fully allocated with no unassigned copies).
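A quick way to check, assuming the default HTTP port on localhost (adjust host/port for your setup):

# Check overall cluster health; "status" should be "green" and
# "unassigned_shards" should be 0 before stale copies are expected to be cleaned up.
curl -s 'localhost:9200/_cluster/health?pretty'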
Sorry, I misread your first post. The issue is that cleanup of shard data on one data path does not happen if the shard is allocated on the same node on another data path. The question is why the node decided to allocate the shard on a different data path (it will normally reuse the same path if there is shard data already there). Is it possible that the shard on the previous data path was never fully allocated? (i.e. initializing but never started). What ES version is this?
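To see where the cluster currently thinks shard 15 is allocated, something like this should work (again assuming the default port; index_name is the placeholder from above):

# List all copies of the index's shards and keep the header plus the rows for shard 15.
curl -s 'localhost:9200/_cat/shards/index_name?v' | awk 'NR==1 || $2 == 15'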
Can you provide the folder contents of both directories?
tree data_dir0/cluster_name/nodes/0/indices/index_name/15 and tree data_dir1/cluster_name/nodes/0/indices/index_name/15.
In particular, I'm looking for a state-* file in the orphaned shard folder.
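For comparison, a fully allocated shard folder on 2.x typically looks something like this (state-12.st is just an example name; the number varies):

data_dir1/cluster_name/nodes/0/indices/index_name/15
|-- _state
|   `-- state-12.st    # the state-* allocation file I'm asking about
|-- index              # Lucene segment files
`-- translog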
We don't have logs or other evidence about the earlier shard distribution, so we cannot say whether the old shard was ever fully allocated. Our ES version is 2.3.3. The folder contents on both disks are:
The missing directory /storage/disk0/elasticsearch/cluster_name/nodes/0/indices/index_name/_state indicates that the shard failed to be fully allocated to the node in an earlier recovery attempt. When allocating a shard, ES uses the directory which already contains shard data; that directory is identified by the _state file (missing here because of the unfinished recovery attempt). In your case, the node picked the other data directory to allocate the shard because it could not see the existing shard data. The stale shard data was also not cleaned up, as ES only deletes shard directories if the shard is not allocated to the current node.
The first issue will not occur on ES v5.x as the recovery process has been changed in that regard. I think there is no easy fix here except to manually delete the directories.
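For what it's worth, a minimal cleanup sketch, assuming the copy under data_dir1 is the live one (it has the _state metadata), the cluster is green, and the node is stopped while its data directories are modified; double-check every path before running anything like this:

# 1. Confirm the cluster is green before touching anything.
curl -s 'localhost:9200/_cluster/health?pretty'
# 2. Stop Elasticsearch on the affected node (the service name depends on your install).
sudo service elasticsearch stop
# 3. Remove only the orphaned copy on data_dir0.
rm -rf data_dir0/cluster_name/nodes/0/indices/index_name/15
# 4. Start the node again and watch it rejoin the cluster.
sudo service elasticsearch start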
Thanks for the help @ywelsch. We do not want to delete the Elasticsearch folders manually, as we are dealing with this problem in production. Any official reference for the manual-delete suggestion would be very helpful. Otherwise, we are going to replace the affected nodes one by one, even though it will incur huge data transfer costs.
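If we do go the node-replacement route, the plan is to drain each node with allocation filtering before stopping it, along these lines (node_to_replace is a placeholder for the node's name):

# Tell the cluster to move all shards off the node before it is shut down.
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.exclude._name": "node_to_replace"
  }
}'
# Once _cat/shards shows nothing left on that node, it should be safe to stop it.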