I'm seeing around 1380 unassigned shards after a recent restart of one node in the cluster. The cluster has 20 nodes, roughly 49122 shards, and around 17TB of data, and we're currently running version 2.4.0. While that node was offline I used the delete API to delete some indexes. When I then restarted the node, I observed the following logs....
This log line repeats for every index that was previously deleted. I'm seeing it for indexes we deleted weeks ago as well as for the ones I deleted earlier today while the node was down. What steps can I take to completely remove references to these deleted shards, since we no longer need them? Any advice here is greatly appreciated!
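For reference, this is roughly how I'm looking at the unassigned shards (the host is a placeholder for one of our nodes):

# Count the unassigned shards
curl -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | wc -l
# Group them by index to see which of the deleted indexes they belong to
curl -s 'http://localhost:9200/_cat/shards' | grep UNASSIGNED | awk '{print $1}' | sort | uniq -c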
Thank you for your reply. One follow-up question: are there known differences (or consequences) between using Curator and the delete API to completely remove indices? The dangling indices referenced above were deleted using only the curl -XDELETE command. Is it safer or better to use the Curator commands (remove, close, delete) to clean up old, unwanted indices? For instance, I noticed the remove command will remove the index from an alias, whereas the delete API doesn't do this, correct? Also, can a dangling index result from using the DELETE API itself, or is it more related to a node being offline, or maybe both? Any thoughts would be appreciated.
One more follow-up on this, as I have a little more information. I noticed that two nodes still have references to the deleted indexes in their data dirs:
es/data-es3/sv/nodes/1/indices/xyz-2016-08-26/ --> was deleted using the DELETE API
es/data-es3/sv/nodes/1/indices/abc-2016-05-12/ --> was deleted using Curator (remove, close, delete)
Overall, I'm seeing traces of deleted indexes in the data dirs regardless of the deletion method (Curator vs. the delete API). That being said, I want to avoid dangling indexes and unassigned shards in the future if/when I need to reboot these nodes. We no longer care about these indexes at all. Is it safe to delete the corresponding data directories to remove all traces of them? Can I do this while the cluster is running? Any advice is greatly appreciated!
Curator calls the delete index API to delete indices, so you should end up with the same result regardless of which one you use.
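For example (the host is a placeholder, and the Curator invocation assumes the 3.x CLI, so the exact flags may differ in your version), both of these end up issuing the same delete-index request:

# Calling the delete index API directly
curl -XDELETE 'http://localhost:9200/abc-2016-05-12/'
# Curator 3.x style: matches indices by prefix/age and then sends the same DELETE for each match
curator --host localhost delete indices --prefix 'abc-' --older-than 30 --time-unit days --timestring '%Y-%m-%d'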
The issue happens in the following scenario, which is possible on a pre-5.x cluster:
the index foo is created and allocates its shards on some nodes, including nodeA
nodeA is shut down
the index foo is deleted from the cluster, and all traces of it are removed from the file systems of the nodes that are in the cluster and from the cluster state. nodeA still has these files because it wasn't part of the cluster when the index was deleted.
nodeA rejoins the cluster, and the cluster finds some shards that belong to the now-unknown index foo. What do we do? We could simply ignore the finding and delete these files... but, just in case, we try to import them as a dangling index, reasoning that the annoyance of having to delete this index twice is better than potentially losing data.
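If it helps, here is roughly how you could reproduce that sequence on a throwaway 2.x cluster (the index name, host, and service commands are placeholders for your environment):

# 1. Create an index whose shards end up on nodeA
curl -XPUT 'http://localhost:9200/foo/'
# 2. Stop Elasticsearch on nodeA (use whatever service manager your nodes run)
sudo service elasticsearch stop
# 3. Delete the index while nodeA is out of the cluster; every node still in the cluster cleans up its copy
curl -XDELETE 'http://localhost:9200/foo/'
# 4. Start nodeA again; it rejoins with foo's files still on disk and imports them as a dangling index
sudo service elasticsearch start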
So, I see 3 possible ways to solve this issue:
avoid the scenario described above by not deleting indices while nodes are restarting (probably not very practical)
clean the file system on the nodes that were shut down before allowing them to rejoin the cluster, by removing the directories corresponding to the indices that have been deleted (a bit more practical, but it requires some scripting and might lead to data loss if not implemented correctly; see the sketch after this list)
upgrade the cluster to 5.x, where this problem was solved by keeping track of deleted indices in the cluster state for some period of time after they are deleted.
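To make the second option more concrete, here is a minimal sketch, assuming the data path layout from your earlier post (the paths, host, and node directory number are placeholders you would need to adjust). It runs while the node is still stopped and lists the index directories on its disk that the cluster no longer knows about:

DATA_DIR=/es/data-es3/sv/nodes/1/indices        # indices directory on the stopped node
ES_HOST=http://localhost:9200                   # any node that is currently in the cluster
# Indices the running cluster currently knows about
curl -s "$ES_HOST/_cat/indices?h=index" | sort > /tmp/live_indices
# Index directories present on the stopped node's disk
ls -1 "$DATA_DIR" | sort > /tmp/on_disk_indices
# Directories on disk that no longer exist in the cluster -- review this list carefully before removing anything
comm -23 /tmp/on_disk_indices /tmp/live_indices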
Thanks for the detailed explanation! So I took the route of deleting the indexes a second time with all nodes in the cluster. However, this didn't clean up the file system for those indices on a couple of nodes. Does that sound expected? I was expecting everything to be cleaned up after the second delete run.
The bottom line is that the deleted indexes are still on the file system. For the time being, I may have to go with option two to clean this up properly. That being said, can you confirm that this would be the proper procedure to accomplish the cleanup:
Take down the node with references to the deleted indexes.
Delete the index directory and all of its contents. For example:
rm -rf /es/data-es3/sv/nodes/1/indices/xyz-2016-08-26/
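In other words, something like this (just a sketch of what I have in mind; the host, service name, and index are from our setup and would change per node):

# On a node that's still in the cluster: confirm the index really is gone from the cluster
curl -s 'http://localhost:9200/_cat/indices' | grep xyz-2016-08-26    # expect no output
# On the node holding the leftover files: stop Elasticsearch before touching the data dir
sudo service elasticsearch stop
# Remove only the directory for the index we know was deleted
rm -rf /es/data-es3/sv/nodes/1/indices/xyz-2016-08-26/
# Bring the node back into the cluster
sudo service elasticsearch start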
Could you double-check? That shouldn't really be the case. If an index is deleted while a node is in the cluster, these folders should be cleaned up. You should see non-empty directories only if something is still holding on to these indices, and I wouldn't advise deleting them in that case.
Sure, I'll reply once I double-check. My fear is that if those problematic nodes go down, I'll run into the same issue when I restart them. The problem I'm facing is that it takes a pretty long time to bring a node up with fully initialized shards, because our shard count is definitely too high. So overall I'm just trying to understand what I can do to avoid this issue on the next restart.
How would I confirm whether something is still holding on to these indices? Would that be via the _cat/shards API?
OK, so I just did a spot check: each of the directories in question has just one file, ./elasticsearch/.../1/indices/xyz-index-2016-09-07/_state/state-x.st, where x is 0, 1, or 3. I'm not seeing these files referenced in the lsof output: sudo lsof | grep 'xyz-index-2016'
curl -XDELETE http://..../abc-2016-10-02/ -- it returns {"acknowledged":true}. Then I checked the file system on the problematic node and I can see the index is still referenced there, with the same directory structure/pattern: ./elasticsearch/..../nodes/1/indices/abc-2016-10-02/_state/state-0.st. That state-0.st file is pretty old, though, from 7/14. There are no references to it in the lsof output. Any thoughts?
Then it might be some sort of bug in 2.x. Deleting the directories manually sounds like a reasonable solution, but since I don't fully understand the mechanism that leads to this issue in the first place, I can't really guarantee that it won't cause any problems. I think the right solution is to upgrade to 5.x, where this problem was solved properly.