Deleted indices partially come back as dangling indices on node/cluster restarts


(Chris Fraschetti) #1

ElasticSearch 1.7.1

Similar issue on Stack Overflow

I have indices per day (e.g. myindex_yyyyMMdd). Under normal operating conditions my rolling window(s) appear to work just fine: with all nodes connected to my cluster, deletes execute successfully and the cluster status and master logs reflect the deletion.

However, in two different scenarios (both involving taking a previously active/healthy node offline and bringing it back online), I've seen long-since-deleted indices (well outside my rolling window) come back as dangling indices. Because the index has already been deleted, the cluster is unable to fully recover (no surprise there, the shards should actually be gone), and of course my cluster stays red until the dangling indices are deleted.
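To spot resurrected indices quickly, one option is to flag any daily index whose date suffix falls outside the rolling window. This is only a sketch: the function name, the `myindex_` prefix default, and the `window_days` parameter are illustrative assumptions, and the list of index names is expected to come from wherever you already list indices.

```python
from datetime import date, datetime, timedelta


def indices_outside_window(index_names, today, window_days, prefix="myindex_"):
    """Return index names whose yyyyMMdd suffix is older than the rolling window.

    Hypothetical helper: prefix and window_days are assumptions, not values
    taken from the original post.
    """
    cutoff = today - timedelta(days=window_days)
    stale = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the daily indices
        try:
            day = datetime.strptime(name[len(prefix):], "%Y%m%d").date()
        except ValueError:
            continue  # suffix is not a yyyyMMdd date, so skip it
        if day < cutoff:
            stale.append(name)
    return sorted(stale)
```

Anything this returns after a restart is a candidate dangling index worth checking on disk.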

I have a handful of clusters all running ElasticSearch 1.7.1, all of which show this behavior from time to time but not consistently.

I've seen this happen both with full cluster restarts (shut down the entire cluster) as well as individual node restarts.

I fully understand the dangling index concept, particularly for nodes that may have been offline when the deletions occurred, but not for nodes that were active/healthy at the time of deletion.

Any thoughts/input would be appreciated.


(Mark Walkom) #2

If/when this happens again, check the filesystem to see if they still exist there.
It's odd that it'd accept the delete but then not do it.


(Chris Fraschetti) #3

There may be multiple scenarios where this occurs but for the ones that have hit my radar, the filesystem does actually show the index folder.
Truth be told, I didn't go as far as to check creation dates and/or the size of the contents.
Next time this occurs (hopefully it doesn't) I'll look at the timestamps/sizes to try to determine whether the deletion failed to remove the files from the FS, or whether a rogue cluster state somehow convinced the cluster it needed to recreate the shards (is that even possible?).
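A rough way to collect those timestamps and sizes is to walk the node's indices directory and record, per index folder, the total bytes and the newest modification time. This is a sketch under assumptions: the helper name is made up, and the directory you pass in (on 1.x typically somewhere under `path.data`, inside the cluster name and `nodes/<n>/indices`) depends on your own configuration.

```python
import os


def summarize_index_dirs(indices_dir):
    """Return {index_name: (total_bytes, newest_mtime)} for each index folder.

    indices_dir is an assumption: pass the 'indices' directory of one node's
    data path as configured on your system.
    """
    summary = {}
    for entry in sorted(os.listdir(indices_dir)):
        index_path = os.path.join(indices_dir, entry)
        if not os.path.isdir(index_path):
            continue  # ignore stray files next to the index folders
        total_bytes = 0
        newest_mtime = os.path.getmtime(index_path)
        for root, _dirs, files in os.walk(index_path):
            for fname in files:
                fpath = os.path.join(root, fname)
                total_bytes += os.path.getsize(fpath)
                newest_mtime = max(newest_mtime, os.path.getmtime(fpath))
        summary[entry] = (total_bytes, newest_mtime)
    return summary
```

A folder whose newest modification time predates the delete suggests the files were simply never removed; a recent time points toward something recreating them.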


(Mark Walkom) #4

It's possible you have a split brain, I guess. Check _cat/master and also _nodes on each node and make sure they all agree.
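The agreement check can be scripted by asking every node which master it sees and comparing the answers. The sketch below assumes each node's HTTP endpoint is reachable and uses the cat API's `h=id` column selector; the function names and the node URLs in the usage note are hypothetical.

```python
from urllib.request import urlopen


def fetch_master_id(base_url, timeout=5):
    """Ask a single node for the id of the master it currently sees."""
    url = base_url.rstrip("/") + "/_cat/master?h=id"
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def masters_agree(master_by_node):
    """True only when every node reports the same, non-empty master id."""
    seen = set(master_by_node.values())
    return len(seen) == 1 and "" not in seen
```

For example, build `{url: fetch_master_id(url) for url in node_urls}` from your own list of node addresses (say `http://node1:9200`, a made-up host) and pass it to `masters_agree`; a `False` result means the nodes disagree about who the master is.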

