Documents still exist after index deletion


#1

Hi,

I'm running into a problem. I'm deleting multiple indices using a REST call in Sense:

DELETE index

But I am still finding some "legacy" files when I check the disk partition where Elasticsearch stores its data.
I basically have 3 directories (0, 1 and 2) inside Elasticsearch's data directory:
/elasticcluster/nodes/0
The 0 and 1 directories contain the legacy indices, while 2 contains the current indices.
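
To see which numbered directory holds which indices, something like the sketch below could help. It assumes the Elasticsearch 2.x on-disk layout, where each node directory contains an indices/ subdirectory with one folder per index name. The function name list_node_indices is made up for illustration.

```shell
#!/bin/sh
# Print "<node ordinal><TAB><index name>" for every index directory found.
# Assumption: layout is <nodes_dir>/<N>/indices/<index_name> (Elasticsearch 2.x).
list_node_indices() {
    nodes_dir=$1
    for node in "$nodes_dir"/*/; do
        [ -d "$node" ] || continue          # skip if the glob matched nothing
        ordinal=$(basename "$node")
        for index in "$node"indices/*/; do
            [ -d "$index" ] || continue     # node directory without any indices
            printf '%s\t%s\n' "$ordinal" "$(basename "$index")"
        done
    done
}

# Example (path is a placeholder): list_node_indices /path/to/data/elasticcluster/nodes
```

Comparing that output with the index names the cluster itself reports should show which directory is the live one.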

Any clue how to properly delete an index AND its corresponding documents? And why is this happening? Can I delete the directories by hand (rm -rf) in order to clean up old indices?

Thank you.

PS: I am using Elasticsearch 2.2.


(Nik Everett) #2

These directories exist to handle running multiple copies of the Elasticsearch process from the same installation. They are super useful but sometimes you get weird stuff like this. Is there any chance that you are still running those instances of Elasticsearch? For example, if you run ps aux | grep elastic, do you see just one process or several? Did you happen to run multiple Elasticsearch processes and then kill two of them? Something like that.
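
A rough way to turn that check into a number is sketched below. It assumes the Elasticsearch JVM command line contains the main class prefix org.elasticsearch.bootstrap. The helper name count_es_processes is hypothetical.

```shell
#!/bin/sh
# Count lines of a process listing (read from stdin) that look like an
# Elasticsearch JVM. Assumption: the command line mentions the
# org.elasticsearch.bootstrap main class.
# The [o] bracket keeps the grep command itself from being counted.
count_es_processes() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status clean.
    grep -c '[o]rg\.elasticsearch\.bootstrap' || true
}

# Example: ps aux | count_es_processes
```

Anything above 1 on a single host would suggest that more than one instance is sharing the same data path.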

The nasty part about your problem is that the data that you want is in the 2 directory but, normally, you have to run three copies of Elasticsearch to get that. Once you've figured out how you had three copies of Elasticsearch running and stopped that from happening again, you ought to be able to just nuke the 0 and 1 directories, rename 2 to 0, and start up again. I'd be careful though - make a backup if you care about the data. If you don't care about it then it might be simpler to just shut down, remove all the node directories, and reimport.
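
The backup, remove, and rename steps could be sketched roughly as follows. This is only an illustration, meant to be run while Elasticsearch is stopped. The function name promote_node_dir and its three arguments are invented for the example, and the backup path is assumed to live outside the nodes directory.

```shell
#!/bin/sh
# Sketch: keep one numbered node directory and turn it into node 0.
# Run only while Elasticsearch is stopped. All names are hypothetical.
promote_node_dir() {
    nodes_dir=$1    # directory that contains 0, 1, 2, ...
    keep=$2         # ordinal holding the current indices, e.g. 2
    backup=$3       # tarball to write first; must be outside nodes_dir

    [ -d "$nodes_dir/$keep" ] || {
        echo "no such directory: $nodes_dir/$keep" >&2
        return 1
    }

    # Back up every node directory before touching anything.
    tar -czf "$backup" -C "$nodes_dir" . || return 1

    # Remove all node directories except the one to keep.
    for dir in "$nodes_dir"/*/; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        [ "$name" = "$keep" ] || rm -rf "$nodes_dir/$name"
    done

    # Rename the kept directory to 0 unless it already is 0.
    [ "$keep" = "0" ] || mv "$nodes_dir/$keep" "$nodes_dir/0"
}

# Example (paths are placeholders):
# promote_node_dir /path/to/data/elasticcluster/nodes 2 /tmp/nodes-backup.tar.gz
```

Trying it on a copy of the data directory first is the safer route, since rm -rf cannot be undone.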


#3

Hi Everett,

Thank you for your answer. I deleted the 0 and 1 directories, renamed 2 to 0, and restarted my cluster. And it worked. Thank you for your time :slight_smile:

What is weird is that on the third node (I'm running a three-node setup) the actual current indices reside in directory 0, while directory 1 contains legacy indices. The second node is OK though (only a 0 directory).
So I'm wondering how Elasticsearch could even handle the data and know which directories are the good ones across nodes.

I'm running Elasticsearch inside Docker, with a single instance per node. This problem appeared after an abrupt shutdown that I mentioned here: Elastic inside Docker loses docs when shutdown while bulk indexing

Thank you for your time.

