Documents still exist after index deletion

iliasse · March 24, 2016, 1:48pm

Hi,

I'm running into a problem. I'm deleting multiple indices using a REST call in Sense:

DELETE index

But I am still finding some "legacy" files when I check the disk partition where Elasticsearch stores its data.
I basically have 3 directories (0, 1 and 2) inside Elasticsearch's data directory, e.g.
/elasticcluster/nodes/0
The 0 and 1 directories contain the legacy indices, while 2 contains the current indices.

Any clue how to properly delete an index AND its corresponding documents? And why is this happening? Can I delete the directories by hand (rm -rf) in order to clean up old indices?

Thank you.

PS: I am using Elasticsearch 2.2

nik9000 · March 24, 2016, 2:07pm

These directories exist to handle running multiple copies of the Elasticsearch process from the same installation. They are super useful but sometimes you get weird stuff like this. Is there any chance that you are still running those instances of Elasticsearch? For example, if you run ps aux | grep elastic, do you see just the one process or multiple? Did you happen to run multiple Elasticsearch processes and then kill two of them? Something like that.

The nasty part about your problem is that the data that you want is in the 2 directory but, normally, you have to run three copies of Elasticsearch for one of them to use that directory. Once you've figured out how you had three copies of Elasticsearch running and stopped that from happening again, you ought to be able to just nuke the 0 and 1 directories, rename 2 to 0, and start up again. I'd be careful though - make a backup if you care about the data. If you don't care about it then it might be simpler to just shut down, remove all the node directories, and reimport.
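
Before removing or renaming anything, it can help to see which numbered node directory holds which indices. The following is only a rough sketch in Python, not an Elasticsearch tool: the function name list_node_dirs is made up for illustration, and it assumes the 2.x on-disk layout in which each numbered directory under nodes/ has an indices/ subdirectory with one folder per index name.

```python
import os


def list_node_dirs(nodes_path):
    """Map each numbered node directory to the index folders found under it.

    Assumes the Elasticsearch 2.x layout <nodes_path>/<N>/indices/<index_name>/.
    This only reads directory names; it never modifies anything on disk.
    """
    # Keep only directories whose name is a plain number (0, 1, 2, ...).
    names = sorted(
        (
            name
            for name in os.listdir(nodes_path)
            if name.isdecimal() and os.path.isdir(os.path.join(nodes_path, name))
        ),
        key=int,
    )

    layout = {}
    for name in names:
        indices_dir = os.path.join(nodes_path, name, "indices")
        if os.path.isdir(indices_dir):
            # One subdirectory per index name in the assumed layout.
            layout[int(name)] = sorted(
                entry
                for entry in os.listdir(indices_dir)
                if os.path.isdir(os.path.join(indices_dir, entry))
            )
        else:
            # A node directory that has no indices folder at all.
            layout[int(name)] = []
    return layout
```

Calling list_node_dirs on the nodes directory (the parent of the 0, 1 and 2 directories) returns a dict keyed by directory number, so the listing can be compared with the indices the running cluster reports before deciding which directories are stale. Run it with Elasticsearch stopped, or at least treat the result as a snapshot.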

iliasse · March 24, 2016, 2:29pm

Hi Everett,

Thank you for your answer. I deleted the 0 and 1 directories, renamed 2 to 0, and restarted my cluster. And it worked. Thank you for your time.

What is weird is that on the third node (I'm running a three-node setup) the actual current indices reside in directory 0 while directory 1 contains legacy indices. The second node is OK though (only a 0 directory).
So I'm wondering how Elasticsearch can even handle the data and know which directories are the good ones across nodes.

I'm running Elasticsearch inside Docker, one single instance per node. This problem happened after a hazardous shutdown that I've mentioned here: Elastic inside Docker loses docs when shutdown while bulk indexing

Thank you for your time.
