I'm running into a problem. I'm deleting multiple indices using a REST call in Sense:
DELETE index
But I am still finding some "legacy" files when I check the disk partition where Elasticsearch stores its data.
I basically have 3 directories (0, 1 and 2) inside elastic's data directory
/elasticcluster/nodes/0
The 0 and 1 directories contain the legacy indices, while 2 contains the current indices.
Any clue how to properly delete an index AND its corresponding documents? And why is this happening? Can I delete the directories by hand (rm -rf) in order to clean out the old indices?
These directories exist to handle running multiple copies of the Elasticsearch process from the same installation. They are super useful but sometimes you get weird stuff like this. Is there any chance that you are still running those instances of Elasticsearch? Like if you do ps aux | grep elastic do you see just the one or multiple? Did you happen to run multiple Elasticsearch processes and then kill two of them? Something like that.
The nasty part about your problem is that the data that you want is in the 2 directory but, normally, you have to run three copies of Elasticsearch at once for one of them to use that directory. Once you've figured out how you had three copies of Elasticsearch running and stopped that from happening again, you ought to be able to just nuke the 0 and 1 directories, rename 2 to 0, and start up again. I'd be careful though - make a backup if you care about the data. If you don't care about it then it might be simpler to just shut down, remove all the node directories, and reimport.
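For anyone who would rather script the cleanup than do it by hand, here is a rough sketch of the steps described above. It assumes Elasticsearch is fully stopped on the machine, and the function name and its parameters (consolidate_node_dirs, keep, backup_dir, dry_run) are made up for illustration, not part of any Elasticsearch tooling. By default it only reports what it would do.

```python
import shutil
from pathlib import Path


def consolidate_node_dirs(nodes_dir, keep, backup_dir=None, dry_run=True):
    """Keep only nodes_dir/<keep> and rename it to nodes_dir/0.

    Returns the planned actions as strings. Nothing on disk is touched
    unless dry_run is False. Run this only while Elasticsearch is stopped.
    """
    nodes_dir = Path(nodes_dir)
    keep_dir = nodes_dir / str(keep)
    if not keep_dir.is_dir():
        raise FileNotFoundError(f"{keep_dir} does not exist")

    actions = []

    # Optional safety copy of the whole nodes directory.
    # backup_dir must not exist yet and must live outside nodes_dir.
    if backup_dir is not None:
        actions.append(f"copy {nodes_dir} -> {backup_dir}")
        if not dry_run:
            shutil.copytree(nodes_dir, backup_dir)

    # Remove every numbered node directory except the one to keep.
    stale = sorted(
        p for p in nodes_dir.iterdir()
        if p.is_dir() and p.name.isdigit() and p != keep_dir
    )
    for path in stale:
        actions.append(f"remove {path}")
        if not dry_run:
            shutil.rmtree(path)

    # Rename the surviving directory to 0 so a single process picks it up.
    target = nodes_dir / "0"
    if keep_dir != target:
        actions.append(f"rename {keep_dir} -> {target}")
        if not dry_run:
            keep_dir.rename(target)

    return actions
```

Calling consolidate_node_dirs(nodes_dir, keep=2) first and reading the returned plan before rerunning with dry_run=False is probably the safer order.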
Thank you for your answer. I deleted the 0 and 1 directories, renamed 2 to 0, and restarted my cluster, and it worked. Thank you for your time.
What is weird is that on the third node (I'm running a three-node setup) the actual current indices reside in directory 0 while directory 1 contains legacy indices. The second node is OK though (only a 0 directory).
So I'm wondering how Elasticsearch can even handle the data and know which directories are the good ones across nodes.
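To compare nodes before removing anything, a small script can list which index folders sit under each numbered directory. This is only a sketch: it assumes each numbered directory keeps its index data in an indices subfolder, and the function name indices_per_node_dir is made up for illustration.

```python
from pathlib import Path


def indices_per_node_dir(nodes_dir):
    """Map each numbered node directory to the sorted names under its indices/ folder."""
    result = {}
    for node_dir in Path(nodes_dir).iterdir():
        # Skip anything that is not a numbered directory (0, 1, 2, ...).
        if not (node_dir.is_dir() and node_dir.name.isdigit()):
            continue
        indices_dir = node_dir / "indices"  # assumed on-disk layout
        if indices_dir.is_dir():
            names = sorted(p.name for p in indices_dir.iterdir() if p.is_dir())
        else:
            names = []
        result[int(node_dir.name)] = names
    return result
```

Running it against the nodes directory on each machine and comparing the output with the indices the cluster currently reports should show which numbered directory is the live one on that node.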