I am working on purging old data in Elasticsearch and am wondering about the best way to implement such a job. I have to delete all documents older than 3 months, and this has to be repeated every 3 months. The documentation advises deleting the index itself (as that is computationally more efficient), but in that case I am not sure how I am supposed to recreate the index.
Can I use Watcher to implement such a job? Will it be efficient enough? Or should I use some other approach?
You can use Watcher to do what you're trying to do. I would give it a shot and see if the performance is good enough for your needs. If it isn't, you could also try reindexing your data so that each index contains documents from a single day. Then you can use our Curator tool to select indices based on age and delete them.
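To illustrate the "select indices based on age" idea, here is a minimal sketch of the selection logic only. It is not Curator's implementation; the `logs-` prefix, the `YYYY.MM.DD` suffix, and the function name `expired_daily_indices` are assumptions made up for this example.

```python
from datetime import date, datetime, timedelta


def expired_daily_indices(index_names, today, max_age_days=90, prefix="logs-"):
    """Return the daily indices whose date suffix is older than the cutoff.

    Assumes (hypothetically) that indices are named like 'logs-2024.03.31'.
    Names that do not match the prefix or the date format are skipped.
    """
    cutoff = today - timedelta(days=max_age_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index in the assumed format
        if day < cutoff:
            expired.append(name)
    return expired


names = ["logs-2024.01.01", "logs-2024.03.31", "logs-2024.04.01", "other-2024.01.01"]
print(expired_daily_indices(names, today=date(2024, 6, 30)))
```

The returned names are the ones a scheduled job would then delete as whole indices, which avoids deleting documents one by one.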
+1 on CJ's advice on using Curator for this, for now. I think it would be easier to handle than a long-running watch executing a delete-by-query statement.
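For comparison, the delete-by-query approach boils down to a range query on a date field. The sketch below only builds the request body; the field name `@timestamp` and the helper `purge_query` are assumptions for illustration, and whether a delete-by-query endpoint is available depends on your Elasticsearch version.

```python
import json


def purge_query(timestamp_field="@timestamp", older_than="now-3M"):
    """Build a range query matching documents older than the given date math.

    'now-3M' is Elasticsearch date math for "three months ago"
    (uppercase M is months, lowercase m would be minutes).
    """
    return {"query": {"range": {timestamp_field: {"lt": older_than}}}}


# Body that would be sent with a delete-by-query request.
print(json.dumps(purge_query(), indent=2))
```

Every matching document still has to be found and deleted individually, which is why dropping whole indices is usually cheaper.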
Would it be easier to have monthly indices, and then just delete the old one? You can just search across all of them when you execute a search.
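A rough sketch of the monthly-index bookkeeping, assuming a hypothetical `events-YYYY.MM` naming scheme (the prefix and both function names are invented for this example). Keeping the current month plus the three previous ones guarantees that at least three full months of data stay searchable.

```python
from datetime import date


def monthly_index_name(day, prefix="events-"):
    """Name of the monthly index a document dated `day` would be written to."""
    return f"{prefix}{day.year:04d}.{day.month:02d}"


def droppable_monthly_indices(existing, today, keep_months=4, prefix="events-"):
    """Return existing monthly indices that fall outside the retention window.

    keep_months counts the current month plus the preceding ones.
    """
    keep = set()
    year, month = today.year, today.month
    for _ in range(keep_months):
        keep.add(f"{prefix}{year:04d}.{month:02d}")
        month -= 1
        if month == 0:  # roll back over the year boundary
            year, month = year - 1, 12
    return sorted(n for n in existing if n.startswith(prefix) and n not in keep)


existing = ["events-2023.10", "events-2023.11", "events-2023.12", "events-2024.02"]
print(monthly_index_name(date(2024, 2, 15)))
print(droppable_monthly_indices(existing, today=date(2024, 2, 15)))
```

Searches can then target a wildcard pattern such as `events-*` so that all remaining monthly indices are queried together.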