I deleted documents with the Delete By Query API and the deletion took effect, but the documents are still counted under docs.deleted. Here is what I see:
[user@elk ~]$ curl -XGET "http://elasticsearch:9200/_cat/indices/myindex?v"
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   myindex STGtht6tes2upE4WrSOKTg   5   1   46420892      3271000     29.3gb         14.3gb
In addition, the deleted documents still take up space in store.size.
Does anyone know how to release the space held by the deleted documents? Do I need to restart all the cluster nodes?
You mean I need to run _forcemerge, and that will really delete them?
If I used time-based indices, the deletion would remove a whole index, which is not what I want. That is why I use _delete_by_query.
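For what it's worth, a delete in Elasticsearch only marks documents as deleted inside their Lucene segments. The space is reclaimed when those segments are merged, so a node restart should not be needed. Below is a minimal sketch of a force merge restricted to segments that contain deletes, assuming the host and index name from the output above. The `RUN` switch and the `forcemerge_url` helper are made up for this example, and the script only prints the command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Assumed defaults taken from the thread; adjust to your cluster.
ES_HOST="${ES_HOST:-http://elasticsearch:9200}"
INDEX="${INDEX:-myindex}"

# Build the force-merge URL. only_expunge_deletes=true asks Elasticsearch to
# merge only segments that contain deleted documents.
forcemerge_url() {
  printf '%s/%s/_forcemerge?only_expunge_deletes=true' "$1" "$2"
}

# Hypothetical safety switch: print the command unless RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
  curl -XPOST "$(forcemerge_url "$ES_HOST" "$INDEX")"
  # Check docs.deleted and store.size again afterwards.
  curl -XGET "$ES_HOST/_cat/indices/$INDEX?v"
else
  echo "curl -XPOST \"$(forcemerge_url "$ES_HOST" "$INDEX")\""
fi
```

A force merge is I/O-heavy, so it is probably best run during a quiet period. With only_expunge_deletes, segments holding few deletes may be skipped, so docs.deleted may not drop all the way to zero.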
But my way of doing it is much more efficient than yours!
So if it's a one-time operation, using the Delete By Query API could be fine as long as you remove only a small subset of the data. Otherwise, if you mean to keep only about 10% of the data, it's better to use the Reindex API and drop the old index.
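A rough sketch of the reindex-then-drop approach, in case it helps. The destination index name `myindex_v2`, the `@timestamp` field and the 30-day range query are all assumptions made up for illustration, as are the `reindex_body` helper and the `RUN` switch. The script prints the request body unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Assumed defaults; the destination name is hypothetical.
ES_HOST="${ES_HOST:-http://elasticsearch:9200}"
SRC_INDEX="${SRC_INDEX:-myindex}"
DEST_INDEX="${DEST_INDEX:-myindex_v2}"

# Emit a _reindex request body that copies only the documents to keep.
# The range query on "@timestamp" is a placeholder; use your own filter.
reindex_body() {
  cat <<EOF
{
  "source": {
    "index": "$1",
    "query": { "range": { "@timestamp": { "gte": "now-30d" } } }
  },
  "dest": { "index": "$2" }
}
EOF
}

# Hypothetical safety switch: print the body unless RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
  curl -XPOST "$ES_HOST/_reindex" \
       -H 'Content-Type: application/json' \
       -d "$(reindex_body "$SRC_INDEX" "$DEST_INDEX")"
  # Only after verifying the new index, drop the old one:
  # curl -XDELETE "$ES_HOST/$SRC_INDEX"
else
  reindex_body "$SRC_INDEX" "$DEST_INDEX"
fi
```

Dropping the old index removes its files outright, which is why this tends to be cheaper than deleting most of the documents one by one and then merging.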
If it's a task you are going to run every day, then use time-based indices and use Curator to automate the index removal every day.
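To illustrate what the daily removal amounts to: with time-based indices, expiring a day of data is a single index delete. The sketch below does by hand what Curator would automate. It assumes a hypothetical naming scheme of `<prefix>-YYYY.MM.DD`. The example date, the `daily_index_url` helper and the `RUN` switch are made up, and the script prints the command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Assumed defaults; the naming scheme and the date are hypothetical.
ES_HOST="${ES_HOST:-http://elasticsearch:9200}"
INDEX_PREFIX="${INDEX_PREFIX:-myindex}"
EXPIRED_DAY="${EXPIRED_DAY:-2017.01.01}"

# Build the URL of one daily index, e.g. http://host:9200/myindex-2017.01.01
daily_index_url() {
  printf '%s/%s-%s' "$1" "$2" "$3"
}

# Hypothetical safety switch: print the command unless RUN=1 is set.
if [ "${RUN:-0}" = "1" ]; then
  curl -XDELETE "$(daily_index_url "$ES_HOST" "$INDEX_PREFIX" "$EXPIRED_DAY")"
else
  echo "curl -XDELETE \"$(daily_index_url "$ES_HOST" "$INDEX_PREFIX" "$EXPIRED_DAY")\""
fi
```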
Oh, one thing I should mention: I have only one index that stores all the data. I do not split it into separate indices with a date suffix in the index name. So my only option is to delete documents from the index rather than delete whole indices.