We use daily indices for our logs, and we delete a whole daily index when the disk becomes full. When I delete an index, Elasticsearch responds with the JSON {"acknowledged":true}.
My question is: when Elasticsearch responds to the delete request, is it guaranteed that all data related to the deleted index has actually been deleted from disk? Is it guaranteed that disk space has been freed on all nodes of the cluster, for all primary and replica shards? Or is the {"acknowledged":true} response only an indication that Elasticsearch has started to delete the index, and it may take a few more seconds to finish the operation and delete the data from all nodes?
To put it another way, is it guaranteed that after {"acknowledged":true} is received, the df command will immediately show the disk space that the index used as available? Or is it possible that a few more seconds are needed?
In the few tests I ran, the disk space was released before Elasticsearch responded. However, I'd like to know if this is guaranteed by the code.
The response will return acknowledged: true when the master node has received the request and applied it to the cluster state. However, there may be some time before the data is truly deleted from disk. The cluster state is published asynchronously to the other nodes, which then apply it locally and delete the data.
So for very large indices, there may be some noticeable lag between acknowledgement and actual disk space being freed. For smaller indices it tends to be very quick (basically just network lag time).
Thanks a lot for the prompt reply! This means that if I want to implement a system that automatically deletes indices until disk usage drops below a certain threshold, I should not rely only on the acknowledged: true response, but should also wait until disk usage has stabilized before deleting the next index.
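A rough sketch of that loop, in Python, might look like the following. All names here (wait_until_stable, free_disk_space, read_used, delete_index) are hypothetical and not part of any Elasticsearch client. read_used is assumed to be a function you supply that returns the current disk usage as a percentage (for example by parsing df output or the disk.percent column of _cat/allocation), and delete_index is assumed to send the delete request and return once acknowledged: true is received.

```python
import time
from typing import Callable, Iterable, List


def wait_until_stable(
    read_used: Callable[[], float],
    interval: float = 2.0,
    tolerance: float = 0.5,
    max_checks: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Poll read_used() until two consecutive readings differ by at most
    `tolerance`, or until max_checks polls have been made.
    Returns the last reading taken."""
    previous = read_used()
    for _ in range(max_checks):
        sleep(interval)
        current = read_used()
        if abs(current - previous) <= tolerance:
            return current
        previous = current
    return previous


def free_disk_space(
    indices_oldest_first: Iterable[str],
    delete_index: Callable[[str], None],
    read_used: Callable[[], float],
    threshold: float,
    **wait_kwargs,
) -> List[str]:
    """Delete indices one at a time, oldest first, until disk usage is
    below `threshold`. After each acknowledged delete, wait for the disk
    usage reading to settle before deciding whether to delete another."""
    deleted: List[str] = []
    for name in indices_oldest_first:
        if read_used() < threshold:
            break
        delete_index(name)  # assumed to return once the delete is acknowledged
        wait_until_stable(read_used, **wait_kwargs)
        deleted.append(name)
    return deleted
```

The interval and tolerance values are placeholders. With very large indices the lag can be longer, so they would need tuning against the real cluster, and read_used should reflect the node with the fullest disk rather than only the local one.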
A related question: Is it safe to delete an active index, e.g. the index of today's logs, on which documents are continuously inserted? Or is it better to use the Close index API before deletion?
It's "safe", in that nothing will break or corrupt. But if you have auto-creation of indices enabled, new documents that are in-flight will hit the cluster, find out that there is no index and auto-create the index. So you'll get a new index in place of the one that was just deleted.
If you have auto creation disabled, it's not an issue... those in-flight docs will just error out.
Otherwise, the best thing to do is make sure you halt indexing (to those particular indices) before deleting.
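If you want to turn auto-creation off, one possible way is the cluster settings API. The sketch below only builds the HTTP request and does not send it. It assumes a version of Elasticsearch in which action.auto_create_index can be changed dynamically (on older versions it may have to go in elasticsearch.yml instead); the function name and the base URL are made up for illustration.

```python
import json
import urllib.request


def build_disable_auto_create_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a PUT to _cluster/settings that turns off
    automatic index creation. Assumes the setting is dynamically updatable."""
    body = json.dumps(
        {"persistent": {"action.auto_create_index": "false"}}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage against a local node:
# request = build_disable_auto_create_request("http://localhost:9200")
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keep in mind that this disables auto-creation for every index, so tomorrow's daily index would then have to be created explicitly.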