I need to delete some documents with a particular field using "Delete by query". Everything works: 10 million documents are deleted, and I go from 14 million to just over 4 million.
Before the deletion, the index was at 9.8 GB. After the deletion... 9.3 GB. I only freed 500 MB?!
That's how deleting documents works.
It doesn't remove anything from the existing files; it creates new files marking the documents as deleted.
Then, after a period of time, the segments are merged in the background and you get the space back.
Note that you can run the Force Merge API (_forcemerge) to trigger this yourself. It will use a lot of I/O, though.
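For illustration, here is a minimal sketch of what such a force merge call could look like. It only builds the HTTP request, using Python's standard library; the host `http://localhost:9200` and the index name `my-index` are placeholders, not values from this thread. The `only_expunge_deletes=true` parameter asks Elasticsearch to merge only segments that contain deleted documents, which is usually what you want in this situation.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_forcemerge_request(base_url, index, only_expunge_deletes=True):
    """Build (but do not send) a POST request to the _forcemerge endpoint."""
    # Elasticsearch expects lowercase "true"/"false" in query strings.
    params = urlencode({"only_expunge_deletes": str(only_expunge_deletes).lower()})
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_forcemerge?{params}"
    return Request(url, method="POST")


# Hypothetical host and index name, for illustration only.
request = build_forcemerge_request("http://localhost:9200", "my-index")
print(request.get_method(), request.full_url)
# To actually run it: urllib.request.urlopen(request)
```

Since the merge is I/O heavy, it is probably best run during a quiet period.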
That's why it's often better to reindex the 4M docs you want to keep into a new index than to delete 10M!
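As a rough sketch of the reindex approach, the request body for the Reindex API (POST _reindex) could be built like this. The index names and the field name are hypothetical, and the sketch assumes that "documents with a particular field" means documents where that field exists, so the query keeps everything where it does not. Adjust the query if you are matching on a field value instead.

```python
import json


def build_reindex_body(source_index, dest_index, field_to_exclude):
    """Build a _reindex body that copies only docs WITHOUT the given field."""
    return {
        "source": {
            "index": source_index,
            # Assumption: the docs to drop are those where the field exists.
            "query": {
                "bool": {"must_not": [{"exists": {"field": field_to_exclude}}]}
            },
        },
        "dest": {"index": dest_index},
    }


# Hypothetical names, for illustration only.
body = build_reindex_body("old-index", "new-index", "obsolete_field")
print(json.dumps(body, indent=2))
```

Once the new index is verified, the old index can be deleted as a whole, which frees its disk space immediately instead of waiting for merges.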