I have data in an ES index that dynamically changes. Various documents may
be deleted at any time and new ones are always being created.
Using the "delete by query" works perfectly to handle the deletes, but I've
read that this can be detrimental to index performance and I've stated to
see poor performance.
How detrimental is delete by query to an index and is there any way to
repair it? Does a optimize/forcemerge help with this problem?
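For reference, the kind of delete-by-query request I'm running looks roughly like this (the host, index name, and date field are placeholders, not my actual mapping):

# Delete documents matching a query, e.g. everything older than 7 days.
# "my-index" and the "timestamp" field are placeholders for this sketch.
curl -X POST "localhost:9200/my-index/_delete_by_query" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": { "timestamp": { "lt": "now-7d" } }
  }
}'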
Optimize/force merge does not help in general; it only adds load while it runs. If search times increase and the deletes were massive, you can consider an occasional force merge, but not too often, maybe once per hour at most.
Deleting single documents is expensive by the nature of the algorithm: documents are only marked as deleted and are physically removed later, during segment merges. There is no "repair".