So I'm on 1.7.x and I ran an optimize on a large (tens of TBs) index that hadn't been optimized in a while; all went well and the index size went down.
I am currently optimizing another, however, and its size has gone up almost 25% so far, with no signs of slowing down. What could be the issue?
Both were run with only_expunge_deletes=true. The first index has fewer writes to it, while this second one is updated constantly. On the first index, the document counts, e.g. 5,000 (9,999), dropped pretty quickly, e.g. from 5,000 (9,999) down to 5,000 (6,999).
On the second one, the doc counts are not really budging at all, especially the deleted counts (in parentheses).
I am already planning to move everything to 6.0, but this is an immediate problem on 1.7.x.
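For reference, this is the shape of the call I'm running on both indices (index name below is a placeholder):

```shell
# ES 1.7.x expunge-deletes-only optimize; "my_index" stands in for the real index name
curl -XPOST 'http://localhost:9200/my_index/_optimize?only_expunge_deletes=true'
```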
As segments merge, extra space will initially be consumed. It should free back up again once the source segments are marked for deletion and then actually deleted (after their live documents have been successfully copied into the larger segment). It could take a long time for this to happen, though.
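You can watch this happen at the segment level. Something like the following (assuming a locally reachable cluster and a placeholder index name) shows per-segment sizes and deleted-doc counts while the merge runs:

```shell
# Per-segment stats for the index being optimized; the cat API exists on 1.7.x
curl 'http://localhost:9200/_cat/segments/my_index?v'
```

If the deleted counts in the old segments aren't shrinking as new, larger segments appear, the merge hasn't gotten to the delete-heavy segments yet.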
Now I am wondering whether sparse doc values are the culprit. Maybe 5% of docs (guessing) have a few fields with KBs of data each, while in the other docs those same fields are empty. I see that the Lucene doc values files are getting big.
If I have to terminate the merge, will that stop the size explosion? How do I stop the optimize?