I'm curious if anyone knows Lucene well enough to explain what the maximum number of documents or data size in a segment should be. In other words, at what point does force merging an index have diminishing returns? If I have a 50GB index with 300M documents, is it better to have 1, 5, or 10 segments?
The max doc limit applies at the shard level (each shard is one Lucene index), not per segment, and it's just under 2^31-1, roughly 2.1 billion documents.
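For a sense of scale, here is a rough back-of-the-envelope sketch for the numbers in the question. It assumes Lucene's hard cap of `IndexWriter.MAX_DOCS` (`Integer.MAX_VALUE - 128`) per shard and an idealised merge that splits documents and bytes evenly across segments, which real merges will not do exactly. The function names are made up for illustration.

```python
# Lucene's hard cap on documents in one index (one Elasticsearch shard):
# IndexWriter.MAX_DOCS = Integer.MAX_VALUE - 128
LUCENE_MAX_DOCS = 2**31 - 1 - 128


def segment_averages(total_docs, total_gb, segment_count):
    """Average docs and size per segment, assuming a perfectly even split."""
    return total_docs // segment_count, total_gb / segment_count


def shard_fill_ratio(total_docs):
    """Fraction of the per-shard document limit that is already used."""
    return total_docs / LUCENE_MAX_DOCS


if __name__ == "__main__":
    # Numbers from the question: 300M documents, 50GB, one shard assumed.
    for n in (1, 5, 10):
        docs, gb = segment_averages(300_000_000, 50, n)
        print(f"{n:>2} segment(s): ~{docs:,} docs, ~{gb:g} GB each")
    print(f"doc limit used: {shard_fill_ratio(300_000_000):.1%}")
```

So even merged into a single segment, 300M documents sits well below the per-shard limit; the limit is not what decides the segment count here.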
This depends on what you are trying to achieve by the force merge. If you are looking to speed up queries, I suspect you need to benchmark to see what the ideal segment size is. If, on the other hand, you are looking to reduce heap usage, force merging down to a single segment is always beneficial, as described in this webinar.
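If you do benchmark, the segment count is controlled with the `max_num_segments` parameter of the `_forcemerge` API. A minimal sketch that only builds the request line for each candidate count (it sends nothing; `my-index` is a placeholder index name and `forcemerge_request` is a hypothetical helper, not part of any client library):

```python
from urllib.parse import quote, urlencode


def forcemerge_request(index, max_num_segments):
    """Return (method, path) for a force merge down to max_num_segments."""
    if max_num_segments < 1:
        raise ValueError("max_num_segments must be at least 1")
    path = f"/{quote(index, safe='')}/_forcemerge"
    query = urlencode({"max_num_segments": max_num_segments})
    return "POST", f"{path}?{query}"


if __name__ == "__main__":
    # Merge from the largest count down: a force merge can only reduce
    # the number of segments, so benchmark 10, then 5, then 1.
    for count in (10, 5, 1):
        method, path = forcemerge_request("my-index", count)
        print(method, path)
```

Run your query benchmark after each step and compare latencies before merging further, since a merge cannot be undone without reindexing.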