Hi,
I wanted to ask about a phenomenon I have observed a few times. When
optimizing an index, some shards sometimes don't get merged completely
and throw an exception like:
"failed to merge
org.apache.lucene.index.CorruptIndexException: docs out of order (10251725
<= 10278672 ) (out:
org.apache.lucene.store.FSDirectory$FSIndexOutput@5379138)"
Even though I am not completely sure what causes this issue in our case, it
is down to Lucene and not really the subject here. My point is the following:
the last time, I observed it on a replica while the primary shard was OK.
Wouldn't it make sense on the Elasticsearch side to catch such an exception
and, if possible, rebuild the corrupt shard from another copy of that shard
(probably in the background, replacing the replica when finished and only
then sending the acknowledgement for the optimization)? It would certainly
help with maintaining an index cluster. What do you think?
Thanks!
Andrej