We have a problem with open file handles to deleted Lucene index files on one
Elasticsearch instance, and I am not sure how to track this down.
Setup: ES 0.19.3 with result grouping, one index, plus an FST suggester (which
is where I suspect the leak, as it is my code).
After a bug in our river, the ES instance constantly imported around 60
products per second for several days. It imported the same documents (several
tens of thousands) every n minutes and then immediately restarted after a
30-second break.
This slowly filled up the available disk space, because Lucene segments were
deleted but their file handles were still kept open; the lsof output looks
like this:
java 2695 elasticsearch 6783r REG 251,0 275
java 2695 elasticsearch 6784r REG 251,0 1293
java 2695 elasticsearch 6785r REG 251,0 12
java 2695 elasticsearch 6787r REG 251,0 2592
java 2695 elasticsearch 6790r REG 251,0 20
There are around 6500 handles to deleted files open concurrently.
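For reference, this is roughly how I count them (a sketch; the helper function name is my own, and the PID is the one from the lsof listing above):

```shell
# count_deleted: count '(deleted)' entries in lsof output read from stdin.
count_deleted() {
  grep -c '(deleted)'
}

# Typical use, with the ES JVM's PID from the listing above:
#   lsof -p 2695 | count_deleted
```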
I partially fixed this by also closing the IndexReader instance I used in my
FST suggest plugin, which changed the behaviour of the problem. When the
IndexReader was not closed, the ES instance had lots of open files and ate all
the disk space. Now it no longer consumes the disk space, but it still has
tons of open handles to deleted files lurking around.
The in-memory structure I am using for my suggest feature contains an
IndexReader, a SpellChecker, an FSTLookup, and a ShardId.
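The pattern I am now following is to close every closable resource the holder owns when the shard goes away. A minimal standalone sketch (class and field names are hypothetical, and plain java.io.Closeable stands in for the actual Lucene IndexReader and SpellChecker so it compiles without Lucene on the classpath):

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical per-shard suggest holder: if any Closeable it owns is not
// closed on shard shutdown, the JVM keeps handles to deleted segment files
// and the disk space is never reclaimed.
class SuggestHolder implements Closeable {
    private final Closeable indexReader;   // stands in for the Lucene IndexReader
    private final Closeable spellChecker;  // stands in for the SpellChecker

    SuggestHolder(Closeable indexReader, Closeable spellChecker) {
        this.indexReader = indexReader;
        this.spellChecker = spellChecker;
    }

    @Override
    public void close() throws IOException {
        // Close both resources, even if the first close() throws.
        try {
            indexReader.close();
        } finally {
            spellChecker.close();
        }
    }
}
```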
Are there any ES resources I need to release in addition to these, before I
take this to the Lucene mailing list?
Thanks for any pointers in this regard; my Lucene knowledge is not the best.