In my experiment, I indexed about 80 million documents and observed a high number of open files.
My setup is as follows:
- number of nodes in the cluster: 2
- number of indices: 32
- number of shards (for each index): 32
- number of replicas: 1
The number of open files kept increasing as time went on.
(I eventually hit an out-of-memory exception, so I had to stop the experiment there.)
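For context, this is roughly how I watch the open-file count on Linux (a sketch; the /proc layout is Linux-specific, and ES_PID is a placeholder for the Elasticsearch JVM's pid):

```shell
# Count open file descriptors for a process via /proc (Linux only).
# ES_PID is a placeholder; point it at the Elasticsearch JVM's pid.
ES_PID=$$              # using the current shell's own pid just for illustration
ls /proc/"$ES_PID"/fd | wc -l
```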
With my setup, that comes to 32 indices x 32 shards x 2 copies (primary + replica) / 2 nodes = 1024 Lucene indices per node.
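The arithmetic checks out directly in a shell:

```shell
# 32 indices x 32 shards x 2 copies (primary + replica), spread over 2 nodes
echo $(( 32 * 32 * 2 / 2 ))   # prints 1024
```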
I know the number of shards is too big for this environment, but this was an experiment.
Are about 50,000 open files for about 1000 Lucene indices to be expected?
Is the number roughly proportional to the number of Lucene indices?
My observation is that the number of files in each Lucene index directory varies (from 100 to 200).
I think (correct me if I'm wrong) the number increases as more documents are indexed and decreases when an optimize runs.
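This is how I sampled the per-directory file counts (a sketch; DATA_DIR and the nodes/&lt;n&gt;/indices/&lt;index&gt;/&lt;shard&gt;/index layout are assumptions, so adjust them to your actual Elasticsearch data path):

```shell
# Sketch: count files in each shard's Lucene index directory.
# DATA_DIR and the nodes/<n>/indices/<index>/<shard>/index layout are
# assumptions; adjust to your actual Elasticsearch data path.
DATA_DIR=${DATA_DIR:-/var/lib/elasticsearch}
for d in "$DATA_DIR"/nodes/*/indices/*/*/index; do
  [ -d "$d" ] && printf '%6d  %s\n' "$(ls "$d" | wc -l)" "$d"
done
```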
In my current system setup, I set the per-process limit on open files to 100,000.
Is that an adequate setting for about 1000 Lucene indices?
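For reference, multiplying the 100-200 files I observed per Lucene index directory by the 1024 Lucene indices per node gives a rough range for the descriptors a node might need:

```shell
# Rough per-node file-descriptor budget, using the observed 100-200
# files per Lucene index directory and 1024 Lucene indices per node.
echo $(( 1024 * 100 ))   # prints 102400 (low end)
echo $(( 1024 * 200 ))   # prints 204800 (high end)
```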
Thanks for your help.