File descriptors: is it ok to go beyond 32k or 64k?

I've been researching this but was hoping to understand more of the underlying reasons. The official documentation recommends increasing the open file descriptor limit to 32k or 64k. Our cluster currently has 12,800 shards across 5 data nodes, which works out to roughly 40k open file descriptors on each node. The data nodes run CentOS with 64GB of memory.

Are there reasons we don't want to have more than 64k file descriptors open on a node? Is the concern about memory needed for each descriptor or are there other resource issues?

Is the 32k or 64k limit still applicable for nodes with higher amounts of memory?
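For context, this is roughly how we've been checking descriptor usage per node. A sketch, assuming Elasticsearch runs as the `elasticsearch` user; adjust the `pgrep` pattern for your install:

```shell
# Find the Elasticsearch PID (assumes it runs as user "elasticsearch")
ES_PID=$(pgrep -u elasticsearch -f Elasticsearch | head -n 1)

# Count the file descriptors that process currently has open
ls "/proc/${ES_PID}/fd" | wc -l

# Show the per-process limit actually in effect for that PID
grep 'Max open files' "/proc/${ES_PID}/limits"
```

Reading `/proc/<pid>/fd` and `/proc/<pid>/limits` shows the live values for the running process, which can differ from what `ulimit -n` reports in a fresh shell.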


Not really, you can definitely set it higher.
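For example, on CentOS the limit is typically raised for the service user in `/etc/security/limits.conf`; the value below is illustrative, not a recommendation:

```shell
# /etc/security/limits.conf (illustrative values):
#   elasticsearch  soft  nofile  128000
#   elasticsearch  hard  nofile  128000
#
# For systemd-managed services, limits.conf is bypassed; set
# LimitNOFILE=128000 in the unit file (or a drop-in) and run
# `systemctl daemon-reload` before restarting the service.

# Verify what a new session actually sees:
ulimit -n
```

Whichever mechanism applies, verify the running process afterwards rather than trusting the config file, since the limit is inherited at process start.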

That's waaaaaay too much. You need to address that ASAP.

Mark, thanks for the reply.

As for the number of shards, why is that an issue?

Right now our query rate is normally around 1 per second, with latency usually around 20ms. Indexing averages between 2,000 and 3,000 docs per second, with latency around 0.3ms. Will the number of shards per node become an issue as query volume grows?

Thanks again,

How much data in the cluster? How many indices?

Currently we have 1,043 indices, each with 5 primary shards and 2 replicas. We added the second replica some time ago, which is why the totals don't quite add up to the 12,800 shards I mentioned. There's about 7TB of data.
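As a sanity check on those numbers (plain shell arithmetic, nothing cluster-specific):

```shell
# Shard count implied by the stated layout:
# 1,043 indices x 5 primary shards x (1 primary + 2 replica copies)
indices=1043
primaries=5
copies=3   # primary + 2 replicas per shard
echo $(( indices * primaries * copies ))   # 15645
```

So the layout as described would imply 15,645 shards, a bit above the 12,800 reported.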


2 replicas seems excessive for 5 nodes.

Basically you have over 2,500 shards per node. Each shard is a full Lucene instance and needs a certain amount of resources to manage, so a large chunk of your heap is being spent just on shard overhead.
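The per-node figure follows directly from the numbers in this thread:

```shell
# Shards per data node, from the totals stated above
total_shards=12800
nodes=5
echo $(( total_shards / nodes ))   # 2560

# Each of those shards is a separate Lucene index with its own
# open files and heap overhead, which is why the count matters.
```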