Failed to read latest segment infos on flush

Any idea why I am getting:
[2015-10-17 19:37:57,551][WARN ][index.engine.internal ] [ElasticSearch-18] [abc-2015.10.13][4] failed to read latest segment infos on flush
java.nio.file.FileSystemException: /var/data/elasticsearch4/Media/nodes/0/indices/abc-2015.10.13/4/index/_eqn.si: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)

more /etc/sysconfig/elasticsearch

# Directory where the Elasticsearch binary distribution resides

ES_HOME=/usr/share/elasticsearch

# Maximum number of open files

MAX_OPEN_FILES=65535

cat /etc/security/limits.conf

# End of file

* soft core unlimited
* soft memlock unlimited
* hard memlock unlimited

I'd check that your ulimit changes have been applied; you can get that from the _nodes API endpoint.
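
For example, something like this shows both the limit each node actually picked up and its current usage (localhost:9200 is just a placeholder for one of your nodes):

# configured per-process limit as each node sees it
curl -s 'localhost:9200/_nodes/process?pretty' | grep max_file_descriptors
# current usage per node
curl -s 'localhost:9200/_nodes/stats/process?pretty' | grep open_file_descriptors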

Can you please elaborate?
It is a 12-node cluster, with 3 masters and 9 data nodes.

stats | grep open_file_descriptors
"open_file_descriptors" : 48827,
"open_file_descriptors" : 60585,
"open_file_descriptors" : 65211,
"open_file_descriptors" : 420,
"open_file_descriptors" : 64604,
"open_file_descriptors" : 63084,
"open_file_descriptors" : 64135,
"open_file_descriptors" : 453,
"open_file_descriptors" : 62657,
"open_file_descriptors" : 63350,
"open_file_descriptors" : 417,
"open_file_descriptors" : 64252,
The 420, 453 & 417 are from the master servers.
Please advise.

You are so close to the 65k limit.
You might have too many shards on your cluster.

How many do you have?

Did you change any low-level index settings, e.g. for compound file format?
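
If you're not sure, the index settings will tell you; a minimal sketch (the index name is taken from the log above, and the grep assumes the 1.x-era index.compound_format setting name):

# only explicitly-set settings are returned, so no output here means the default is in use
curl -s 'localhost:9200/abc-2015.10.13/_settings?pretty' | grep -i compound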

{
  "cluster_name" : "abcmedia",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 12,
  "number_of_data_nodes" : 9,
  "active_primary_shards" : 4377,
  "active_shards" : 8754,
  "relocating_shards" : 2,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

No, we have not made any configuration changes.

That's a massive number of shards for that many nodes; you need to reduce that.

Do you mean reduce the number of shards or increase the number of nodes?

Reduce the number of shards. Each shard is basically a separate Lucene
index, so each one will consume file resources.
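
For daily indices like abc-2015.10.13, the usual way to bring the shard count down going forward is an index template; a minimal sketch (the template name and the shard/replica counts are only examples, size them for your actual data volume):

# applies to any new index whose name matches abc-*
curl -XPUT 'localhost:9200/_template/abc_media' -d '
{
  "template" : "abc-*",
  "settings" : {
    "index.number_of_shards" : 1,
    "index.number_of_replicas" : 1
  }
}'

Existing indices keep the shard count they were created with, so the older daily indices will only stop counting against the file descriptor budget once they are reindexed or dropped.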

Ivan