Failed to read latest segment infos on flush


#1

Any idea why I am getting:
[2015-10-17 19:37:57,551][WARN ][index.engine.internal ] [ElasticSearch-18] [abc-2015.10.13][4] failed to read latest segment infos on flush
java.nio.file.FileSystemException: /var/data/elasticsearch4/Media/nodes/0/indices/abc-2015.10.13/4/index/_eqn.si: Too many open files
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)

more /etc/sysconfig/elasticsearch

# Directory where the Elasticsearch binary distribution resides
ES_HOME=/usr/share/elasticsearch

# Maximum number of open files
MAX_OPEN_FILES=65535

cat /etc/security/limits.conf

# End of file
*    soft    core       unlimited
*    soft    memlock    unlimited
*    hard    memlock    unlimited
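
For reference, that file contains no nofile entry; a per-user open-files limit in limits.conf would normally look something like the lines below (assuming the service runs as the elasticsearch user — shown only to illustrate the syntax, not part of the file above):

elasticsearch    soft    nofile    65535
elasticsearch    hard    nofile    65535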

(Mark Walkom) #2

I'd check that your ulimit changes have actually been applied; you can get that from the _nodes API endpoint.
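
For example, assuming a node is reachable on localhost:9200, these two calls show the limit each node picked up and how much of it is currently in use (the exact response layout varies a little between Elasticsearch versions):

curl -s 'localhost:9200/_nodes/process?pretty' | grep max_file_descriptors
curl -s 'localhost:9200/_nodes/stats/process?pretty' | grep open_file_descriptors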


#3

Can you please elaborate?
It is a 12-node cluster, with 3 master and 9 data nodes.

stats | grep open_file_descriptors
"open_file_descriptors" : 48827,
"open_file_descriptors" : 60585,
"open_file_descriptors" : 65211,
"open_file_descriptors" : 420,
"open_file_descriptors" : 64604,
"open_file_descriptors" : 63084,
"open_file_descriptors" : 64135,
"open_file_descriptors" : 453,
"open_file_descriptors" : 62657,
"open_file_descriptors" : 63350,
"open_file_descriptors" : 417,
"open_file_descriptors" : 64252,
The 420, 453 and 417 values are from the master nodes.
Please advise.


(David Pilato) #4

You are very close to the 65k limit.
You might have too many shards in your cluster.

How many do you have?
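
A quick way to see the totals is the cluster health or cat APIs, e.g. (assuming localhost:9200; each line of _cat/shards is one shard copy):

curl -s 'localhost:9200/_cluster/health?pretty'
curl -s 'localhost:9200/_cat/shards' | wc -l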


(Michael McCandless) #5

Did you change any low-level index settings, e.g. for compound file format?
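
If you want to double-check, dumping the settings of one of the affected indices will show whether anything like that has been overridden (index name taken from the log message above; adjust the host as needed):

curl -s 'localhost:9200/abc-2015.10.13/_settings?pretty'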


#6

{
"cluster_name" : "abcmedia",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 12,
"number_of_data_nodes" : 9,
"active_primary_shards" : 4377,
"active_shards" : 8754,
"relocating_shards" : 2,
"initializing_shards" : 0,
"unassigned_shards" : 0
}


#7

No, we have not made any configuration changes.


(Mark Walkom) #8

That's a massive number of shards for that many nodes; you need to reduce it.


#9

Do you mean reduce the number of shards or increase the number of nodes?


(Ivan Brusic) #10

Reduce the number of shards. Each shard is basically a separate Lucene index, so each one will consume file resources.

Ivan
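
For daily indices like abc-2015.10.13, the usual way to do that is to lower index.number_of_shards in an index template so that newly created indices get fewer shards. A sketch, with an example template name and shard count (it only affects indices created after the template is in place; existing indices would need to be reindexed or deleted):

curl -XPUT 'localhost:9200/_template/abc_daily' -d '{
  "template": "abc-*",
  "settings": {
    "index.number_of_shards": 3,
    "index.number_of_replicas": 1
  }
}'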


(system) #11