Too many open files


(onthefloorr) #1

Hi guys,
I have a little problem with Elasticsearch.
Even though I set the open-files limit on the system as described on the site
http://www.elasticsearch.org/tutorials/too-many-open-files/ , I still get
the error "too many open files" from Elasticsearch.

Here is the configuration :

cluster.name: xxxxxxx
index.cache.field.type: soft
index.cache.field.max_size: 10000
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 20mb
indices.memory.index_buffer_size: 10%
index.refresh_interval: 30
index.translog.flush_threshold_ops: 20000
index.store.compress.stored: true
threadpool.search.type: fixed
threadpool.search.size: 4
threadpool.search.queue_size: 30
threadpool.bulk.type: fixed
threadpool.bulk.size: 4
threadpool.bulk.queue_size: 30
threadpool.index.type: fixed
threadpool.index.size: 4
threadpool.index.queue_size: 30
thrift.port: 9500
indices.recovery.concurrent_streams: 4
indices.recovery.max_bytes_per_sec: 40mb
cluster.routing.allocation.cluster_concurrent_rebalance: 20
indices.cache.filter.size: 100mb


Open files limit :
elasticsearch@xxxx:/$ ulimit -Sn
1000000
elasticsearch@xxxx:/$ ulimit -Hn
1000000
elasticsearch@xxxx:/$


Here is the log file :

I started Elasticsearch with the option es.max-open-files=true, so the effective limit is logged at startup.

[2013-11-07 15:01:22,538][INFO ][bootstrap ]
max_open_files [65510]
[2013-11-07 15:01:23,131][INFO ][node ] [Wiz Kid]
version[0.90.5], pid[15414], build[c8714e8/2013-09-17T12:50:20Z]
[2013-11-07 15:01:23,131][INFO ][node ] [Wiz Kid]
initializing ...
[2013-11-07 15:01:23,518][INFO ][plugins ] [Wiz Kid]
loaded [transport-thrift], sites [HQ, head]
[2013-11-07 15:02:31,792][INFO ][node ] [Wiz Kid]
initialized
[2013-11-07 15:02:31,894][INFO ][node ] [Wiz Kid]
starting ...
[2013-11-07 15:02:31,911][INFO ][thrift ] [Wiz Kid]
bound on port [9500]
[2013-11-07 15:02:32,100][INFO ][transport ] [Wiz Kid]
bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address
{inet[/172.31.11.162:9300]}
[2013-11-07 15:02:35,129][INFO ][cluster.service ] [Wiz Kid]
new_master [Wiz Kid][ftUK_RPoSAqTfFEDYMFGuw][inet[/172.31.11.162:9300]],
reason: zen-disco-join (elected_as_master)
[2013-11-07 15:02:35,141][INFO ][discovery ] [Wiz Kid]
elasticsearch_logs/ftUK_RPoSAqTfFEDYMFGuw
[2013-11-07 15:02:35,228][INFO ][http ] [Wiz Kid]
bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address
{inet[/172.31.11.162:9200]}
[2013-11-07 15:02:35,228][INFO ][node ] [Wiz Kid]
started
[2013-11-07 15:02:55,523][INFO ][gateway ] [Wiz Kid]
recovered [208] indices into cluster_state


Exception:

[2013-11-07 15:00:23,343][DEBUG][action.bulk ] [Darkdevil] [logs-2013-11-07][2] failed to execute bulk item (index) index {[logs-2013-11-07][logs][G6KBeJFeRoiwKaLXZUKq-g], source[{"message":"xxxxxxxxxx","host":null,"date":"2013-11-07T13:57:45.917Z"}]}
org.elasticsearch.index.engine.CreateFailedEngineException: [logs-2013-11-07][2] Create failed for [logs#G6KBeJFeRoiwKaLXZUKq-g]
    at org.elasticsearch.index.engine.robin.RobinEngine.create(RobinEngine.java:369)
    at org.elasticsearch.index.shard.service.InternalIndexShard.create(InternalIndexShard.java:331)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:402)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:155)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:533)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:418)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException: /var/lib/elasticsearch/elasticsearch_xxxxxxxxx/nodes/0/indices/logs-2013-11-07/2/index/_1kgc.fdt (Too many open files)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:466)
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:288)
    at org.apache.lucene.store.RateLimitedFSDirectory.createOutput(RateLimitedFSDirectory.java:41)
    at org.elasticsearch.index.store.Store$StoreDirectory.createOutput(Store.java:419)
    at org.elasticsearch.index.store.Store$StoreDirectory.createOutput(Store.java:409)
    at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:62)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsWriter.<init>(CompressingStoredFieldsWriter.java:109)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsWriter(CompressingStoredFieldsFormat.java:120)
    at org.apache.lucene.index.StoredFieldsProcessor.initFieldsWriter(StoredFieldsProcessor.java:88)
    at org.apache.lucene.index.StoredFieldsProcessor.finishDocument(StoredFieldsProcessor.java:120)
    at org.apache.lucene.index.TwoStoredFieldsConsumers.finishDocument(TwoStoredFieldsConsumers.java:65)
    at org.apache.lucene.index.DocFieldProcessor.finishDocument(DocFieldProcessor.java:264)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:283)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:432)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1513)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1188)
    at org.elasticsearch.index.engine.robin.RobinEngine.innerCreate(RobinEngine.java:470)
    at org.elasticsearch.index.engine.robin.RobinEngine.create(RobinEngine.java:364)
    ... 8 more


There are 1040 shards (5 shards per index).
Xms and Xmx of Elasticsearch were set to 4g.

The machine has 8 GB of memory and an Intel Xeon 5560 3 GHz 4-core CPU.

Could anyone advise me how to configure Elasticsearch?
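One thing worth checking (a sketch, not something verified here): the ulimit values above come from an interactive shell, but a daemon can be started with different limits. On Linux, /proc shows the limit a given process actually received. The commands below use the current shell's pid ($$) as a stand-in; substitute the ES JVM's pid:

```shell
# Show the open-files limit this process actually got; replace $$
# with the Elasticsearch JVM's pid to check the running node:
grep "Max open files" /proc/$$/limits

# Count the descriptors that process currently has open:
ls /proc/$$/fd | wc -l
```

If the first command for the ES pid prints 65535 while the shell's ulimit prints 1000000, the limit was simply never applied to the process that matters.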

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

It looks like you are running a single node? If so, you have too many shards for a single node; 1040 shards is unusually high. Note that each shard can consume around 100-200 file descriptors while active in the background, so it is quite possible you have exceeded 65510.

Check curl localhost:9200/_nodes/process?pretty or OS utilities for the
number of file descriptors used by the JVM.
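A rough back-of-the-envelope check of that estimate, assuming the 100-200 descriptors per active shard mentioned above:

```shell
# 1040 shards at 100-200 file descriptors each:
echo $((1040 * 100))   # lower bound: 104000
echo $((1040 * 200))   # upper bound: 208000
```

Both bounds are well above the 65510 limit logged at startup, so the error is expected with this shard count.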

Jörg



(田传武) #3

On Thu, Nov 7, 2013 at 10:36 PM, onthefloorr onthefloorr@gmail.com wrote:

[2013-11-07 15:01:22,538][INFO ][bootstrap ]
max_open_files [65510]

The settings didn't take effect; for reference, the configuration steps are below:



(onthefloorr) #4

Yes, I am using Elasticsearch on a single node, and it looks like the number
of file descriptors exceeded the maximum allowed on this machine :frowning:

The "_nodes/process?pretty" output says the maximum_file_descriptor is
65535. It seems the JVM can only use 65535 files at once; could I get it
to use more? If not, do I have to add a new node, or is there
another option?

Thank you in advance for your answer.



(Jörg Prante) #5

Check the "nofile" setting in /etc/security/limits.conf for the maximum
number of file descriptors. You can change it with superuser
privileges. I do not recommend a file descriptor hard limit higher than
the value in /proc/sys/fs/file-max.

Yes, you should add nodes if you want to manage a high number of
active/open shards.
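A sketch of that change (the account name "elasticsearch" and the 100000 value are assumptions; adjust them to the user that runs ES and the limit you need):

```shell
# Entries to add to /etc/security/limits.conf (edit as root),
# then log the user out and in again so pam_limits applies them:
#   elasticsearch  soft  nofile  100000
#   elasticsearch  hard  nofile  100000

# System-wide ceiling; do not set the hard limit above this value:
cat /proc/sys/fs/file-max
```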

Jörg



(onthefloorr) #6

Thank you for your answer. So then I will buy a new machine :slight_smile:



(David Pilato) #7

You could also set the number of shards to 1 instead of 5, I guess.
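A sketch of what that could look like for new daily indices (the index name is an assumption based on the "logs-2013-11-07" pattern in the thread; the settings API shown is the 0.90-era index-creation call):

```shell
# Create the next day's index with 1 shard instead of the default 5:
curl -XPUT 'localhost:9200/logs-2013-11-08' -d '{
  "settings": {
    "index.number_of_shards": 1
  }
}'
```

Note that the shard count of an existing index cannot be changed in place; this only helps indices created from now on.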

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs



(pellyadolfo) #8

Hi, my setup is quite slow with just a few thousand documents...

I realised that when I open a node client from my Java program to ES:

        Node node = NodeBuilder.nodeBuilder()
                .client(clientOnly)
                .data(!clientOnly)
                .local(local)
                .node();
        Client client = node.client();

even with such a small data set, ES automatically creates 10 sockets. Coincidentally, I have 10 shards (?).

  • Is this the expected behavior?
  • Can I reduce the number of ES shards dynamically to reduce the number of
    sockets, or do I need to redeploy my ES install?
  • By opening other connections I eventually get up to 200 simultaneous open
    sockets and, I am afraid, some of the results are randomly being lost when
    fetching highlight information. Could these missing results somehow be a
    consequence of too large a number of open sockets?

Thanks for your pointers.



(pellyadolfo) #9

I guess my problem with an excessive number of sockets could also be a
consequence of having 2 JVMs running ES, one embedded in Tomcat and a second
embedded in another Java app, as mentioned here:

https://groups.google.com/forum/?hl=en-GB#!topicsearchin/elasticsearch/scale|sort:date|spell:true/elasticsearch/m9IWpGzoLLE

Does anyone have experience running a single embedded ES (as jar files), for
example in Tomcat's lib folder, consumed by several Tomcat apps and
other standalone apps in different JVMs?

Any opinion on this configuration as a starting point?



(pellyadolfo) #10

Happily, the problem of missing highlight records seems to be gone after
a config change.

  • Initially I had 2 ES nodes in 2 different apps (one in Tomcat, one
    standalone) configured identically (both listening for incoming
    TransportClient requests on port 9300 and both opened with client(false)),
    and a third ES instance opened with new TransportClient() connecting to
    them to fetch highlighting info. It looks like this third instance was
    randomly losing highlighting records. (?)

  • What I did to fix it was a configuration change: only one
    client(false) ES node listening for TransportClients, and 2 new
    TransportClient()s connecting to it.

It looks like this change fixes the issue, which was some kind of coupling
between the two client(false) ES nodes listening on port 9300.



(system) #11