Too many files open


(mkleen) #1

Hello,

I am running Elasticsearch on CentOS and got the following error
after running Elasticsearch for a couple of days. At the moment I have
a single instance of Elasticsearch running. How can I avoid having
too many files open? Is my setup wrong?

failed recovery]; nested: EngineCreationFailureException[[4e47da12d50c1f1bceeeb795][1] Failed to open reader on writer]; nested: FileNotFoundException[/var/lib/elasticsearch/test/nodes/0/indices/4e47da12d50c1f1bceeeb795/1/index/segments_1 (Too many open files)]; ]]
[2011-08-15 06:02:01,740][WARN ][indices.cluster ] [Flygirl] [4e47da12d50c1f1bceeeb795][0] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [4e47da12d50c1f1bceeeb795][0] failed recovery
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:229)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.elasticsearch.index.engine.EngineCreationFailureException: [4e47da12d50c1f1bceeeb795][0] Failed to create engine
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:251)
    at org.elasticsearch.index.shard.service.InternalIndexShard.start(InternalIndexShard.java:254)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:146)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:179)
    ... 3 more
Caused by: java.io.FileNotFoundException: /var/lib/elasticsearch/test/nodes/0/indices/4e47da12d50c1f1bceeeb795/0/index/segments_1 (Too many open files)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
    at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
    at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91)
    at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)
    at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345)
    at org.elasticsearch.index.store.support.AbstractStore$StoreDirectory.openInput(AbstractStore.java:356)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:262)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:359)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:750)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:589)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:355)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1144)
    at org.elasticsearch.index.engine.robin.RobinEngine.createWriter(RobinEngine.java:1242)
    at org.elasticsearch.index.engine.robin.RobinEngine.start(RobinEngine.java:249)
    ... 6 more

My configuration is as follows:

# Cluster Settings
cluster:
  name: search

# Server Address
#network :
#   host : 10.0.0.4

# Paths
#path:
#  logs: /var/log/elasticsearch
#  data: /var/data/elasticsearch

# Gateway Settings
#gateway:
#  recover_after_nodes: 1
#  recover_after_time: 5m
#  expected_nodes: 2

# Index Settings
index :
  number_of_shards : 2
  number_of_replicas : 1
  analysis :
    analyzer :
      filename :
        tokenizer : letter
        filter : [standard, lowercase, autocomplete]
      default :
        tokenizer : standard
        filter : [standard, lowercase, autocomplete]
    filter :
      autocomplete :
        type : edgeNGram
        min_gram : 3
        max_gram : 15
        side : front

I also start Elasticsearch from my init.d script with ulimit -n
20000, so that it may have at most 20000 files open.

start() {
echo -n $"Starting ${NAME}: "
ulimit -n 20000

}
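
To confirm that a limit set this way actually reached the running process, one option on Linux is to read /proc/<pid>/limits and count the entries in /proc/<pid>/fd. The sketch below assumes a Linux /proc filesystem and permission to read the target process's entries; the helper names are made up for illustration.

```python
import os


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of None stands for 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units.
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def read_limits(pid):
    """Read the limits table of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return fh.read()


def count_open_fds(pid):
    """Count the descriptors currently held by a process (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))
```

If the ulimit call took effect, parse_max_open_files(read_limits(pid)) should report 20000 for the Elasticsearch process, and count_open_fds(pid) shows how close it currently is to that ceiling.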

Many Thanks,

Michael


(Shay Banon) #2

Which version are you using? The first thing I would do is check the
actual number of open files for the process; the nodes info API
provides that in 0.17.


(mkleen) #3

I'm using 0.17.2 at the moment. The nodes info API only returns the
maximum number of file descriptors, not the current count. Or am I
calling the wrong URL?

curl -XGET 'http://localhost:9200/_cluster/nodes'

{
  "cluster_name":"search",
  "nodes":{
    "jbSkF0O3THqAo0i-h0YYdw":{
      "name":"Lin Sun",
      "transport_address":"inet[/10.58.199.249:9300]",
      "attributes":{},
      "http_address":"inet[/10.58.199.249:9200]",
      "os":{
        "refresh_interval":5000
      },
      "process":{
        "refresh_interval":5000,
        "id":20531,
        "max_file_descriptors":20000
      },
      "jvm":{
        "pid":20531,
        "version":"1.6.0_20",
        "vm_name":"OpenJDK 64-Bit Server VM",
        "vm_version":"19.0-b09",
        "vm_vendor":"Sun Microsystems Inc.",
        "start_time":1313402828181,
        "mem":{
          "heap_init":"256mb",
          "heap_init_in_bytes":268435456,
          "heap_max":"1019.8mb",
          "heap_max_in_bytes":1069416448,
          "non_heap_init":"23.1mb",
          "non_heap_init_in_bytes":24313856,
          "non_heap_max":"214mb",
          "non_heap_max_in_bytes":224395264
        }
      },
      "network":{
        "refresh_interval":5000
      },
      "transport":{
        "bound_address":"inet[/0:0:0:0:0:0:0:0:9300]",
        "publish_address":"inet[/10.58.199.249:9300]"
      }
    }
  }
}
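
The response above reports process.max_file_descriptors as 20000, which matches the ulimit value set in the init script. A small sketch for pulling that field out per node, using only field names visible in the response; the function name is made up for illustration.

```python
import json


def max_fds_by_node(node_info):
    """Map each node's name to the max_file_descriptors it reports."""
    return {
        node["name"]: node["process"]["max_file_descriptors"]
        for node in node_info["nodes"].values()
    }


# Trimmed copy of the nodes info response shown above.
response = json.loads("""
{
  "cluster_name": "search",
  "nodes": {
    "jbSkF0O3THqAo0i-h0YYdw": {
      "name": "Lin Sun",
      "process": {"refresh_interval": 5000, "id": 20531, "max_file_descriptors": 20000}
    }
  }
}
""")

print(max_fds_by_node(response))  # {'Lin Sun': 20000}
```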


(Shay Banon) #4

To get the current number of open files, use the node stats API. The idea is
that the node info API gives you the static information of the node, and
stats gives you the values and statistics that can change.
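
As a rough sketch of that check, the snippet below reads the current descriptor count from a stats document and relates it to the limit. It assumes the stats endpoint is /_cluster/nodes/stats and that each node entry carries process.open_file_descriptors; both are assumptions about the 0.17-era API and are not taken from this thread, so verify them against your version.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def open_fds_by_node(node_stats):
    """Map node name -> current open descriptor count.

    Assumes each node entry has process.open_file_descriptors (unverified).
    """
    return {
        node["name"]: node["process"]["open_file_descriptors"]
        for node in node_stats["nodes"].values()
    }


def usage_ratio(open_fds, max_fds):
    """Fraction of the descriptor limit currently in use."""
    return open_fds / max_fds
```

Calling open_fds_by_node(fetch_json("http://localhost:9200/_cluster/nodes/stats")) and passing each count to usage_ratio together with the configured limit shows how close a node is to running out of descriptors.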


(mkleen) #5

OK, great. So what would be a reasonable number of open files? Last
time I had more than 10,000 files open on a single instance. Should I
use several instances instead?


(Shay Banon) #6

How many shards do you have on that single instance? It depends mainly on
that (each shard is a Lucene index).
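
To make the dependency concrete: a Lucene index is made of segments, and each segment is stored as several files, so the number of open files grows roughly with shards times segments times files per segment. The sketch below is a back-of-the-envelope model only; the segment and file counts are made-up placeholders, not measurements, and sockets also consume descriptors.

```python
def estimate_open_files(shards_on_node, segments_per_shard, files_per_segment):
    """Rough estimate of the index files a node may hold open."""
    return shards_on_node * segments_per_shard * files_per_segment


# Illustrative numbers only: 2 shards (the number_of_shards value from the
# configuration in this thread, for a single index), with a made-up
# 10 segments per shard and 8 files per segment.
print(estimate_open_files(2, 10, 8))  # 160
```

With the same made-up figures, 100 such indices on one node would come to estimate_open_files(200, 10, 8), which is 16000, so the count of indices and shards per node matters far more than any single setting.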

