We got an OOM error earlier this week, and it looks like some resource other than memory may be the problem. These boxes have 64GB of RAM, with 30GB allocated to the Java heap. At the time of the error, all nodes were using around 30%-35% of the JVM's memory.
All nodes reported a 65k limit on file descriptors, both before and after the error and the cluster restart.
We are using the JDBC river plugin (yes, for now anyway), and a river run seems to be what triggers the error. (jdbc-1.5.0.5-da4ba96 1.5.0.5)
Any clues, hints, or suggestions are appreciated.
Thanks
-Doug
[2015-10-19 05:10:07,722][WARN ][index.engine ] [es4] [sales][1] failed engine [out of memory (source: [maybe_merge])]
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
    at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:391)
    at org.elasticsearch.index.merge.EnableMergeScheduler.merge(EnableMergeScheduler.java:50)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1985)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1979)
    at org.elasticsearch.index.engine.InternalEngine.maybeMerge(InternalEngine.java:778)
    at org.elasticsearch.index.shard.IndexShard$EngineMerger$1.run(IndexShard.java:1241)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
GET /_nodes/process:
{
  "cluster_name": "es_prod",
  "nodes": {
    "uvieBMr3SXKu7BvKuWEJmQ": {
      "name": "es4",
      "version": "1.7.0",
      "build": "929b973",
      "http_address": "inet[/12.130.11.49:9200]",
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 13702,
        "max_file_descriptors": 65535,
        "mlockall": true
      }
    },
    "xnOltXOsS5eY7VAotvYiSg": {
      "name": "es2",
      "version": "1.7.0",
      "build": "929b973",
      "http_address": "inet[/12.130.11.47:9200]",
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 43023,
        "max_file_descriptors": 65535,
        "mlockall": true
      }
    },
    "imfqR95jSKOhErEN8nQk3w": {
      "name": "es3",
      "version": "1.7.0",
      "build": "929b973",
      "http_address": "inet[/12.130.11.48:9200]",
      "process": {
        "refresh_interval_in_millis": 1000,
        "id": 26569,
        "max_file_descriptors": 65535,
        "mlockall": true
      }
    }
  }
}
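
The process info above only reports the file-descriptor limit, while "unable to create new native thread" usually points at a thread/process limit or at native memory for thread stacks rather than at file descriptors. A check along the following lines might help narrow that down. This is only a rough sketch, assuming Linux hosts with /proc available; the function names (parse_thread_count, parse_max_processes, thread_headroom) are made up for illustration and are not part of Elasticsearch or any plugin.

```python
import re
from pathlib import Path


def parse_thread_count(status_text):
    """Return the 'Threads:' value from /proc/<pid>/status text, or None."""
    match = re.search(r"^Threads:\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def parse_max_processes(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max processes\s+(\S+)", limits_text, re.MULTILINE)
    if not match or match.group(1) == "unlimited":
        return None
    return int(match.group(1))


def thread_headroom(pid):
    """Read (current thread count, soft max-processes limit) for one pid.

    Assumption: run on the Elasticsearch host itself, as a user that is
    allowed to read /proc/<pid>/status and /proc/<pid>/limits.
    """
    proc = Path("/proc") / str(pid)
    threads = parse_thread_count((proc / "status").read_text())
    limit = parse_max_processes((proc / "limits").read_text())
    return threads, limit
```

On es4, for example, this could be called as thread_headroom(13702), using the process id from the output above. Note that the "Max processes" limit is counted per user across all of that user's processes and threads, so the thread count of the Elasticsearch process alone is only a lower bound on what counts against it.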