Elasticsearch 2.3.3 encountered an OutOfMemoryError

Hi,
We run 4 ELK nodes in a cluster using JRE 1.8.0_77, Logstash 2.3.2, Elasticsearch 2.3.2, and Kibana 4.5.1, and in total more than three hundred client servers transfer Linux logs, Windows event logs, and IIS logs to the ELK cluster.
The following is our architecture and H/W configuration:

But now we have encountered a critical error. Logstash on the ELK01 host receives and processes the logs from the client servers. Every few days, elasticsearch.log records an "OutOfMemory" exception and the node drops out of the cluster; at the same time I cannot log in to the OS remotely over SSH, so I have to force a reboot.
Could anybody help fix it? Thanks.

```
[2016-07-25 01:22:35,370][WARN ][index.engine ] [elk04] [it_p5sfcs_iislog-2016.07.24][0] failed engine [refresh failed]
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Unknown Source)
at org.apache.lucene.index.ConcurrentMergeScheduler.merge(ConcurrentMergeScheduler.java:517)
at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1931)
at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:455)
at org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:286)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:261)
at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:251)
at org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:104)
at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:137)
at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:154)
at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
at org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
at org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
at org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:672)
at org.elasticsearch.index.shard.IndexShard.refresh(IndexShard.java:661)
at org.elasticsearch.index.shard.IndexShard$EngineRefresher$1.run(IndexShard.java:1349)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
```
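For reference, "unable to create new native thread" usually means the JVM hit the OS per-user process limit (threads count against `ulimit -u` on Linux) rather than exhausting the heap. A sketch of how to check and raise that limit, assuming Elasticsearch runs as an `elasticsearch` service user:

```
# Show the max-user-processes limit for the user running Elasticsearch
# ("elasticsearch" is an assumption -- substitute your service user)
sudo -u elasticsearch bash -c 'ulimit -u'

# Raise it persistently in /etc/security/limits.conf, for example:
#   elasticsearch  soft  nproc  4096
#   elasticsearch  hard  nproc  4096
```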

How much data is in your cluster?

@warkolm
We configured number_of_replicas to 1, so in total it is about 50GB per day.

Ok, how many all together though?
How many indices, how many shards?

@warkolm
Every day we create 25 indices (each index has 5 primary shards and 5 replica shards), and we keep the past month of history indices.
So now we have 787 indices and 7872 shards in our cluster.

I'd say you are massively oversharded and that is creating heap pressure.
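To put rough numbers on it: 25 new indices a day × (5 primary + 5 replica) shards = 250 new shards per day, and at ~50GB of data a day that averages out to roughly 200MB per shard. Every shard is a full Lucene index with its own overhead (merge threads, file handles, heap for segment metadata), so thousands of tiny shards add up fast.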

@warkolm
Based on our ELK cluster H/W, could you advise how many indices and shards we should have?
Or how can I determine the optimum number of indices and shards?

ELK04 : HP DL580 Gen5
ELK01/02/03: HP DL380 Gen5

Aim for shard size <50GB.
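You can see where you stand today with the cat APIs (assuming a node is reachable on localhost:9200):

```
# List every shard with its on-disk size (the "store" column)
curl -s 'localhost:9200/_cat/shards?v'

# Per-index totals
curl -s 'localhost:9200/_cat/indices?v'
```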

@warkolm

  1. Is it right that the size of one index (5 primary shards + 5 replica shards) should be less than 500GB?
  2. Can I configure the Logstash output plugin to create indices by week? If yes, could you give me the date pattern?

Sounds about right.
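(5 primaries + 5 replicas = 10 shards, and 10 × 50GB = 500GB for the index as a whole.)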

Consult the documentation.
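For the weekly pattern, something like this in the elasticsearch output should do it (an untested sketch; the index prefix and hosts value are examples from this thread, and `%{+xxxx.ww}` is Joda-Time weekyear.week-of-weekyear):

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # xxxx = Joda-Time weekyear, ww = week of weekyear,
    # so a new index is created once per week instead of once per day
    index => "it_p5sfcs_iislog-%{+xxxx.ww}"
  }
}
```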

@warkolm @DiscussBuster
Thank you very much! I will try to modify the Logstash configuration to create indices by week to reduce the number of shards, then check the effect. :grinning: