OutOfMemoryError on adequately sized cluster


(None) #1

Hi, running 6.2.4

Cluster topology:

  • 3 master nodes
  • 2 ingest nodes
  • 1 coordinator node
  • 3 data nodes

The masters and the ingest nodes borked with OutOfMemoryError; the data nodes seem to have survived.

Master (each): 16GB of which ES_JAVA_OPTS="-Xms7g -Xmx7g"
Ingest (each): 4GB of which ES_JAVA_OPTS="-Xms3g -Xmx3g"
Coordinator (each): 64GB of which ES_JAVA_OPTS="-Xms30g -Xmx30g"
Data (each): 64GB of which ES_JAVA_OPTS="-Xms30g -Xmx30g"

All nodes running the Ubuntu OpenJDK:
openjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-8u162-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)

All nodes have
MAX_OPEN_FILES=65536
MAX_LOCKED_MEMORY=unlimited
MAX_MAP_COUNT=262144

Master node before the crash:

Ingest node before the crash:

Master log: https://www.dropbox.com/s/vro7yls5mmfmu0u/master.log?dl=0
Ingest Log: https://www.dropbox.com/s/6trmcp5r0beulxa/ingest.log?dl=0


(Adrien Grand) #2

The fact that master and ingest nodes have the problem but not data nodes suggests that the issue might be related to the size of your cluster state. Maybe you have many indexes / shards / fields? What is the size of the output of GET /_cluster/state?
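For anyone following along, the state can be dumped like this (a sketch, assuming you can reach a node over HTTP; works in Kibana Dev Tools or via the curl equivalent):

```
GET /_cluster/state
```

Piping the curl equivalent through `wc -c` gives the size in bytes.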


(None) #3

@jpountz

Hi, 800 indexes and 8,000 shards, give or take. The cluster state is about 720KB.

I should also add it's daily indexes, but most are small.

Out of the 800...

  • 60 are between 1 to 3 million documents.
  • 100 are between 100K to 900K documents.
  • The rest are below 100K documents.
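If it helps to double-check those counts, the _cat APIs summarize them per index and per shard (a sketch; works in Kibana Dev Tools or via curl):

```
GET /_cat/indices?v&s=docs.count:desc
GET /_cat/shards?v
```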

(Mark Walkom) #4

You have too many shards, look to use _shrink on older ones and reduce the count or switch to weekly/monthly indices.
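For reference, a _shrink of a hypothetical daily index would look something like this (the index and node names are made up; before shrinking, the index must be made read-only and all its shards relocated onto a single node):

```
PUT /logs-2018.01.05/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "data-node-1",
    "index.blocks.write": true
  }
}

POST /logs-2018.01.05/_shrink/logs-2018.01.05-shrunk
{
  "settings": {
    "index.number_of_shards": 1
  }
}
```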


(None) #5

And if I want to keep that many, say daily, but for a maximum of a year? Do I just increase the master RAM to 30GB or add more nodes?

Can I have older indexes as monthly and the newer ones as daily? How will the date math work if we can do this?


(Mark Walkom) #6

Given your index sizes, keeping a year's worth of data (plus the current month) as daily indices isn't going to be worth it with that shard count.


(None) #7

So can I take old ones make them monthly and new ones daily? Will Kibana date math work on both?


(Christian Dahlqvist) #8

Switching from daily to monthly requires reindexing, so it is better if you switch to monthly indices for all data. Kibana does not use date math based on index names any longer, so that is not a problem.
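For reference, rolling a month of hypothetical daily indices into one monthly index could look like this (the names are made up; verify document counts before deleting the dailies):

```
POST /_reindex
{
  "source": { "index": "logs-2018.05.*" },
  "dest":   { "index": "logs-2018.05" }
}
```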


(None) #9

Ok. Thanks.


(None) #10

Ok, I will reindex, but obviously I have to phase it. So I will convert the older ones to monthly first, and then the newer ones eventually.


(None) #11

On monthly indexes we can't do daily backups though, can we?


(Mark Walkom) #12

Sure you can.
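For example, assuming a snapshot repository named `my_backup` has already been registered (the repo and index names here are hypothetical), a daily snapshot is just:

```
PUT /_snapshot/my_backup/snapshot-2018.06.01?wait_for_completion=true
{
  "indices": "logs-*",
  "include_global_state": false
}
```

Snapshots are incremental per repository, so snapshotting the same monthly indices every day only copies segments that are new since the last snapshot.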


(None) #13

@warkolm

Cool, so reading the docs... snapshots only store changed files, correct? So if a monthly index has NOT changed in 31 days and we took 31 snapshots, the snapshot repo would remain the same size and NOT have grown, right?


(Mark Walkom) #14

More than likely. You can reduce the chance by running a force merge once they are no longer being written to.
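For reference, a force merge on a hypothetical monthly index that is no longer being written to (merging to a single segment keeps the file set stable, so subsequent snapshots have nothing new to copy):

```
POST /logs-2018.05/_forcemerge?max_num_segments=1
```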


(None) #15

Hi, so far it seems stable, thanks! I thought people were running with many more indexes?


(Mark Walkom) #16

Forget about the number of indices, it's the shards that matter.


(None) #17

Ok, cool, gotcha. So 1 index with 5 shards is the same as 5 indexes with 1 shard each, and each shard is a Lucene index which takes up X amount of resources.


(Mark Walkom) #18

Exactly.


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.