How to rightsize the heap, long GC, ES 2.4


(Kim Kruse Hansen) #1

Hi There

I have recently deployed two separate ES 2.4 clusters, two nodes in each cluster, running on Windows 2012 R2 Standard as virtual machines on VMware, each with 12 cores, 24 GB RAM and 1 TB disk. I am using Java 1.8, update 102.

The clusters are index heavy, receiving around 300-400 million entries per day. Logs are shipped from Filebeat to Logstash, which sends them to ES. The indexing rate is around 3,000-7,000 documents per second.
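For context, the Filebeat side is nothing special, just a minimal config along these lines (the paths and hostname below are placeholders, not my exact setup):

filebeat:
  prospectors:
    - input_type: log
      paths:
        - D:\logs\dns\*.log
output:
  logstash:
    hosts: ["logstash01:5044"]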

I have reduced the shard count from 5 to 2 for my index, and I am creating a new index every day.
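The shard count is applied through a daily index template, roughly like this (the template name and index pattern here are just illustrative):

curl -XPUT "http://localhost:9200/_template/dns_daily" -d '{
  "template": "dns-*",
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}'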

My main question is: what is the correct approach to determining the ES heap size?
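From my googling, the usual guidance seems to be to give ES about half of the machine's RAM (leaving the rest to the filesystem cache) and to stay below roughly 30 GB so compressed oops remain enabled. On ES 2.x that means setting the ES_HEAP_SIZE environment variable; for a Windows service install it is picked up when the service is (re)installed, so something like this (exact steps depend on how the service was set up):

set ES_HEAP_SIZE=12g
bin\service.bat remove
bin\service.bat install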

I started out with a heap size of around 8 GB. That let the cluster run for approximately one day before it ran into a long old-gen GC, which took several minutes, and I got error messages that the master had left.

I am now at a heap size of 16 GB, which lets it run for a few days before again being hit by a long old-gen GC. Last night I saw an old-gen collection take a whopping 42 minutes. Heap usage slowly increases over time, like some kind of memory leak.

I then wrote a script that monitors the heap and gracefully restarts the node when usage reaches 90% or more, but I don't think this is the correct approach.
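The check itself just polls heap_used_percent from the nodes stats API, roughly like this (localhost:9200 being the node the script runs on):

curl -s "http://localhost:9200/_nodes/_local/stats/jvm?filter_path=nodes.*.jvm.mem.heap_used_percent"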

I would like to think that ES is mature enough not to suffer from this kind of problem.

So my question: what is the correct way of sizing the heap to avoid long GC pauses? What other settings should I be concerned about for an index-heavy cluster? I have done a lot of googling of course, but I need a little help :wink:

We would like to add more logs and more nodes to the clusters, but I need them to be more stable before that happens.

Regards


(Mark Walkom) #2

How much data is in the cluster, how many indices, how many shards?
What do your mappings look like? Do you have lots of fielddata taking up heap?


(Kim Kruse Hansen) #3

Hi Mark

I have two daily indices: one for my DNS logs and one for winlogbeat, i.e. Windows event logs.

The DNS index is around 300 million documents per day, 100 GB, 2 shards, 1 replica; the winlogbeat index is around 25 million documents, 40 GB, 5 shards, 1 replica.

I prune all indices older than 8 days.
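For context, that works out to 2 × 2 = 4 shards per day for DNS (primaries plus replicas) and 5 × 2 = 10 per day for winlogbeat, so with 8 days of retention there are roughly 14 × 8 = 112 shards spread over the two nodes.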

If I monitor with Marvel, the fielddata graph is consistently at 0.
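For what it's worth, the same can be checked outside Marvel with the cat API:

curl "http://localhost:9200/_cat/fielddata?v"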

Regards


(Mark Walkom) #4

What sort of queries are you doing against it?


(Kim Kruse Hansen) #5

Hi

Just very simple queries through Kibana 4.6. Nothing fancy.

Like who queried a given DNS domain, or which computers have a given event_id? Very basic stuff.

And very few queries throughout the day. All of them respond quickly and smoothly.

Regards


(Christian Dahlqvist) #6

What does your node configuration look like?


(Kim Kruse Hansen) #7

Hi

What exactly would you like to know?

Regards


(Christian Dahlqvist) #8

Anything that is not default.


(Kim Kruse Hansen) #9

Anything that is not default ...

Well, I have changed Logstash parameters to increase indexing performance.

-w 24 -b 3048

(that is, 24 pipeline workers and a batch size of 3048)

I have increased the output parameters in logstash.json:

workers = 2, flush_size = 3048
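In the config file that corresponds to an elasticsearch output roughly like this (the hosts below are placeholders):

output {
  elasticsearch {
    hosts => ["node1:9200", "node2:9200"]
    workers => 2
    flush_size => 3048
  }
}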

And these are the non-default settings from elasticsearch.yml:

# Indexing settings for writes
index.refresh_interval: 30s

path.data: d:\data
path.logs: c:\logs

# Recovery and rebalancing
cluster.routing.allocation.cluster_concurrent_rebalance: 4
cluster.routing.allocation.node_initial_primaries_recoveries: 4
cluster.routing.allocation.node_concurrent_recoveries: 6
indices.recovery.max_bytes_per_sec: 125mb
indices.recovery.concurrent_streams: 5

# More heap for the indexing buffer (default is 10%)
indices.memory.index_buffer_size: 30%


(Kim Kruse Hansen) #10

Hi

We had another long GC, which caused the cluster to fail.

[2016-09-19 10:24:52,142][WARN ][monitor.jvm ] [node1] [gc][young][111431][3834] duration [2.9s], collections [1]/[3.5s], total [2.9s]/[3.2m], memory [13.7gb]->[9.3gb]/[19.3gb], all_pools {[young] [4.9gb]->[108.1mb]/[5.3gb]}{[survivor] [606.8mb]->[682.6mb]/[682.6mb]}{[old] [8.1gb]->[8.6gb]/[13.3gb]}
[2016-09-19 10:25:04,680][INFO ][monitor.jvm ] [node1] [gc][young][111443][3835] duration [908ms], collections [1]/[1.2s], total [908ms]/[3.2m], memory [14.5gb]->[9.7gb]/[19.3gb], all_pools {[young] [5.2gb]->[113mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [8.6gb]->[8.9gb]/[13.3gb]}
[2016-09-19 10:25:21,951][WARN ][monitor.jvm ] [node1] [gc][young][111457][3836] duration [2.9s], collections [1]/[4s], total [2.9s]/[3.3m], memory [14.2gb]->[10.5gb]/[19.3gb], all_pools {[young] [4.5gb]->[87.1mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [8.9gb]->[9.8gb]/[13.3gb]}
[2016-09-19 10:25:35,434][WARN ][monitor.jvm ] [node1] [gc][young][111469][3837] duration [1.8s], collections [1]/[2.3s], total [1.8s]/[3.3m], memory [15.5gb]->[11gb]/[19.3gb], all_pools {[young] [5gb]->[41.2mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [9.8gb]->[10.3gb]/[13.3gb]}
[2016-09-19 10:25:49,194][WARN ][monitor.jvm ] [node1] [gc][young][111481][3838] duration [2.3s], collections [1]/[2.5s], total [2.3s]/[3.3m], memory [16.2gb]->[12gb]/[19.3gb], all_pools {[young] [5.2gb]->[34.1mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [10.3gb]->[11.3gb]/[13.3gb]}
[2016-09-19 10:26:01,727][WARN ][monitor.jvm ] [node1] [gc][young][111491][3839] duration [2.2s], collections [1]/[3.1s], total [2.2s]/[3.4m], memory [16.4gb]->[12.6gb]/[19.3gb], all_pools {[young] [4.4gb]->[35mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [11.3gb]->[11.9gb]/[13.3gb]}
[2016-09-19 10:26:13,914][INFO ][monitor.jvm ] [node1] [gc][young][111503][3840] duration [762ms], collections [1]/[1s], total [762ms]/[3.4m], memory [17.8gb]->[12.9gb]/[19.3gb], all_pools {[young] [5.1gb]->[33.4mb]/[5.3gb]}{[survivor] [682.6mb]->[682.6mb]/[682.6mb]}{[old] [11.9gb]->[12.2gb]/[13.3gb]}
[2016-09-19 10:26:27,011][WARN ][monitor.jvm ] [node1] [gc][young][111515][3841] duration [1.1s], collections [1]/[1.9s], total [1.1s]/[3.4m], memory [17.7gb]->[13.2gb]/[19.3gb], all_pools {[young] [4.7gb]->[2.8mb]/[5.3gb]}{[survivor] [682.6mb]->[655.2mb]/[682.6mb]}{[old] [12.2gb]->[12.5gb]/[13.3gb]}
[2016-09-19 10:32:45,761][WARN ][monitor.jvm ] [node1] [gc][old][111536][3] duration [5.9m], collections [1]/[5.9m], total [5.9m]/[5.9m], memory [18.5gb]->[5gb]/[19.3gb], all_pools {[young] [5.2gb]->[108.8mb]/[5.3gb]}{[survivor] [655.2mb]->[0b]/[682.6mb]}{[old] [12.5gb]->[4.9gb]/[13.3gb]}

This time the old-gen collection took 5.9 minutes; the last log line shows the heap dropping from 18.5 GB to 5 GB out of 19.3 GB, so it was nearly full when the collection finally ran.

Regards


(Christian Dahlqvist) #11

That is indeed a very long GC. Do you perhaps have swap enabled or is the host overcommitted with respect to memory?


(Kim Kruse Hansen) #12

Hi

The same behaviour occurs on 2 separate ES clusters.

The first cluster has 24 GB RAM on each node and the second cluster has 32 GB.

I have changed the configuration today.

  1. bootstrap.mlockall: true
  2. Memory reservation on VMware

So I need to wait and see what effect these settings have, if any...
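One thing I can verify right away is whether mlockall actually took effect, via the nodes info API:

curl "http://localhost:9200/_nodes/_local?filter_path=**.mlockall"

It should report mlockall: true once the setting is active.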

I can provide a JVM heap dump if anyone wants to take a look.

Regards


(Mark Walkom) #13

Having the hypervisor balloon memory away from the VM is bad.


(Kim Kruse Hansen) #14

Well, I found the root cause of this problem.

It turned out to be the garbage collector used by ES 2.4.0 on Windows. I installed the new 2.4.1 update on my clusters and finally see the expected sawtooth pattern in heap usage: it climbs to around 75% and then drops, as expected.
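For anyone else hitting this: as far as I can tell, which collectors the JVM actually ended up with can be seen in the nodes info API, e.g.

curl "http://localhost:9200/_nodes/_local/jvm?filter_path=**.gc_collectors"

With CMS enabled this lists ParNew and ConcurrentMarkSweep; with the serial collector it shows Copy and MarkSweepCompact instead.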

I have to admit that this bug is so bad and so embarrassing for the quality of ES on Windows that I am very surprised it could go unnoticed for so long.

I would like some comment from Elastic on how such a fatal bug was able to pass your quality tests. Don't you have any Windows clusters running ES? Why is ES able to run at all with the serial garbage collector, without a giant warning in the logs? Better yet, it should refuse to run at all with the serial garbage collector enabled.

Regards


(Clinton Gormley) #15

Why is ES able to run at all with the serial garbage collector, without a giant warning in the logs? Better yet, it should refuse to run at all with the serial garbage collector enabled.

We agree. See https://github.com/elastic/elasticsearch/pull/20558

I would like some comment from Elastic on how such a fatal bug was able to pass your quality tests.

Oversight. If you configure a particular garbage collector in your application, do you have a check to ensure that the JVM is actually using that collector? Probably not. It turns out we should. Live and learn.

