I am running a cluster with 20 9TB data nodes (32GB heap), 3 client nodes (24GB heap), and 3 master nodes (20GB heap). This cluster had been running stably under monitoring for the past 90 days. Yesterday, nowhere near a UTC rollover event (which would create new daily indices), we started seeing elevated GC time on the masters. The GCs are more frequent and take longer; here's a graph:
What can I do to debug this issue? In the past it has been a problem with out-of-control schema growth from poorly processed data, but here I'm not sure. How can I track down how memory is being used on the master?
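One way to keep an eye on heap and old-gen GC across the master-eligible nodes is to poll the nodes stats API. A rough sketch of that, assuming the cluster is reachable at http://localhost:9200 with no authentication (adjust the base URL and add auth for your setup):

```python
# Minimal sketch: pull JVM heap and old-gen GC stats for the master-eligible
# nodes so you can see which node is churning and how fast old-gen collections
# are accumulating.
import requests

BASE = "http://localhost:9200"  # assumption: adjust host/port/auth for your cluster

resp = requests.get(f"{BASE}/_nodes/master:true/stats/jvm")
resp.raise_for_status()

for node_id, node in resp.json()["nodes"].items():
    jvm = node["jvm"]
    old_gc = jvm["gc"]["collectors"]["old"]
    print(
        f"{node['name']}: "
        f"heap {jvm['mem']['heap_used_percent']}% used, "
        f"old-gen GCs {old_gc['collection_count']} "
        f"({old_gc['collection_time_in_millis']} ms total)"
    )
```

Running this on a schedule (or graphing the same fields from your monitoring) makes it easy to see whether the pressure is on the elected master only or on all master-eligible nodes.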
It turns out one of our analysts was logging Bro data directly to the cluster and had turned it on right at the time we saw this issue start. Disabling their data flow got the masters under control again.
We are still trying to figure out what is causing the Bro IDS data to induce master GC pressure.
I've seen this kind of thing when there is a field explosion due to inadvisably structured documents being indexed, i.e. something that should be a field value appears in the documents as a field name, causing a high rate of dynamic field additions to your mapping. A particular tell for this (in addition to looking at the mapping and seeing thousands of fields) is that heap usage is extraordinarily high even on the unelected masters.
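If that's what's happening, a quick way to spot the offending index is to count the mapped fields per index. A rough sketch, again assuming the cluster is reachable at http://localhost:9200 with no authentication; the counter just walks the mapping JSON, so it works whether or not your version still uses mapping types:

```python
# Minimal sketch: count the number of mapped fields per index by walking each
# index's mapping and tallying every entry under "properties" / "fields".
# A runaway index will stand out by orders of magnitude.
import requests

BASE = "http://localhost:9200"  # assumption: adjust for your cluster


def count_fields(mapping_node):
    """Recursively count field definitions, including multi-fields."""
    total = 0
    if isinstance(mapping_node, dict):
        for key, value in mapping_node.items():
            if key in ("properties", "fields") and isinstance(value, dict):
                total += len(value)
            total += count_fields(value)
    return total


resp = requests.get(f"{BASE}/_mapping")
resp.raise_for_status()

counts = {
    index: count_fields(body.get("mappings", {}))
    for index, body in resp.json().items()
}
for index, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:20]:
    print(f"{index}: {n} fields")
```

If one of the daily indices shows tens of thousands of fields while the rest sit at a few hundred, that's your field explosion, and every new field forces a mapping update that the master has to process and broadcast in the cluster state.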