Young GC inconsistent durations

quid_ryan · December 8, 2016, 11:36pm

Hi there.

Was wondering if anybody has run into similar behavior to the following:

Running some performance benchmarks against an isolated cluster. The cluster consists of 3 ES client nodes and 4 ES data nodes sized with: 64GB RAM (16GB for ES heap); 1TB SSD; 8 cores.

Data has been loaded to replicate our production environment: disks are about 50% full, and the shard-to-core ratio matches production (1.5 shards per core). We are using ES 2.3.3 with doc values.

The behavior we are seeing: with a single user querying the cluster at 5-second intervals, response times are more than acceptable. However, at the 98th percentile there are 10x spikes in response time, for instance from 100ms to 1.5s. We have been able to correlate these spikes with GC via the logs.

[2016-12-08 15:26:26,655][WARN ][monitor.jvm ] [machine] [gc][young][743843][6109] duration [3s], collections [1]/[3.1s], total [3s]/[1.2h], memory [7.5gb]->[7gb]/[14.6gb], all_pools {[young] [532.5mb]->[167.6kb]/[532.5mb]}{[survivor] [10.1mb]->[8.6mb]/[66.5mb]}{[old] [7gb]->[7gb]/[14gb]}
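For anyone wanting to do the same correlation, here is a minimal sketch of pulling the timestamp, pause duration, and per-pool sizes out of a line like the one above, so they can be lined up against slow query timestamps. The regexes are written only against this one sample line, and the function and field names (`parse_gc_line`, `duration_s`, and so on) are made up for illustration. It assumes sizes use binary units (1kb = 1024 bytes).

```python
import re
from datetime import datetime

# Log line copied verbatim from the post above.
SAMPLE = (
    "[2016-12-08 15:26:26,655][WARN ][monitor.jvm ] [machine] "
    "[gc][young][743843][6109] duration [3s], collections [1]/[3.1s], "
    "total [3s]/[1.2h], memory [7.5gb]->[7gb]/[14.6gb], "
    "all_pools {[young] [532.5mb]->[167.6kb]/[532.5mb]}"
    "{[survivor] [10.1mb]->[8.6mb]/[66.5mb]}{[old] [7gb]->[7gb]/[14gb]}"
)

# Header of a monitor.jvm GC line: timestamp, level, node, generation, duration.
GC_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[monitor\.jvm\s*\]\s*"
    r"\[(?P<node>[^\]]+)\]\s*\[gc\]\[(?P<gen>\w+)\]\[\d+\]\[\d+\]\s*"
    r"duration \[(?P<duration>[^\]]+)\]"
)
# One "{[pool] [before]->[after]/[max]}" group from the all_pools section.
POOL = re.compile(
    r"\{\[(?P<name>\w+)\] \[(?P<before>[^\]]+)\]->"
    r"\[(?P<after>[^\]]+)\]/\[(?P<max>[^\]]+)\]\}"
)

# Assumption: binary size units and these time suffixes.
BYTES = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}
SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def _convert(text, units):
    """Turn a value such as '532.5mb' or '3s' into a float in base units."""
    m = re.fullmatch(r"([\d.]+)([a-z]+)", text.strip())
    if m is None or m.group(2) not in units:
        raise ValueError("unrecognised value: %r" % text)
    return float(m.group(1)) * units[m.group(2)]


def parse_gc_line(line):
    """Return a dict describing the GC event, or None for non-GC lines."""
    head = GC_LINE.match(line)
    if head is None:
        return None
    pools = {
        p.group("name"): {
            key: _convert(p.group(key), BYTES)
            for key in ("before", "after", "max")
        }
        for p in POOL.finditer(line)
    }
    return {
        "timestamp": datetime.strptime(head.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
        "node": head.group("node"),
        "generation": head.group("gen"),
        "duration_s": _convert(head.group("duration"), SECONDS),
        "pools": pools,
    }


if __name__ == "__main__":
    event = parse_gc_line(SAMPLE)
    old = event["pools"]["old"]
    print(event["timestamp"], event["generation"], event["duration_s"])
    print("old pool growth (bytes):", old["after"] - old["before"])
```

Run against the sample, the old pool shows no growth at the precision the log prints ([7gb]->[7gb]), so this particular 3s pause does not show any visible promotion into the old space.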

It appears that young GC is taking longer than expected. From a bit more reading: could this be because scanning the references from the old space into the young space takes a long time?

Are there any suggestions for avoiding these inconsistent young GC pauses, since they spike our search latency? Could we be giving ES too much heap? I was also reading that we might be GC'ing too often, so that objects are moved to the old space too quickly. Could that lead to longer GC times while the old-to-young references are updated?

Just curious if anyone has seen something similar.

Thanks for the input.

Couple additional notes:

Default Java settings
Basic term and match searches over a range.
mlockall enabled
swap off
read heavy workload

Ryan

system · January 5, 2017, 11:37pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
