So, we pretty much wiped the cluster clean of data, and the heap is still running away. This is always happening on the master node only.
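For anyone wanting to reproduce the observation, watching heap per node with the cat nodes API (assuming the HTTP port from the config below) shows the growth on whichever node currently holds master:

curl -s 'localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.current,heap.max'

For reference, the environment: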
[13:19:30][root]$ uname -a
Linux host-2 4.14.67-66.56.amzn1.x86_64 #1 SMP Tue Sep 4 22:03:21 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[16:48:32][root]$ java -version
java version "10.0.2" 2018-07-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.2+13)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
Both machines are compute-optimized EC2 instances (c4.2xlarge) with 16GB of RAM each.
And here's the JVM config:
[16:49:40][root]$ cat /etc/elasticsearch/jvm.options
-Xms8g
-Xmx8g
# GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
## optimizations
# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch
## basic
# explicitly set the stack size
-Xss1m
# set to headless, just in case
-Djava.awt.headless=true
# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8
# use our provided JNA always versus the system one
-Djna.nosys=true
# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow
# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Djava.io.tmpdir=${ES_TMPDIR}
## heap dumps
# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError
# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=/var/lib/elasticsearch
# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=/var/log/elasticsearch/hs_err_pid%p.log
## JDK 8 GC logging
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/var/log/elasticsearch/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/var/log/elasticsearch/gc.log:utctime,pid,tags:filecount=32,filesize=64m
9-:-Djava.locale.providers=COMPAT
10-:-XX:UseAVX=2
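To double-check which of the version-conditional lines (the 8: / 9-: / 10-: prefixes) the JDK 10 runtime actually picked up, something like jcmd against the Elasticsearch process works; the pgrep pattern here is just the bootstrap main class, adjust as needed:

jcmd $(pgrep -f org.elasticsearch.bootstrap.Elasticsearch) VM.command_line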
And here's the es config:
# ---------------------------------- Cluster -----------------------------------
#
cluster.name: elk-elasticsearch
#
# ------------------------------------ Node ------------------------------------
#
node.name: 172.32.9.177
node.max_local_storage_nodes: 1
#
# ----------------------------------- Paths ------------------------------------
#
#path.conf: /etc/elasticsearch
path.data: /data/elk-elasticsearch
path.logs: /logs
#
# ---------------------------------- Network -----------------------------------
#
network.host: ["172.32.9.177", localhost]
http.port: 9200
#
# --------------------------------- Discovery ----------------------------------
#
discovery.zen.ping.unicast.hosts: ["172.32.9.177","172.32.8.40"]
discovery.zen.minimum_master_nodes: 1
#
# ---------------------------------- Various -----------------------------------
#
action.destructive_requires_name: true
#
# -------------------------- Custom Chef Configuration --------------------------
gateway.expected_nodes: 0
transport.tcp.port: 9300
xpack.monitoring.enabled: false
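Since the problem follows the elected master, it's worth noting which of the two nodes currently holds that role; the cat master API shows it:

curl -s 'localhost:9200/_cat/master?v'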
While I understand that "more heap is better", a cluster with only 124 total shards (68 primaries), 28GB of net data, and ~38M documents should not be killing a 16GB combined heap setup.
And I cannot stress this point enough: 6.2.3 did not have this issue; it only started after the upgrade to 6.4.2. On 6.2.3 I was running close to 800GB of data and around 4,000 shards day to day, and while performance wasn't super-zippy, it was stable.
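Those shard/doc/size numbers can be re-checked straight from the cluster stats API if anyone wants to verify, e.g.:

curl -s 'localhost:9200/_cluster/stats?human&pretty'

(indices.shards.total, indices.shards.primaries, indices.docs.count and indices.store.size are all in the response.)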
I have tried various GC combinations, more RAM assigned to the heap (8/9/10GB), less RAM, and a lower CMSInitiatingOccupancyFraction (50, 60, 70). Everything leads to the same problem, which plays out like this:
The master node's GC seems to leak memory somewhere in the heap until it can't collect anymore, at which point old-gen GC revs up like a lion: within about 2 hours, GC time goes from 400ms out of a 1s interval to about 28s out of 30s, and then the cluster eventually runs out of heap, because the OLD count never, ever goes down.
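For the record, those per-node old-gen numbers can be pulled from the JVM section of the node stats, something like:

curl -s 'localhost:9200/_nodes/stats/jvm?human&pretty'

which is where heap_used_percent and the old collector's collection_count / collection_time show up; the unified GC log the jvm.options above writes to /var/log/elasticsearch/gc.log tells the same story.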