We upgraded our server from Elasticsearch 5.6 to 6.8.6 and are now experiencing performance issues: high CPU usage and frequent full GCs.
The cluster has 10 data nodes with ~300 shards. We use daily indices, and shard sizes vary from 500MB to 12GB depending on the index type. Each index has 1 primary shard and 1 replica.
We run on AWS using Java 8 with the CMS JVM configuration, on a machine with 64GB RAM and 8 cores.
After the upgrade we started seeing a full GC every 10 minutes or so, and the client nodes show high CPU usage and increased network traffic.
We investigated the issue and found this article: https://e-mc2.net/elasticsearch-garbage-collection-hell. Based on it, we set the new ratio to 2, set the GC threads to 8, and set the index refresh and memory index buffer to 20%. The GC heap seems better than before, but there are still issues with the data nodes and client nodes: the frequent GCs, high CPU usage and network usage still happen, only after a longer period of time.
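In case it helps anyone compare numbers, here is a minimal Python sketch of how the old-generation GC frequency could be measured per node from two snapshots of the `_nodes/stats/jvm` API. The function names are hypothetical, and any cluster address passed in (for example `http://localhost:9200`) is an assumption, not part of our actual setup.

```python
import json
import urllib.request


def fetch_jvm_stats(base_url):
    """Fetch JVM stats for all nodes via GET <base_url>/_nodes/stats/jvm."""
    with urllib.request.urlopen(base_url + "/_nodes/stats/jvm") as resp:
        return json.load(resp)


def old_gc_rates(before, after, interval_seconds):
    """Compare two node-stats snapshots taken interval_seconds apart.

    Returns, per node name, the old-gen collections per minute, the
    average old-gen pause in milliseconds, and the latest heap usage.
    """
    rates = {}
    for node_id, node in after["nodes"].items():
        prev = before["nodes"].get(node_id)
        if prev is None:
            continue  # node joined between the two snapshots
        old_now = node["jvm"]["gc"]["collectors"]["old"]
        old_prev = prev["jvm"]["gc"]["collectors"]["old"]
        count = old_now["collection_count"] - old_prev["collection_count"]
        millis = (old_now["collection_time_in_millis"]
                  - old_prev["collection_time_in_millis"])
        rates[node["name"]] = {
            "old_gcs_per_minute": count * 60.0 / interval_seconds,
            "avg_pause_ms": millis / count if count else 0.0,
            "heap_used_percent": node["jvm"]["mem"]["heap_used_percent"],
        }
    return rates
```

Calling `fetch_jvm_stats` twice, for example ten minutes apart, and passing both results to `old_gc_rates` with `interval_seconds=600` gives one row per node, which makes it easier to see whether the tuning changed the GC rate or only delayed it.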
Has anyone seen something like this?