Slow stop-the-world GCs all the time [Production]

(Dor Rotman) #1

I have a single server with 32 GB RAM and 24 cores.
Elasticsearch gets 16 GB of that, set via ES_HEAP_SIZE.
Total index size is ~200 GB. Not using doc values.

Elasticsearch 1.7
Java version:
openjdk version "1.8.0_66-internal"
OpenJDK Runtime Environment (build 1.8.0_66-internal-b17)
OpenJDK 64-Bit Server VM (build 25.66-b17, mixed mode)

Upon checking, I see that the fielddata cache size is 4 GB and the filter cache size is 1.2 GB.
So where does the remaining 10.8 GB go?

I'm getting these GC logs:
[2016-03-06 16:38:56,632][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][6317][367] duration [23.5s], collections [2]/[24.1s], total [23.5s]/[1.5h], memory [15.7gb]->[14.7gb]/[15.8gb], all_pools {[young] [1.1gb]->[176.2mb]/[1.1gb]}{[survivor] [14.3mb]->[0b]/[149.7mb]}{[old] [14.5gb]->[14.5gb]/[14.5gb]}
[2016-03-06 16:39:44,043][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][6343][369] duration [22.2s], collections [1]/[22.3s], total [22.2s]/[1.5h], memory [15.7gb]->[14.7gb]/[15.8gb], all_pools {[young] [1.1gb]->[184.4mb]/[1.1gb]}{[survivor] [87mb]->[0b]/[149.7mb]}{[old] [14.5gb]->[14.5gb]/[14.5gb]}
[2016-03-06 16:40:31,584][WARN ][monitor.jvm ] [Scarlet Witch] [gc][old][6369][371] duration [22.4s], collections [1]/[22.4s], total [22.4s]/[1.5h], memory [15.8gb]->[14.7gb]/[15.8gb], all_pools {[young] [1.1gb]->[188.5mb]/[1.1gb]}{[survivor] [136.4mb]->[0b]/[149.7mb]}{[old] [14.5gb]->[14.5gb]/[14.5gb]}
And many more, repeating.
It seems the garbage collector can't reclaim anything from the old generation, which stays at 14.5 GB after every collection.
What is being held there?
How can I prevent this?
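
For reference, the pool figures in the log lines above can be pulled out mechanically instead of read by eye. This is only a minimal Python sketch, assuming the `monitor.jvm` line layout shown in this post; the function names (`to_bytes`, `parse_pools`, `summarize`) are made up for illustration and are not part of Elasticsearch.

```python
import re

# Size suffixes as they appear in the log lines in this post.
UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}

SIZE_RE = re.compile(r"^([\d.]+)(b|kb|mb|gb)$")
# Matches one pool entry such as {[old] [14.5gb]->[14.5gb]/[14.5gb]}
POOL_RE = re.compile(r"\{\[(\w+)\] \[([^\]]+)\]->\[([^\]]+)\]/\[([^\]]+)\]\}")
# Assumption: the duration is reported in seconds or milliseconds.
DURATION_RE = re.compile(r"duration \[([\d.]+)(ms|s)\]")


def to_bytes(text):
    """Convert a size such as '176.2mb' into a number of bytes."""
    match = SIZE_RE.match(text)
    if match is None:
        raise ValueError("unrecognised size: %r" % text)
    return float(match.group(1)) * UNITS[match.group(2)]


def parse_pools(line):
    """Return {pool: (before, after, capacity)} in bytes for one log line."""
    return {
        name: (to_bytes(before), to_bytes(after), to_bytes(capacity))
        for name, before, after, capacity in POOL_RE.findall(line)
    }


def summarize(line):
    """Report how long the collection took and what it freed from old gen."""
    match = DURATION_RE.search(line)
    if match is None:
        raise ValueError("no duration found in line")
    seconds = float(match.group(1))
    if match.group(2) == "ms":
        seconds /= 1000.0
    before, after, capacity = parse_pools(line)["old"]
    return {
        "duration_s": seconds,
        "old_freed_bytes": before - after,
        "old_fill_ratio": after / capacity,
    }
```

Calling `summarize` on any of the three lines above gives zero bytes freed from the old pool and a fill ratio of 1.0, which is the pattern I'm worried about: long pauses that reclaim only young-generation memory.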


(Mark Walkom) #2

If you really want to know then take a heap dump and analyse it.
Do you have Marvel installed on this?

(Dor Rotman) #3

I'm not sure a heap dump would help me as I'm not familiar with the internals.
I do have Marvel installed.
Is there any specific metric that could point to the cause of the GCs?

Here's a link: PDF printout from Marvel
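
To make the question concrete: the kind of accounting I'm after is "heap used minus every component that reports its own memory". Below is a rough Python sketch of that subtraction. The JSON paths are my assumption about what the Elasticsearch 1.x node stats API returns (`memory_size_in_bytes` under `fielddata`, `filter_cache` and `id_cache`, `memory_in_bytes` under `segments`, and `heap_used_in_bytes` under `jvm.mem`), so they should be checked against the actual response. `heap_breakdown` and `dig` are hypothetical helper names, and the sample uses only the figures quoted earlier in this thread.

```python
GB = 1024 ** 3

# Assumed Elasticsearch 1.x node-stats paths for heap consumers that
# report their own size. Verify these against a real response.
TRACKED = {
    "fielddata": ("indices", "fielddata", "memory_size_in_bytes"),
    "filter_cache": ("indices", "filter_cache", "memory_size_in_bytes"),
    "id_cache": ("indices", "id_cache", "memory_size_in_bytes"),
    "segments": ("indices", "segments", "memory_in_bytes"),
}


def dig(stats, path):
    """Follow a key path through nested dicts; return 0 if any key is missing."""
    node = stats
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return 0
        node = node[key]
    return node


def heap_breakdown(node_stats):
    """Split used heap into tracked components plus an unaccounted remainder."""
    heap_used = dig(node_stats, ("jvm", "mem", "heap_used_in_bytes"))
    parts = {name: dig(node_stats, path) for name, path in TRACKED.items()}
    parts["unaccounted"] = heap_used - sum(parts.values())
    return parts


# Sample built only from numbers in this thread: 15.7 GB heap in use (from the
# GC log), 4 GB fielddata, 1.2 GB filter cache. Other components are unknown
# here and therefore left out, so they count as 0.
sample = {
    "jvm": {"mem": {"heap_used_in_bytes": int(15.7 * GB)}},
    "indices": {
        "fielddata": {"memory_size_in_bytes": 4 * GB},
        "filter_cache": {"memory_size_in_bytes": int(1.2 * GB)},
    },
}

for name, size in sorted(heap_breakdown(sample).items()):
    print("%-13s %6.2f GB" % (name, size / GB))
```

Whatever ends up in the `unaccounted` row is what I can't explain from the Marvel dashboards.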

