Please, help me tune java GC parameters

Brad_Jungsu_Heo · November 23, 2016, 4:56am

Hi.

I'm having trouble with long Java GC stop-the-world (STW) pauses during heavy bulk loads.

My environment

ES Version: 2.3.2
Heap Size: 20GB
Total Ram: 64GB
Java Version: 1.8.0_60 Java HotSpot(TM) 64-Bit
OS: CentOS 7.2

GC log with default parameters

When I first deployed my cluster, I saw STW pauses like this:

[date][WARN ][monitor.jvm              ]
[hostname]
    [gc][old][811366][210]
    duration [10s],
    collections [1]/[11.3s],
    total [10s]/[1.1m],
    memory [14.5gb]->[2.7gb]/[19.8gb],
    all_pools 
        {[young] [1gb]->[25.9mb]/[1.4gb]}
        {[survivor] [184.5mb]->[0b]/[191.3mb]}
        {[old] [13.2gb]->[2.7gb]/[18.1gb]}

Parameters tuned

So I tuned the GC parameters.

I increased the young and survivor spaces to keep new objects from being promoted to the old generation.

    -XX:MaxTenuringThreshold=15
    -XX:NewRatio=6
    -XX:SurvivorRatio=3
    -XX:-UseAdaptiveSizePolicy
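
For reference, here is a rough sketch of the generation sizes these flags imply for the 20GB heap listed above. It assumes the usual HotSpot meaning of the two ratios (NewRatio is old/young, SurvivorRatio is eden/one survivor space), a fixed heap size, and no alignment rounding. The class and method names are made up for illustration.

```java
import java.util.Locale;

// Sketch only: sizes implied by -XX:NewRatio and -XX:SurvivorRatio.
// Assumption: heap is fixed at 20 GB and alignment rounding is ignored.
final class GenerationSizes {

    // NewRatio = old / young, so young = heap / (NewRatio + 1).
    static double youngMb(double heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    // SurvivorRatio = eden / one survivor space, and young = eden + 2 survivors,
    // so one survivor space = young / (SurvivorRatio + 2).
    static double survivorMb(double youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    // Eden is whatever is left of the young generation after both survivor spaces.
    static double edenMb(double youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    public static void main(String[] args) {
        double heapMb = 20 * 1024;  // "Heap Size: 20GB" from the post
        int newRatio = 6;           // -XX:NewRatio=6
        int survivorRatio = 3;      // -XX:SurvivorRatio=3

        double young = youngMb(heapMb, newRatio);
        double survivor = survivorMb(young, survivorRatio);
        double eden = edenMb(young, survivorRatio);
        double old = heapMb - young;

        System.out.printf(Locale.ROOT,
                "eden=%.1f MB, survivor=%.1f MB each, old=%.1f MB%n",
                eden, survivor, old);
    }
}
```

With these inputs the arithmetic gives roughly 1.7GB of eden, about 585MB per survivor space, and about 17.1GB of old generation, which lines up with the pool limits reported in the second log further down.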

Several months later

Full GCs still happened after tuning, but less frequently, and things were fine for several months. Earlier this month, however, I started seeing STW pauses even longer than with the initial GC parameters.

[date][WARN ][monitor.jvm              ]
[hostname]
    [gc][old][2894376][121395]
    duration [28.7s],
    collections [1]/[30s],
    total [28.7s]/[4.4h],
    memory [15.8gb]->[14.6gb]/[19.4gb],
    all_pools
        {[young] [498.1mb]->[83.3mb]/[1.7gb]}
        {[survivor] [394.6mb]->[0b]/[585.1mb]}
        {[old] [15gb]->[14.5gb]/[17.1gb]}

After this GC, only 0.5gb of the old generation was freed. These long STW pauses recur every 5~7 days. The only thing I can do is a rolling restart.
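
To compare the two log entries above, here is a minimal sketch that parses a pool entry such as `{[old] [15gb]->[14.5gb]/[17.1gb]}` and reports how much was reclaimed and how full the pool remains afterwards. The regular expression is only an assumption based on the two log samples in this post, and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: reads one "[before]->[after]/[max]" pool entry from the
// monitor.jvm warning and computes what the collection actually freed.
final class OldGenReclaim {

    // Matches e.g. "[15gb]->[14.5gb]/[17.1gb]" (format assumed from the logs above).
    private static final Pattern POOL = Pattern.compile(
            "\\[([\\d.]+)(b|kb|mb|gb)\\]->\\[([\\d.]+)(b|kb|mb|gb)\\]/\\[([\\d.]+)(b|kb|mb|gb)\\]");

    // Converts a value with its unit suffix to megabytes.
    static double toMb(String value, String unit) {
        double v = Double.parseDouble(value);
        switch (unit) {
            case "b":  return v / (1024.0 * 1024.0);
            case "kb": return v / 1024.0;
            case "gb": return v * 1024.0;
            default:   return v; // "mb"
        }
    }

    // Returns {beforeMb, afterMb, maxMb} for one pool entry.
    static double[] parse(String poolEntry) {
        Matcher m = POOL.matcher(poolEntry);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized pool entry: " + poolEntry);
        }
        return new double[] {
            toMb(m.group(1), m.group(2)),
            toMb(m.group(3), m.group(4)),
            toMb(m.group(5), m.group(6))
        };
    }

    // Fraction of the pool still occupied after the collection.
    static double occupancyAfter(double[] pool) {
        return pool[1] / pool[2];
    }

    public static void main(String[] args) {
        String[] entries = {
            "{[old] [13.2gb]->[2.7gb]/[18.1gb]}",  // default parameters
            "{[old] [15gb]->[14.5gb]/[17.1gb]}"    // several months after tuning
        };
        for (String entry : entries) {
            double[] p = parse(entry);
            System.out.printf(Locale.ROOT, "reclaimed=%.0f MB, occupancy after GC=%.1f%%%n",
                    p[0] - p[1], 100 * occupancyAfter(p));
        }
    }
}
```

For the two entries in this post, the old generation is about 15% full after the first collection but still about 85% full after the second. That suggests the old generation is mostly holding objects that are still reachable rather than garbage, which resizing the generations would not be expected to fix.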

I would appreciate it if anyone could suggest GC parameters or help me out!

Thanks in advance

warkolm · November 23, 2016, 8:55am

We do not recommend tuning java parameters.

If you are constantly hitting GC then your cluster is overloaded, and any tweaking you do is of diminishing returns.

Brad_Jungsu_Heo · November 23, 2016, 9:29am

@warkolm

Thank you for your reply!

After reading your reply, I read "Don't Touch These Settings!".

It says "Do not change the default garbage collector!", but it does not say "We do not recommend tuning java parameters."

I'd like to change the CMS collector's parameters. Is this not recommended? Are there any related articles I could read?

Thank you.

Christian_Dahlqvist · November 23, 2016, 9:35am

What does your Elasticsearch config look like? Do you have scripting enabled?

Ravi_Shanker_Reddy · November 23, 2016, 9:35am

https://www.elastic.co/guide/en/elasticsearch/guide/current/_java_virtual_machine.html

Just FYI: this page clearly says not to tweak any JVM settings.

Brad_Jungsu_Heo · November 23, 2016, 9:40am

@Christian_Dahlqvist

Yes, scripting is enabled, and approximately 30 million documents are updated every day. I'm wondering whether scripting and updates are related to the full GCs.

Thanks.

Brad_Jungsu_Heo · November 23, 2016, 9:41am

@Ravi_Shanker_Reddy

Oh. I got it. I'll read it carefully.

Thank you.

Christian_Dahlqvist · November 23, 2016, 9:47am

According to this post there may be a memory leak in Groovy, which might be causing the issues you are seeing.

Brad_Jungsu_Heo · November 23, 2016, 9:54am

@Christian_Dahlqvist

Great! I didn't know that.

Thank you so much.

Christian_Dahlqvist · November 23, 2016, 9:56am

It might be worthwhile looking into migrating to Elasticsearch 5.0 and the new Painless scripting language.

Brad_Jungsu_Heo · November 24, 2016, 4:33am

@Christian_Dahlqvist

Thank you for your consideration.

But I can't use ES 5.0 at the moment because I use CDH 5.8 (which uses JDK 1.7), es-hadoop, and Spark (the es-hadoop for ES 5.0 only supports Java 1.8).

Regarding the Groovy memory leak, I execute the same script for every request, like this:

val upsertScript = """
    ...
    if (ctx._source.containsKey("f1")) {
        ctx._source.f1 += param1;
    }
    else {
        ctx._source.f1 = param1;
    }
    ...
"""

conf.set("es.update.script", upsertScript);
conf.set("es.update.script.params", "param1:v1,param2:v2,param3:v3")

So I don't think my deployment is affected by the memory leak.

Anyway, I appreciate your reply! Thanks.

jprante · November 24, 2016, 6:20pm

The memory leak was fixed by the Groovy team (https://issues.apache.org/jira/browse/GROOVY-7683), so watch for the release of Groovy 2.4.8. Once it is out, replace the existing Groovy jar in ES with the new 2.4.8 jar.

system · December 22, 2016, 6:20pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
