Hi.
I'm having trouble with long Java GC stop-the-world (STW) pauses during heavy bulk loads.
My environment
ES Version: 2.3.2
Heap Size: 20GB
Total Ram: 64GB
Java Version: 1.8.0_60 Java HotSpot(TM) 64-Bit
OS: CentOS 7.2
GC log with default parameters
When I first deployed my cluster, I saw STW pauses like this:
[date][WARN ][monitor.jvm ]
[hostname]
[gc][old][811366][210]
duration [10s],
collections [1]/[11.3s],
total [10s]/[1.1m],
memory [14.5gb]->[2.7gb]/[19.8gb],
all_pools
{[young] [1gb]->[25.9mb]/[1.4gb]}
{[survivor] [184.5mb]->[0b]/[191.3mb]}
{[old] [13.2gb]->[2.7gb]/[18.1gb]}
Parameters tuned
So I tuned the GC parameters.
I increased the young and survivor spaces to keep new objects from being promoted to the old generation.
-XX:MaxTenuringThreshold=15
-XX:NewRatio=6
-XX:SurvivorRatio=3
-XX:-UseAdaptiveSizePolicy
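As a rough check of what these flags imply, here is a minimal sketch of the resulting generation sizes. It assumes the heap is exactly 20 GB and the usual HotSpot meanings: NewRatio is old/young, and SurvivorRatio is eden/one survivor space (with two survivor spaces). The function name is made up for illustration.

```python
def generation_sizes(heap_gb, new_ratio, survivor_ratio):
    """Approximate HotSpot generation sizes in GB.

    Assumes NewRatio = old / young and
    SurvivorRatio = eden / one survivor space (two survivor spaces exist).
    """
    young = heap_gb / (new_ratio + 1)         # young gets 1 part, old gets new_ratio parts
    old = heap_gb - young
    survivor = young / (survivor_ratio + 2)   # eden takes survivor_ratio parts, each survivor 1 part
    eden = young - 2 * survivor
    return {"young_total": young, "eden": eden, "survivor_each": survivor, "old": old}


sizes = generation_sizes(heap_gb=20, new_ratio=6, survivor_ratio=3)
for name, gb in sizes.items():
    print(f"{name}: {gb:.2f} GB")
# young_total: 2.86 GB
# eden: 1.71 GB
# survivor_each: 0.57 GB
# old: 17.14 GB
```

These numbers line up with the pool capacities in the later log (1.7gb young, 585.1mb survivor, 17.1gb old), assuming the [young] pool in the ES log is the eden space.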
Several months later
Full GCs still happened after tuning, but less frequently, and things were fine for several months. Earlier this month, however, I got a longer STW pause than I ever saw with the initial GC parameters.
[date][WARN ][monitor.jvm ]
[hostname]
[gc][old][2894376][121395]
duration [28.7s],
collections [1]/[30s],
total [28.7s]/[4.4h],
memory [15.8gb]->[14.6gb]/[19.4gb],
all_pools
{[young] [498.1mb]->[83.3mb]/[1.7gb]}
{[survivor] [394.6mb]->[0b]/[585.1mb]}
{[old] [15gb]->[14.5gb]/[17.1gb]}
After this GC, only 0.5gb of the old generation was freed. This long STW pause now recurs every 5~7 days, and the only thing I can do is a rolling restart.
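To compare the two logs above, here is a small sketch that parses the all_pools entries and reports how much each old collection freed and how full the old generation was afterwards. The regular expression is only an assumption based on the log format shown in this post, and the helper names are made up.

```python
import re

UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}
SIZE = r"\[([\d.]+)(b|kb|mb|gb)\]"
# Matches entries such as {[old] [15gb]->[14.5gb]/[17.1gb]}
POOL = re.compile(r"\{\[(\w+)\] " + SIZE + "->" + SIZE + "/" + SIZE + r"\}")


def pool_stats(log_text):
    """Return freed GB and post-GC fill percentage for each pool found."""
    stats = {}
    for name, b, b_unit, a, a_unit, c, c_unit in POOL.findall(log_text):
        before = float(b) * UNITS[b_unit]
        after = float(a) * UNITS[a_unit]
        capacity = float(c) * UNITS[c_unit]
        stats[name] = {
            "freed_gb": (before - after) / UNITS["gb"],
            "used_after_pct": 100 * after / capacity,
        }
    return stats


# Pool entries copied from the two log excerpts in this post.
DEFAULT_LOG = """
{[young] [1gb]->[25.9mb]/[1.4gb]}
{[survivor] [184.5mb]->[0b]/[191.3mb]}
{[old] [13.2gb]->[2.7gb]/[18.1gb]}
"""
TUNED_LOG = """
{[young] [498.1mb]->[83.3mb]/[1.7gb]}
{[survivor] [394.6mb]->[0b]/[585.1mb]}
{[old] [15gb]->[14.5gb]/[17.1gb]}
"""

for label, text in (("default", DEFAULT_LOG), ("tuned", TUNED_LOG)):
    old = pool_stats(text)["old"]
    print(f"{label}: old freed {old['freed_gb']:.1f} GB, "
          f"{old['used_after_pct']:.1f}% full afterwards")
# default: old freed 10.5 GB, 14.9% full afterwards
# tuned: old freed 0.5 GB, 84.8% full afterwards
```

In the second case the old generation is still about 85% full right after a full collection, which suggests that most of it is still reachable and not garbage waiting to be collected.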
I would appreciate it if anyone could suggest GC parameters or otherwise help me out!
Thanks in advance
warkolm
(Mark Walkom)
November 23, 2016, 8:55am
2
We do not recommend tuning java parameters.
If you are constantly hitting GC then your cluster is overloaded, and any tweaking you do is of diminishing returns.
@warkolm
Thank you for your reply!
After reading your reply, I read "Don't Touch These Settings!".
It says "Do not change the default garbage collector!"
but it does not say "We do not recommend tuning java parameters."
I'd like to change the CMS GC's parameters. Is that not recommended? And could you point me to some related articles?
Thank you.
What does your Elasticsearch config look like? Do you have scripting enabled?
@Christian_Dahlqvist
Yes, scripting is enabled, and approximately 30m documents are updated every day. I'm wondering whether scripting and updates are related to the full GCs.
Thanks.
@Ravi_Shanker_Reddy
Oh. I got it. I'll read it carefully.
Thank you.
According to this post there may be a memory leak in Groovy, which might be causing the issues you are seeing.
@Christian_Dahlqvist
Great! I didn't know that.
Thank you so much.
It might be worthwhile to look into migrating to Elasticsearch 5.0 and the new Painless scripting language.
@Christian_Dahlqvist
Thank you for your consideration.
But I can't use ES 5.0 at the moment because I use CDH 5.8 (which uses JDK 1.7), es-hadoop, and Spark (es-hadoop for ES 5.0 only supports Java 1.8).
With regard to the Groovy script memory leak, I execute the same script for every request, like this:
val upsertScript = """
  ...
  if (ctx._source.containsKey("f1")) {
    ctx._source.f1 += param1;
  }
  else {
    ctx._source.f1 = param1;
  }
  ...
"""
conf.set("es.update.script", upsertScript)
conf.set("es.update.script.params", "param1:v1,param2:v2,param3:v3")
So I don't think my deployment is affected by the memory leak.
Anyway, I appreciate your reply! Thanks.
jprante
(Jörg Prante)
November 24, 2016, 6:20pm
12
The memory leak was fixed by the Groovy team (https://issues.apache.org/jira/browse/GROOVY-7683), so you should watch for the release of Groovy 2.4.8. Then replace the existing Groovy jar in ES with the new 2.4.8 jar.