Anyone else seeing memory issues with Java _121?

As per the title: I had no issues with Java _111, but with _121 I'm getting oom-killer hits :frowning: Anyone else?

What issues exactly? Do we have a particular stack trace that lines up with a particular version?

Not sure what to post really... here's what I have via syslog (pastebin link due to size):

Java oom

Again, I'd been running fine since October with Java _111, so my guess is Java is leaking somewhere. Thank you.

What version of Elasticsearch?

Latest...5.1.1.

I'd heard (anecdata) that _121 was a lot more efficient.

Hrmm... well, I'll see if it happens again. It's happened twice so far with the same results in syslog.

Synopsis:
The logs provided are from 22/1 - is this a rare occurrence?
There is a mention of GNU Krell Monitors (GKrellM) running - are you facing any issues at the operating system level, such as applications freezing?

Is this running on the iMac directly, or is the iMac just the host hardware for Ubuntu 14.04.1 (desktop?) running kernel 4.4.0-59-generic?

Yeah, that log was the first time I'd ever seen the oom-killer, truth be told. No other issues... just the sudden killing of that Java PID. This is Ubuntu 14.04.1 64-bit running on an old iMac:

Linux 4.4.0-59-generic #80~14.04.1-Ubuntu SMP Fri Jan 6 18:02:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Thank you.

Please check the free memory on the server and your Elasticsearch heap settings (-Xmx etc.).
If there is not enough memory for the JVM, the system will kill it.
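For reference, a quick way to sanity-check both of those (a rough sketch - the jvm.options path assumes a deb/rpm install of Elasticsearch 5.x):

# free system memory in MB
free -m
# current heap settings; as a rule of thumb, keep -Xmx at or below ~50% of RAM
grep -E '^-Xm[sx]' /etc/elasticsearch/jvm.options
# example output (values are placeholders, not a recommendation):
# -Xms2g
# -Xmx2g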

Thanks... my question, though, is whether anyone else has noticed this since a Java upgrade. I didn't have this issue previously; if I had, then yes, I would agree that system memory would be the issue. But I'm only seeing this after upgrading from Java _111 to _121, never before the upgrade. Thanks again.

This is the first I am hearing of this on Java _121.
Given that this is an older iMac running Ubuntu (can you confirm server or desktop?), we also can't rule out other operating-system-level conflicts related to the Java change - for example, the GNU Krell Monitors (GKrellM) that appear in the log, even though the kernel is reported as not tainted.

This is Ubuntu Server 14.04. Thank you... it might have been just a fluke. I'll continue monitoring and post results.

We're seeing frequent OOM kills for our cluster nodes as well. We had cluster halts before _121 (nodes got stuck in a GC loop, still reporting to the master but failing all queries); now with _121 the nodes at least terminate cleanly.

Linux es-big-16 4.4.0-59-generic #80~14.04.1-Ubuntu SMP Fri Jan 6 18:02:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
oom kern log

We already reduced the query cache size to 5%, the index buffer to 4%, and limited the fielddata cache to 50% (with the breaker set to 57%) - roughly the settings sketched below. We are still trying to find the cause of our issues: maybe our bulk queue size is too long (100k, though it's hardly used), or our bulk requests are too big, or we are doing too many requests per second per node... fishing in muddy waters.
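For anyone following along, this is roughly what those limits would look like in elasticsearch.yml on 5.x. The setting names are the standard ones as I recall them from the docs - double-check them against your exact version before copying anything:

indices.queries.cache.size: 5%
indices.memory.index_buffer_size: 4%
indices.fielddata.cache.size: 50%
indices.breaker.fielddata.limit: 57%
# the bulk queue size mentioned above (100k); the 5.x default is far smaller
thread_pool.bulk.queue_size: 100000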

We had no issues on 2.4.x; it all started after upgrading to 5.1.1, and 5.1.2 didn't help either.

With the versions reported in this thread (4.4.0-59), this is due to an Ubuntu kernel bug. You should downgrade the kernel to 4.4.0-57.
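In case it helps, a rough sketch of the downgrade on Ubuntu 14.04, assuming the lts-xenial/HWE generic kernel flavour (adjust the package names to match whatever uname -r reports on your machine):

# install the older kernel alongside the current one
sudo apt-get update
sudo apt-get install linux-image-4.4.0-57-generic linux-image-extra-4.4.0-57-generic
# reboot and pick 4.4.0-57 from the GRUB menu (or set it as the default),
# and hold back the affected -59 image until a fixed kernel lands:
sudo apt-mark hold linux-image-4.4.0-59-generic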


D'oh, thanks for the info - I also just noticed that our jvm.options had been untouched since 5.1.1. I added the missing netty parameter for the GC deactivation and restarted all nodes. Will post an update tomorrow.

Yeah, genuinely, thanks for the info... I can wait for the kernel fix.

What are your current jvm.options settings, Andre? Thanks.

Check if the following parameter exists in your jvm.options (on a side note: is there a path where we can store our customized values to avoid sed-fiddling within the jvm.options?):

-Dio.netty.recycler.maxCapacityPerThread=0
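A quick way to verify it is both present in the config and actually picked up by the running node (the path assumes a deb/rpm install):

# is the flag in the config file?
grep maxCapacityPerThread /etc/elasticsearch/jvm.options
# did the running JVM get it after the restart?
ps -ef | grep -o 'io.netty.recycler.maxCapacityPerThread=[0-9]*'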

We lost 4 of our nodes overnight, so it probably is indeed the kernel bug. The nodes running on the proposed 4.4.0-62 kernel didn't crash.

On another side note: has anybody noticed performance regressions on 14.04/16.04 with the 4.4 kernel? Our nodes running on the wily 4.2 kernel are performing 20-30% better (lower CPU usage, shorter GC times, faster index and search times, lower load average) and still at least 10% better than our 16.04 nodes...

We upgraded to 5.2 and I added additional nodes to the proposed-kernel pool. The downgraded-kernel nodes are looking good, but we lost one node in the 4.4.0-62 kernel pool, so it's probably not fixed yet, sigh (the last comment in the kernel bug discussion indicates that as well).
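For anyone who wants to test the candidate kernel themselves, a rough sketch of pulling it from trusty-proposed on 14.04 (the exact image version to install is whatever the kernel bug currently names as the candidate, 4.4.0-62 here; consider pinning the pocket so only the kernel comes from it):

# enable the proposed pocket
echo "deb http://archive.ubuntu.com/ubuntu/ trusty-proposed main restricted universe multiverse" | sudo tee /etc/apt/sources.list.d/trusty-proposed.list
sudo apt-get update
sudo apt-get install linux-image-4.4.0-62-generic linux-image-extra-4.4.0-62-generic
# reboot into the new kernel and confirm with: uname -r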

I guess we should have stayed with CentOS... :slight_smile: