Java 6u23 and ES 0.14.2 crashing on signal (6/SIGABRT)


(dbenson) #1

To try to address a non-ES issue, I attempted to upgrade one node in
a cluster from JDK 1.6.0_21 to 1.6.0_23. After several minutes of ES
syncing indexes, it crashes with a SIGABRT. From the wrapper logs
(nothing dumped in the ES logs):

INFO | jvm 1 | 2011/01/19 18:37:51 | WrapperManager: Initializing...
STATUS | wrapper | 2011/01/19 18:42:17 | JVM received a signal UNKNOWN (6).
STATUS | wrapper | 2011/01/19 18:42:17 | JVM process is gone.
ERROR | wrapper | 2011/01/19 18:42:17 | JVM exited unexpectedly.
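
The wrapper reports the signal only as "UNKNOWN (6)". On Linux, signal 6 is SIGABRT. As a rough sketch, assuming the log wording shown above, the number can be decoded with Python's standard signal module. The helper name signal_from_wrapper_text is hypothetical and not part of the wrapper.

```python
import re
import signal

# Matches the signal number in wrapper output such as
# "... | JVM received a signal UNKNOWN (6)."
# \s+ also tolerates the message being wrapped across two lines.
SIGNAL_RE = re.compile(r"JVM received a signal\s+\S+\s+\((\d+)\)")


def signal_from_wrapper_text(text):
    """Return the symbolic signal name found in wrapper log text, or None."""
    match = SIGNAL_RE.search(text)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a known signal on this platform; report it as-is.
        return "signal %d" % number


line = "STATUS | wrapper | 2011/01/19 18:42:17 | JVM received a signal UNKNOWN (6)."
print(signal_from_wrapper_text(line))
```

Signal numbering is platform-specific, so this should be run on the same kind of system that produced the log.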

We're running CentOS 5.5
java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode)

Is anyone else running this combination of versions?

Thanks,

David


(Shay Banon) #2

Several people are running ES on 1.6u23. Can you give it a go without the wrapper and see if it works? Do you see a crash log written by the JVM (in the working directory you start it from)?
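
For reference, HotSpot normally names its fatal error log hs_err_pid<pid>.log and writes it to the process's working directory. A minimal sketch for listing such files, newest first, might look like the following. The function name find_hotspot_crash_logs is hypothetical, and the directory to search is an assumption that depends on how the node was started.

```python
import glob
import os


def find_hotspot_crash_logs(directory="."):
    """Return hs_err_pid*.log files in `directory`, newest first."""
    pattern = os.path.join(directory, "hs_err_pid*.log")
    return sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)


# Search the current directory; pass the ES start directory instead if it differs.
for path in find_hotspot_crash_logs():
    print(path)
```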


(dbenson) #3

I get a crash running from bin/elasticsearch as well. I am getting a
crash dump, which I've uploaded to https://gist.github.com/786752


(Shay Banon) #4

Strange, it seems like it fails in the Lucene PorterStemmer, but I think it just happened to "be there". Can you remove the -XX:+AggressiveOpts flag (set either in the wrapper conf or in elasticsearch.in.sh) and see if it helps?
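
As an illustration only, removing the flag amounts to dropping one token from the JVM options. The sketch below works on a plain option string. The sample string is hypothetical and does not reproduce the actual contents of the wrapper conf or elasticsearch.in.sh, and drop_jvm_flag is a made-up helper name.

```python
def drop_jvm_flag(java_opts, flag="-XX:+AggressiveOpts"):
    """Return `java_opts` with every exact occurrence of `flag` removed."""
    return " ".join(tok for tok in java_opts.split() if tok != flag)


# Hypothetical option string, for illustration only.
opts = "-Xms256m -Xmx1g -XX:+AggressiveOpts -XX:+UseParNewGC"
print(drop_jvm_flag(opts))
```

An alternative to deleting the token is to pass -XX:-AggressiveOpts, since HotSpot boolean options use -XX:+Name to enable and -XX:-Name to disable.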


(ijuma) #5

On Jan 19, 8:13 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Strange, it seems like it fails in the Lucene PorterStemmer, but I think it just happened to "be there". Can you remove the -XX:+AggressiveOpts flag (set either in the wrapper conf or in elasticsearch.in.sh) and see if it helps?

For what it's worth, using AggressiveOpts is generally a bad idea. Aside
from the stability issues (that are real), it sometimes enables
optimisations tailored for benchmark scores that may cause a slowdown
in certain real-life workloads.

Best,
Ismael


(Shay Banon) #6

My experience is a bit different: AggressiveOpts is a flag that was used for some time in quite important production systems (at my previous job), and it worked well...


(Shay Banon) #7

Grr, hit send prematurely. Where have you seen this flag having a bad effect?


(dbenson) #8

Removing -XX:+AggressiveOpts did the trick.

Any idea as to why this may have been the culprit? Or perhaps I should
stay on 1.6u21, since u23 didn't address my other issue.

Thanks,

David


(Shay Banon) #9

Well, it's good to know. 1.6u23 is new, so this flag might cause problems with it. I will remove it from being applied by default (in both the wrapper repo and the default elasticsearch.in.sh). Not sure what caused the crash, though...

I suggest you stay with 1.6u23 without the flag, even if it doesn't fix the issue.


(ijuma) #10

On Jan 19, 8:48 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Grr, hit send prematurely. Where have you seen this flag having a bad effect?

AggressiveOpts enables a set of optimizations. At some point (not sure
if it still does), it would enable a special HashMap that treated
Integers specially because it made some benchmark faster. According
to Doug Lea this was not a general win (which is why it was never enabled
without the switch). At other points, it enabled things that were not
ready and would cause crashes (like escape analysis, the
StringBuilder.toString optimisation and so on).

I follow HotSpot pretty closely and I see crashes being fixed for
optimizations that are enabled by AggressiveOpts more often than I
feel comfortable with. With the policy of updating JDK6 with a new
HotSpot every few releases, running it is risky enough even without
AggressiveOpts in my opinion. I can probably name 4 or 5 major JIT/GC
bugs in the JDK 6 series (ones that would lead to crashes or
miscompilation, were easy to trigger and/or affected major open-source
projects) that occurred without AggressiveOpts.

Best,
Ismael


(Shay Banon) #11

Right, well it used to work pretty well up to 1.6u23 :). I pushed a change to not use it by default :).


(ppearcy) #12

FYI, I just hit this issue on one out of three nodes (and repeatedly
on that node) after an upgrade to 0.15.2 from 0.14.2. Thought it was worth
mentioning, since it only affects a single node and it started after
the upgrade.

The stack trace looks very similar to the one captured by Ismael. I am
upgrading to the latest service wrapper, now. I expect that to resolve
things.

Thanks,
Paul


