JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access

David_Roberts_2 · July 22, 2014, 9:43am

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)
Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making increasing
use of "unsafe" functions in Java, presumably to speed things up, and some
CPUs are more picky than others about memory alignment. In particular, x86
will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch shows
that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java: return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java: return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS
error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method accepts
an arbitrary offset and passes it to an unsafe function with no check that
it's a multiple of 8:

public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
    long h1 = seed;
    long h2 = seed;

    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
        final int end = offset + len16;
        for (int i = offset; i < end; i += 16) {
            long k1 = UnsafeUtils.readLongLE(key, i);
            long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as
SPARC, Itanium and PowerPC that don't support unaligned 64 bit memory
access.
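One portable alternative, sketched here purely as an illustration and not as the change Elasticsearch made: assemble the little-endian long from single-byte loads, so no 8-byte load is ever issued at an unaligned address. The class name PortableReads and its readLongLE method are hypothetical; the cross-check uses the standard java.nio.ByteBuffer API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; not part of Elasticsearch.
public final class PortableReads {
    private PortableReads() {}

    /** Reads 8 bytes starting at offset as a little-endian long, using only single-byte loads. */
    public static long readLongLE(byte[] src, int offset) {
        long value = 0L;
        // Start with the most significant byte (highest index) and shift it up.
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (src[offset + i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] key = new byte[17];
        for (int i = 0; i < key.length; i++) {
            key[i] = (byte) (i * 31 + 7);
        }
        int offset = 3; // deliberately not a multiple of 8
        long viaBytes = readLongLE(key, offset);
        // Cross-check against the JDK's own little-endian read.
        long viaBuffer = ByteBuffer.wrap(key).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
        System.out.println(viaBytes == viaBuffer);
    }
}
```

The byte-wise read is slower than a single 8-byte load on x86, which is presumably why the unsafe path was attractive in the first place, but it behaves the same at any offset on any architecture.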

Does Elasticsearch have any policy for support of hardware other than x86?
If not, I don't think many people would care but you really ought to
clearly say so on your platform support page. If you do intend to support
non-x86 architectures then you need to be much more careful about the use
of unsafe memory accesses.

Regards,

David

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e57c7412-4878-4cd5-b21f-72b4d39e98f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · July 22, 2014, 11:33am

This is really an issue.

I could work on a pull request for better control of Unsafe usage, together
with a new setting "jvm.use.unsafe" (or something similar) that would be true
by default and auto-detectable, so that ES could also run out of the box on
other JVMs or big-endian platforms.
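A rough sketch of what such a switch could look like, under stated assumptions: "jvm.use.unsafe" is only the name floated above and not an existing Elasticsearch setting, the UnsafeGate class is hypothetical, and the list of architectures treated as tolerant of unaligned loads is an assumption made for illustration.

```java
import java.util.Locale;

// Hypothetical sketch; neither the class nor the setting exists in Elasticsearch.
public final class UnsafeGate {
    private UnsafeGate() {}

    /** Returns true only for os.arch values assumed here to tolerate unaligned loads (x86 family). */
    public static boolean archToleratesUnaligned(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("x86") || arch.equals("i386")
                || arch.equals("amd64") || arch.equals("x86_64");
    }

    /** An explicit setting wins; with no setting, fall back to auto-detection. */
    public static boolean useUnsafe(String explicitSetting, String osArch) {
        if (explicitSetting != null) {
            return Boolean.parseBoolean(explicitSetting);
        }
        return archToleratesUnaligned(osArch);
    }

    public static void main(String[] args) {
        String setting = System.getProperty("jvm.use.unsafe"); // null when not set
        String arch = System.getProperty("os.arch", "unknown");
        System.out.println("use unsafe: " + useUnsafe(setting, arch));
    }
}
```

With this shape, x86 users keep the fast path by default, while a SPARC JVM would fall back to safe reads unless the setting is forced on.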

Jörg

jpountz · July 22, 2014, 3:35pm

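Agreed that this is an issue! I opened "Internal: Remove unsafe unaligned memory access - illegal on SPARC" (issue #6962, elastic/elasticsearch on GitHub).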

--
Adrien Grand

David_Roberts_2 · July 23, 2014, 8:32am

Just wanted to say thanks for fixing this so quickly. I can see the code
change is already in the 1.2 branch.

On 22 July 2014 16:35, Adrien Grand adrien.grand@elasticsearch.com wrote:

Agreed that this is an issue! I opened
Internal: Remove unsafe unaligned memory access - illegal on SPARC · Issue #6962 · elastic/elasticsearch · GitHub

tony_aponte · August 21, 2014, 5:53pm

Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server to scale out
from a small x86 machine. I get a similar exception when running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message, I get the
error below in the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x0000000709cc09e7

I can run ES using 32-bit Java, but then I have to shrink ES_HEAP_SIZE more
than I want to. Any assistance would be appreciated.
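For what it's worth, the siginfo line in the crash report above looks consistent with a misaligned 4-byte read: the failing frame is Unsafe_GetInt, the signal code is BUS_ADRALN (invalid address alignment), and si_addr ends in 0xe7. A quick check of that address, using a hypothetical AlignmentCheck helper:

```java
// Hypothetical helper for illustration only.
public final class AlignmentCheck {
    private AlignmentCheck() {}

    /** True when address is a multiple of size; size must be a power of two. */
    public static boolean isAligned(long address, int size) {
        return (address & (size - 1)) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash report
        System.out.println(isAligned(siAddr, 4)); // prints "false"
        System.out.println(siAddr & 3);           // prints "3"
    }
}
```

The address is 3 bytes past a 4-byte boundary, which x86 would tolerate and SPARC rejects with SIGBUS.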

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts wrote:

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build

1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making
increasing use of "unsafe" functions in Java, presumably to speed things
up, and some CPUs are more picky than others about memory alignment. In
particular, x86 will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
Bug Database

A quick grep through the code of the two versions of Elasticsearch shows
that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum
UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:
return UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:
return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(),
b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum
UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:
return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:
return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS
error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method
accepts an arbitrary offset and passes it to an unsafe function with no
check that it's a multiple of 8:

    public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
        long h1 = seed;
        long h2 = seed;

        if (length >= 16) {

            final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
            final int end = offset + len16;
            for (int i = offset; i < end; i += 16) {
                long k1 = UnsafeUtils.readLongLE(key, i);
                long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as
SPARC, Itanium and PowerPC that don't support unaligned 64-bit memory
access.
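
For anyone who wants to see what an alignment-safe alternative to those two reads might look like, here is a minimal sketch. It assumes that assembling the value byte by byte (or going through a heap ByteBuffer) is acceptable for the caller. The class and method names are hypothetical and only mirror the readLongLE/readIntLE calls seen in the grep output; this is not the Elasticsearch UnsafeUtils code, nor necessarily how the project would fix it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not the Elasticsearch UnsafeUtils API.
final class AlignmentSafeReads {

    private AlignmentSafeReads() {}

    // Builds a little-endian long from 8 single-byte loads.
    // Single-byte loads have no alignment requirement, so any offset is fine.
    static long readLongLE(byte[] bytes, int offset) {
        return (bytes[offset] & 0xFFL)
                | (bytes[offset + 1] & 0xFFL) << 8
                | (bytes[offset + 2] & 0xFFL) << 16
                | (bytes[offset + 3] & 0xFFL) << 24
                | (bytes[offset + 4] & 0xFFL) << 32
                | (bytes[offset + 5] & 0xFFL) << 40
                | (bytes[offset + 6] & 0xFFL) << 48
                | (bytes[offset + 7] & 0xFFL) << 56;
    }

    // Same idea for a little-endian int (4 single-byte loads).
    static int readIntLE(byte[] bytes, int offset) {
        return (bytes[offset] & 0xFF)
                | (bytes[offset + 1] & 0xFF) << 8
                | (bytes[offset + 2] & 0xFF) << 16
                | (bytes[offset + 3] & 0xFF) << 24;
    }

    // Alternative using only the public java.nio API instead of sun.misc.Unsafe.
    static long readLongLEViaBuffer(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
    }

    public static void main(String[] args) {
        byte[] key = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        // Offset 1 is deliberately not a multiple of 8.
        System.out.println(Long.toHexString(readLongLE(key, 1)));
        System.out.println(Long.toHexString(readLongLEViaBuffer(key, 1)));
        System.out.println(Integer.toHexString(readIntLE(key, 3)));
    }
}
```

The byte-wise version trades some speed for portability; a real fix could also keep the fast path on platforms known to tolerate unaligned access and fall back to something like this elsewhere.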

Does Elasticsearch have any policy for support of hardware other than
x86? If not, I don't think many people would care, but you really ought to
say so clearly on your platform support page. If you do intend to support
non-x86 architectures, then you need to be much more careful about the use
of unsafe memory accesses.

Regards,

David

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jpountz · August 21, 2014, 10:36pm

Hi Tony,

Do you have more information in the core dump file? (cf. the "Core dump
written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony.aponte@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server to scale out
from a small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message, I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x0000000709cc09e7
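
As a quick sanity check on the siginfo line above: BUS_ADRALN means an address alignment fault, and the faulting address can be tested against the access size with a bit mask. The sketch below is only an illustration; the class and method names are made up for this post, and the 4-byte access size is an assumption based on the Unsafe_GetInt frame.

```java
// Hypothetical helper for illustration; names are not from the JVM or Elasticsearch.
final class AlignmentCheck {

    private AlignmentCheck() {}

    // Returns how many bytes the address sits past the previous aligned boundary.
    // accessSize must be a power of two (1, 2, 4, 8, ...).
    static long misalignment(long address, int accessSize) {
        return address & (accessSize - 1);
    }

    // An address is aligned for an access when the low bits are all zero.
    static boolean isAligned(long address, int accessSize) {
        return misalignment(address, accessSize) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash log
        int intSize = 4;                   // assumed: Unsafe_GetInt reads 4 bytes
        System.out.println("aligned for int read: " + isAligned(siAddr, intSize));
        System.out.println("bytes past boundary:  " + misalignment(siAddr, intSize));
    }
}
```

The address ends in 0xe7, so its low two bits are set and it is not 4-byte aligned, which is consistent with a strict-alignment CPU raising SIGBUS on an int load.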

I can run ES using 32-bit Java but have to shrink ES_HEAP_SIZE more than I
want to. Any assistance would be appreciated.

Regards,
Tony


--
Adrien Grand

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j5wVDYCqk4CV82vM%3D-MmihK3HowY_9Bm5Rr%2B5renMHTww%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

tony_aponte · August 22, 2014, 7:58pm

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the process image.
Tony


Robert_Muir_2 · August 23, 2014, 2:37am

How big is it? Maybe I can have it anyway? I pulled two ancient UltraSPARCs
out of my closet to try to debug your issue, but unfortunately they are a
pita to work with (dead NVRAM battery on both, zeroed MAC address, etc.).
I'd still love to get to the bottom of this.

jprante · August 23, 2014, 9:25am

I have some Solaris 10 SPARC V440/V445 servers available and can try to
reproduce this over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir robert.muir@elasticsearch.com
wrote:

How big is it? Maybe i can have it anyway? I pulled two ancient
ultrasparcs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead nvram battery on both, zeroed mac
address, etc.) Id still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony.aponte@iqor.com wrote:
Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the process
image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:
Hi Tony,

Do you have more information in the core dump file? (cf. the "Core dump
written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:
Hello,
I installed ES 1.3.2 on a spare Solaris 11/ T4-4 SPARC server to scale
out of small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core

or core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}"
daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,
0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7

I can run ES using 32bit java but have to shrink ES_HEAPS_SIZE more
than I want to. Any assistance would be appreciated.

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts wrote:
Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build

1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making
increasing use of "unsafe" functions in Java, presumably to speed things
up, and some CPUs are more picky than others about memory alignment. In
particular, x86 will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
Bug Database

A quick grep through the code of the two versions of Elasticsearch
shows that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: if (id == -1L ||
UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: } else if (UnsafeUtils.equals(key,
get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import org.elasticsearch.common.util.
UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReferenc
e.java: return UnsafeUtils.equals(a.array(),
a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.
java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.
java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java: return
UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import org.elasticsearch.common.util.
UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM
SIGBUS error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method
accepts an arbitrary offset and passes it to an unsafe function with no
check that it's a multiple of 8:
public static Hash128 hash128(byte[] key, int offset, int length,
long seed, Hash128 hash) {
long h1 = seed;
long h2 = seed;
    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple
of 16 that is lower than or equal to length
final int end = offset + len16;
for (int i = offset; i < end; i += 16) {
long k1 = UnsafeUtils.readLongLE(key, i);
long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such
as SPARC, Itanium and PowerPC that don't support unaligned 64 bit memory
access.

Does Elasticsearch have any policy for support of hardware other than
x86? If not, I don't think many people would care but you really ought to
clearly say so on your platform support page. If you do intend to support
non-x86 architectures then you need to be much more careful about the use
of unsafe memory accesses.

Regards,

David

jprante · August 23, 2014, 10:29am

I tested a simple "Hello World" document on Elasticsearch 1.3.2 with Oracle
JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in elasticsearch.yml, the
mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

I have some Solaris 10 Sparc V440/V445 servers available and can try to
reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <
robert.muir@elasticsearch.com> wrote:
How big is it? Maybe I can have it anyway? I pulled two ancient
UltraSPARCs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead NVRAM battery on both, zeroed MAC
address, etc.). I'd still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony.aponte@iqor.com wrote:
Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the process
image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:
Hi Tony,

Do you have more information in the core dump file? (cf. the "Core dump
written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:
Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server to scale
out from a small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7
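
As a worked check of the siginfo line above: BUS_ADRALN is the si_code for an invalid address alignment, and the reported si_addr ends in 0xe7, which is not a multiple of 4. That is consistent with a 4-byte read in Unsafe_GetInt faulting on a misaligned address. The short sketch below just does the arithmetic; the class and method names are made up for illustration.

```java
// Hypothetical helper for checking how far an address is from the
// alignment boundary required by an access of the given size.
class AlignmentCheck {

    // accessSize must be a power of two (e.g. 4 for an int, 8 for a long).
    // Returns 0 when the address is aligned for that access size.
    static long misalignment(long address, int accessSize) {
        return address & (accessSize - 1);
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash report
        System.out.println(misalignment(siAddr, 4)); // prints 3
        System.out.println(misalignment(siAddr, 8)); // prints 7
    }
}
```

A non-zero result means the address is off the boundary by that many bytes, so neither a 4-byte nor an 8-byte aligned load could succeed at this address.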

I can run ES using 32-bit Java but have to shrink ES_HEAP_SIZE more
than I want to. Any assistance would be appreciated.

Regards,
Tony


tony_aponte · August 25, 2014, 7:41pm

It's as big as my ES_HEAP_SIZE parameter, 30g.

Tony

On Friday, August 22, 2014 10:37:39 PM UTC-4, Robert Muir wrote:

How big is it? Maybe I can have it anyway? I pulled two ancient
UltraSPARCs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead NVRAM battery on both, zeroed MAC
address, etc.). I'd still love to get to the bottom of this.

tony_aponte · August 25, 2014, 7:53pm

I have no plugins installed (yet) and only changed "es.logger.level" to
DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6", "s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

I tested a simple "Hello World" document on Elasticsearch 1.3.2 with
Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in elasticsearch.yml, the
mappings, and the installed plugins.

Jörg


tony_aponte · August 25, 2014, 8:31pm

I was able to trim the heap size and, consequently, the core file down to
about 530m.

Tony

On Monday, August 25, 2014 3:41:14 PM UTC-4, tony....@iqor.com wrote:

It's as big as my ES_HEAP_SIZE parameter, 30g.

Tony


On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts wrote:
Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM
core dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build

1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making
increasing use of "unsafe" functions in Java, presumably to speed things
up, and some CPUs are more picky than others about memory alignment. In
particular, x86 will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
Bug Database

A quick grep through the code of the two versions of Elasticsearch
shows that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: if (id == -1L ||
UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: } else if (UnsafeUtils.equals(key,
get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import org.elasticsearch.common.util.
UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReferenc
e.java: return UnsafeUtils.equals(a.array(),
a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.
java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.
java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java: return
UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import org.elasticsearch.common.util.
UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM
SIGBUS error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method
accepts an arbitrary offset and passes it to an unsafe function with no
check that it's a multiple of 8:
public static Hash128 hash128(byte[] key, int offset, int length, 
long seed, Hash128 hash) {
long h1 = seed;
long h2 = seed;
    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple 
of 16 that is lower than or equal to length
final int end = offset + len16;
for (int i = offset; i < end; i += 16) {
long k1 = UnsafeUtils.readLongLE(key, i);
long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such
as SPARC, Itanium and PowerPC that don't support unaligned 64 bit memory
access.

Does Elasticsearch have any policy for support of hardware other than
x86? If not, I don't think many people would care but you really ought to
clearly say so on your platform support page. If you do intend to support
non-x86 architectures then you need to be much more careful about the use
of unsafe memory accesses.

Regards,

David
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/
msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%
40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.
--
Adrien Grand
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/12aa33de-ccc7-485a-8c52-562f3e91a535%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/12aa33de-ccc7-485a-8c52-562f3e91a535%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/54901f16-a43e-4508-abc3-dce2e9ab88a4%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

tony_aponte · August 25, 2014, 9:30pm

I captured a WireShark trace of the interaction between ES and Logstash
1.4.1. The error occurs even before my data is sent. Can you try to
reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contents of file 'y':
{
  "template" : "logstash-*",
  "settings" : { "index.refresh_interval" : "5s" },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : true },
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : { "type" : "string", "index" : "not_analyzed", "ignore_above" : 256 }
            }
          }
        }
      } ],
      "properties" : {
        "@version" : { "type" : "string", "index" : "not_analyzed" },
        "geoip" : {
          "type" : "object",
          "dynamic" : true,
          "path" : "full",
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
    }
  }
}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:

I have no plugins installed (yet) and only changed "es.logger.level" to
DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6", "s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:
I tested a simple "Hello World" document on Elasticsearch 1.3.2 with
Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in elasticsearch.yml, the
mappings, and the installed plugins.

Jörg

jprante · August 26, 2014, 5:54pm

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt
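
For illustration, an alignment-safe alternative to an Unsafe-based integer read can assemble the value from individual bytes. This is only a minimal sketch, not the LZF library's code: the class and method names (SafeReads, getIntBE) are hypothetical, and big-endian byte order is assumed from the "BE" suffix of the encoder class. Single-byte loads have no alignment requirement, so the read should not raise SIGBUS whatever the offset, at the cost of four loads instead of one.

```java
public final class SafeReads {
    private SafeReads() {}

    /** Reads a big-endian 32-bit int starting at buf[offset], one byte at a time. */
    public static int getIntBE(byte[] buf, int offset) {
        // Mask each byte to avoid sign extension before shifting it into place.
        return ((buf[offset] & 0xFF) << 24)
             | ((buf[offset + 1] & 0xFF) << 16)
             | ((buf[offset + 2] & 0xFF) << 8)
             |  (buf[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x12, 0x34, 0x56, 0x78};
        // Offset 1 is not a multiple of 4; the byte-wise read does not care.
        System.out.println(Integer.toHexString(getIntBE(data, 1))); // prints 12345678
    }
}
```

The trade-off is speed: a single wide load is cheaper than four narrow ones, which is presumably why the Unsafe path exists in the first place.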

I have created a gist of the JVM crash file at
https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

hs_err_pid23430.log (excerpt):

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xffffffff7e51d838, pid=23430, tid=51
#
# JRE version: Java(TM) SE Runtime Environment (8.0_11-b12) (build 1.8.0_11-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.11-b03 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xd1d838]  Unsafe_GetInt+0x174
#

There has been a fix in LZF lately (commit ning/compress@db7f51b on GitHub)
for version 1.0.3, which has been released recently.

I will build a snapshot ES version with LZF 1.0.3 and see if this works...

Jörg

jprante · August 26, 2014, 6:17pm

Still broken with lzf-compress 1.0.3

gist.github.com

https://gist.github.com/jprante/d2d829b497db4963aea5

hs_err_pid23875.log

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0xffffffff7e51d838, pid=23875, tid=48
#
# JRE version: Java(TM) SE Runtime Environment (8.0_11-b12) (build 1.8.0_11-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.11-b03 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# V  [libjvm.so+0xd1d838]  Unsafe_GetInt+0x174
#

(The log is truncated here; the gist has the full file.)
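For readers following along: the problematic frame, Unsafe_GetInt, is a raw 4-byte load, and on SPARC such a load has to be 4-byte aligned. Below is a minimal sketch of reading a big-endian int from an arbitrary byte[] offset without sun.misc.Unsafe. The class name `AlignmentSafeReads` and its methods are hypothetical, made up for illustration; this is not code from Elasticsearch or lzf-compress.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration; not part of Elasticsearch or lzf-compress. */
final class AlignmentSafeReads {
    private AlignmentSafeReads() {}

    /** Builds a big-endian int from four single-byte loads; byte loads have no alignment requirement. */
    static int readIntBE(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
             | ((buf[offset + 1] & 0xFF) << 16)
             | ((buf[offset + 2] & 0xFF) << 8)
             |  (buf[offset + 3] & 0xFF);
    }

    /** Same read through ByteBuffer, which is bounds-checked and accepts any offset. */
    static int readIntBEViaBuffer(byte[] buf, int offset) {
        return ByteBuffer.wrap(buf).order(ByteOrder.BIG_ENDIAN).getInt(offset);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x12, 0x34, 0x56, 0x78, (byte) 0x9A};
        // Offset 1 is deliberately not a multiple of 4.
        System.out.println(Integer.toHexString(readIntBE(data, 1)));
        System.out.println(Integer.toHexString(readIntBEViaBuffer(data, 1)));
    }
}
```

Both methods return 0x12345678 for the sample array at offset 1. The byte-wise version costs a few extra instructions per read, which is presumably why the unsafe encoder exists in the first place.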

Jörg


jprante · August 26, 2014, 7:41pm

I fixed the issue by selecting the safe LZF encoder in LZFCompressor and
opened a pull request:

Add LZF safe encoder in LZFCompressor (Pull Request #7466, elastic/elasticsearch on GitHub)

elastic:master ← jprante:safe-encoder-compress, opened by jprante at 07:38PM on 26 Aug 14 UTC, +12 -6

Selecting the safe encoder fixes a 64-bit JVM crash on big-endian architectures with the LZF UnsafeChunkEncoderBE. One such big-endian architecture is 64-bit Solaris SPARC (another is POWER). Without the safe encoder, LZF uses the unsafe encoder and crashes when, for example, this command is executed:

```
PUT /_template/logstash
{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : true },
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string",
            "index" : "analyzed",
            "omit_norms" : true,
            "fields" : {
              "raw" : { "type": "string", "index" : "not_analyzed", "ignore_above" : 256 }
            }
          }
        }
      } ],
      "properties" : {
        "@version": { "type": "string", "index": "not_analyzed" },
        "geoip" : {
          "type" : "object",
          "dynamic": true,
          "path": "full",
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
    }
  }
}
```

A crash file is available at https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b
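The pull request simply selects the safe encoder everywhere. As a rough illustration of the trade-off, here is a sketch of the alternative: keep a fast path only on architectures assumed to tolerate unaligned loads, and fall back to the safe path otherwise. Every name here (`IntReader`, `SafeIntReader`, `IntReaders`) is hypothetical, and the `os.arch` allow-list is an assumption for the example; none of this is the lzf-compress or Elasticsearch API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical abstraction over "read 4 bytes as an int"; not a real library type. */
interface IntReader {
    int readInt(byte[] buf, int offset);
}

/** Safe path: ByteBuffer reads are bounds-checked and work at any offset. */
final class SafeIntReader implements IntReader {
    @Override
    public int readInt(byte[] buf, int offset) {
        // ByteBuffer defaults to big-endian byte order.
        return ByteBuffer.wrap(buf).getInt(offset);
    }
}

final class IntReaders {
    // Assumption for illustration: os.arch values where unaligned loads are tolerated.
    private static final Set<String> UNALIGNED_OK =
            new HashSet<>(Arrays.asList("x86", "i386", "amd64", "x86_64"));

    private IntReaders() {}

    static boolean toleratesUnaligned(String osArch) {
        return UNALIGNED_OK.contains(osArch);
    }

    /** Returns the fast reader only on allow-listed architectures, otherwise the safe one. */
    static IntReader select(String osArch, IntReader fast) {
        return (fast != null && toleratesUnaligned(osArch)) ? fast : new SafeIntReader();
    }
}
```

A caller would pass `System.getProperty("os.arch")`, which on 64-bit Solaris SPARC is typically "sparcv9", so the safe reader would be chosen there. An allow-list fails safe on unknown architectures, at the cost of missing the fast path on CPUs that could have used it.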

Jörg


Ivan · August 27, 2014, 5:01pm

Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

Add LZF safe encoder in LZFCompressor by jprante · Pull Request #7466 · elastic/elasticsearch · GitHub

Jörg

sRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReferenc
e.java: return UnsafeUtils.equals(a.array(),
a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.ja
va: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.ja
va: long k2 = UnsafeUtils.readLongLE(key, i +
8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/
cardinality/HyperLogLogPlusPlus.java: return
UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the
JVM SIGBUS error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128
method accepts an arbitrary offset and passes it to an unsafe function with
no check that it's a multiple of 8:
public static Hash128 hash128(byte[] key, int offset, int
length, long seed, Hash128 hash) {
long h1 = seed;
long h2 = seed;
    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher
multiple of 16 that is lower than or equal to length
final int end = offset + len16;
for (int i = offset; i < end; i += 16) {
long k1 = UnsafeUtils.readLongLE(key, i);
long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures
such as SPARC, Itanium and PowerPC that don't support unaligned 64 bit
memory access.

Does Elasticsearch have any policy for support of hardware
other than x86? If not, I don't think many people would care but you
really ought to clearly say so on your platform support page. If you do
intend to support non-x86 architectures then you need to be much more
careful about the use of unsafe memory accesses.

Regards,

David
--
You received this message because you are subscribed to the
Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from
it, send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63
e-4c2e-87c3-029fc58449fc%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.
--
Adrien Grand
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/12aa33de-
ccc7-485a-8c52-562f3e91a535%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/12aa33de-ccc7-485a-8c52-562f3e91a535%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/
CAMUKNZXOKeJq8Datx2KY7cSfJXDH1YGDNmQjNWDQ2jci%3DfN31Q%
40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAMUKNZXOKeJq8Datx2KY7cSfJXDH1YGDNmQjNWDQ2jci%3DfN31Q%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/c62191ea-543b-462d-95e9-aff125c0a6f0%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/c62191ea-543b-462d-95e9-aff125c0a6f0%40googlegroups.com?utm_medium=email&utm_source=footer
.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHrOOqhgOSiRhmweSR5wLs%2BJiO70_CSRO%2BFS2zOU9VKzg%40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHrOOqhgOSiRhmweSR5wLs%2BJiO70_CSRO%2BFS2zOU9VKzg%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALY%3DcQAvCoJwN8tSJXa8%3DZHMDYw_mpHc0Q866fcso_1LZCFiyw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

jprante · August 27, 2014, 5:16pm

All praise should go to the fantastic Elasticsearch team, who did not
hesitate to test the fix immediately and replaced it with a better-working
solution, since the lzf-compress library has weaknesses regarding
thread safety.

Jörg

On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic ivan@brusic.com wrote:

Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:
I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

Add LZF safe encoder in LZFCompressor by jprante · Pull Request #7466 · elastic/elasticsearch · GitHub

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:
Still broken with lzf-compress 1.0.3

Solaris SPARC JVM 64bit crash with Java 8u11 and Elasticsearch 1.3.3-SNAPSHOT (with lzf-compress 1.0.3) · GitHub

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:
Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt,
which in turn uses sun.misc.Unsafe.getInt.
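
To illustrate why an Unsafe-based int read can fault on SPARC while a byte-wise read cannot, here is a minimal sketch of an alignment-independent big-endian read. The names SafeIntRead and getIntBE are hypothetical and are not taken from lzf-compress or Elasticsearch; the sketch only assumes that a "safe" encoder assembles the value from single-byte loads, which have no alignment requirement.

```java
final class SafeIntRead {
    // Assemble a 32-bit big-endian value from four single-byte loads.
    // Byte loads carry no alignment requirement, so any offset is valid.
    static int getIntBE(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
             | ((buf[offset + 1] & 0xFF) << 16)
             | ((buf[offset + 2] & 0xFF) << 8)
             |  (buf[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x12, 0x34, 0x56, 0x78, 0x00};
        // Offset 1 is not a multiple of 4; the read still succeeds.
        System.out.println(Integer.toHexString(getIntBE(data, 1))); // prints 12345678
    }
}
```

The trade-off is four loads plus shifts instead of one wide load, which is presumably why the faster Unsafe variant was preferred on platforms that tolerate unaligned access.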

I have created a gist of the JVM crash file at

Solaris SPARC core dump with Java 8u11 64bit and Elasticsearch 1.3.2 · GitHub

There has been a fix in LZF lately
Streamling #37 fix a bit; would be neat to have a machine to run BE t… · ning/compress@db7f51b · GitHub

for version 1.0.3 which has been released recently.

I will build a snapshot ES version with LZF 1.0.3 and see if this
works...

Jörg

On Mon, Aug 25, 2014 at 11:30 PM, tony.aponte@iqor.com wrote:
I captured a WireShark trace of the interaction between ES and
Logstash 1.4.1. The error occurs even before my data is sent. Can you try
to reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contents of file 'y':
{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : true},
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
            }
          }
        }
      } ],
      "properties" : {
        "@version": { "type": "string", "index": "not_analyzed" },
        "geoip" : {
          "type" : "object",
          "dynamic": true,
          "path": "full",
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
    }
  }
}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:
I have no plugins installed (yet) and only changed "es.logger.level"
to DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6", "s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:
I tested a simple "Hello World" document on Elasticsearch 1.3.2 with
Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in
elasticsearch.yml, the mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com <
joerg...@gmail.com> wrote:
I have some Solaris 10 Sparc V440/V445 servers available and can
try to reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <
rober...@elasticsearch.com> wrote:
How big is it? Maybe I can have it anyway? I pulled two ancient
UltraSPARCs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead NVRAM battery on both, zeroed MAC
address, etc.). I'd still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony....@iqor.com wrote:
Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the
process image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:
Hi Tony,

Do you have more information in the core dump file? (cf. the
"Core dump written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:
Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server
to scale out from a small x86 machine. I get a similar exception running ES
with JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x0000000709cc09e7

I can run ES using 32-bit Java but have to shrink ES_HEAP_SIZE
more than I want to. Any assistance would be appreciated.

Regards,
Tony
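
The siginfo line in the log above can be checked by hand: BUS_ADRALN denotes an address-alignment fault, and the faulting frame is Unsafe_GetInt, a 4-byte load. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and the requirement that an int load be 4-byte aligned on SPARC is the assumption being illustrated.

```java
final class AlignmentCheck {
    // True when addr is a multiple of size (size must be a power of two).
    static boolean isAligned(long addr, int size) {
        return (addr & (size - 1)) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr copied from the crash log
        System.out.println("remainder mod 4: " + (siAddr & 3));          // prints 3
        System.out.println("aligned for int: " + isAligned(siAddr, 4));  // prints false
    }
}
```

The address is three bytes past a 4-byte boundary, which is consistent with an int being read at an arbitrary offset into a byte array.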


tony_aponte · August 27, 2014, 5:35pm

Kudos!

Tony

On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

All praise should go to the fantastic Elasticsearch team who did not
hesitate to test the fix immediately and replaced it with a better working
solution, since the lzf-compress software is having weaknesses regarding
threadsafety.

Jörg

On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic <iv...@brusic.com
<javascript:>> wrote:
Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:
I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

Add LZF safe encoder in LZFCompressor by jprante · Pull Request #7466 · elastic/elasticsearch · GitHub

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:
Still broken with lzf-compress 1.0.3

Solaris SPARC JVM 64bit crash with Java 8u11 and Elasticsearch 1.3.3-SNAPSHOT (with lzf-compress 1.0.3) · GitHub

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:
Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt

I have created a gist of the JVM crash file at

Solaris SPARC core dump with Java 8u11 64bit and Elasticsearch 1.3.2 · GitHub

There has been a fix in LZF lately
Streamling #37 fix a bit; would be neat to have a machine to run BE t… · ning/compress@db7f51b · GitHub

for version 1.0.3 which has been released recently.

I will build a snapshot ES version with LZF 1.0.3 and see if this
works...

Jörg

On Mon, Aug 25, 2014 at 11:30 PM, <tony....@iqor.com <javascript:>>
wrote:
I captured a WireShark trace of the interaction between ES and
Logstash 1.4.1. The error occurs even before my data is sent. Can you try
to reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contests of file 'y":
{ "template" : "logstash-", "settings" : {
"index.refresh_interval" : "5s" }, "mappings" : { "default" : {
"_all" : {"enabled" : true}, "dynamic_templates" : [ {
"string_fields" : { "match" : "", "match_mapping_type"
: "string", "mapping" : { "type" : "string", "index"
: "analyzed", "omit_norms" : true, "fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" :
256} } } } } ], "properties" :
{ "@version": { "type": "string", "index": "not_analyzed" },
"geoip" : { "type" : "object", "dynamic": true,
"path": "full", "properties" : {
"location" : { "type" : "geo_point" } } } } }
}}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:
I have no plugins installed (yet) and only changed "es.logger.level"
to DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains
actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6",
"s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:
I tested a simple "Hello World" document on Elasticsearch 1.3.2
with Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default
settings.

No issues.

So I would like to know more about the settings in
elasticsearch.yml, the mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com <
joerg...@gmail.com> wrote:
I have some Solaris 10 Sparc V440/V445 servers available and can
try to reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <
rober...@elasticsearch.com> wrote:
How big is it? Maybe i can have it anyway? I pulled two ancient
ultrasparcs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead nvram battery on both, zeroed mac
address, etc.) Id still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony....@iqor.com wrote:
Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the
process image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand
wrote:
Hi Tony,

Do you have more information in the core dump file? (cf. the
"Core dump written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:
Hello,
I installed ES 1.3.2 on a spare Solaris 11/ T4-4 SPARC server
to scale out of small x86 machine. I get a similar exception running ES
with JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime

Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed

mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location:

/export/home/elasticsearch/elasticsearch-1.3.2/core or
core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O
worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,
0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7

I can run ES using 32bit java but have to shrink ES_HEAPS_SIZE
more than I want to. Any assistance would be appreciated.

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts
wrote:
Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting
JVM core dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime

Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13)

(build 1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed

mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is
making increasing use of "unsafe" functions in Java, presumably to speed
things up, and some CPUs are more picky than others about memory
alignment. In particular, x86 will tolerate misaligned memory access
whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
Bug Database

A quick grep through the code of the two versions of
Elasticsearch shows that the new use of "unsafe" memory access functions is
in the BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: if (id == -1L ||
UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: } else if
(UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java: return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java: return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the
JVM SIGBUS error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128
method accepts an arbitrary offset and passes it to an unsafe function with
no check that it's a multiple of 8:

    public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
        long h1 = seed;
        long h2 = seed;

        if (length >= 16) {

            final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
            final int end = offset + len16;
            for (int i = offset; i < end; i += 16) {
                long k1 = UnsafeUtils.readLongLE(key, i);
                long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on
architectures such as SPARC, Itanium and PowerPC that don't support
unaligned 64 bit memory access.
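
To illustrate the kind of read that cannot fault regardless of the offset, here is a minimal sketch. It assumes a hypothetical helper class, SafeReads, which is not part of Elasticsearch; the method names only mirror the UnsafeUtils ones for readability. Each value is assembled from single-byte loads, so the JVM never issues a 4 or 8 byte load at a misaligned address:

```java
// Hypothetical helper (NOT an Elasticsearch class): little-endian reads
// built from single-byte array accesses, which are safe at any offset.
final class SafeReads {
    private SafeReads() {}

    /** Reads 8 bytes starting at offset as a little-endian long. */
    static long readLongLE(byte[] key, int offset) {
        long value = 0;
        // Start from the most significant byte (highest address) and shift down.
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (key[offset + i] & 0xFFL);
        }
        return value;
    }

    /** Reads 4 bytes starting at offset as a little-endian int. */
    static int readIntLE(byte[] key, int offset) {
        return (key[offset] & 0xFF)
                | (key[offset + 1] & 0xFF) << 8
                | (key[offset + 2] & 0xFF) << 16
                | (key[offset + 3] & 0xFF) << 24;
    }
}
```

This trades some speed for portability. Wrapping the array in a java.nio.ByteBuffer with ByteOrder.LITTLE_ENDIAN and calling getLong(offset) should give the same result without hand-written shifting.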

Does Elasticsearch have any policy for support of hardware
other than x86? If not, I don't think many people would care but you
really ought to clearly say so on your platform support page. If you do
intend to support non-x86 architectures then you need to be much more
careful about the use of unsafe memory accesses.
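
One conservative way to be careful, sketched below under stated assumptions, is to enable the unsafe fast path only on architectures known to tolerate unaligned loads and to use plain byte-wise reads everywhere else. The class name UnalignedAccess and the list of os.arch values are illustrative choices, not anything taken from Elasticsearch:

```java
// Hypothetical gate (NOT an Elasticsearch class): decides once, at class
// load time, whether unaligned multi-byte loads may be attempted.
final class UnalignedAccess {
    private UnalignedAccess() {}

    /**
     * Assumption: only x86-family values of the "os.arch" system property
     * are treated as safe; anything else (for example a SPARC value)
     * falls back to the slow path.
     */
    static boolean isKnownSafe(String osArch) {
        return osArch.equals("amd64")
                || osArch.equals("x86_64")
                || osArch.equals("x86")
                || osArch.equals("i386");
    }

    /** True if the current JVM reports an architecture on the safe list. */
    static final boolean ALLOWED = isKnownSafe(System.getProperty("os.arch", ""));
}
```

Callers would check UnalignedAccess.ALLOWED before taking the unsafe route. A whitelist is deliberately pessimistic: an unknown platform loses some speed instead of crashing the JVM.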

Regards,

David
--
You received this message because you are subscribed to the
Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from
it, send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.
--
Adrien Grand