JVM crash on 64 bit SPARC with Elasticsearch 1.2.2 due to unaligned memory access


(David Roberts-2) #1

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)
Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making increasing
use of "unsafe" functions in Java, presumably to speed things up, and some
CPUs are more picky than others about memory alignment. In particular, x86
will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch shows
that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java: return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java: long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java: } else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java: return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java: return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS
error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method accepts
an arbitrary offset and passes it to an unsafe function with no check that
it's a multiple of 8:

public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
    long h1 = seed;
    long h2 = seed;

    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
        final int end = offset + len16;
        for (int i = offset; i < end; i += 16) {
            long k1 = UnsafeUtils.readLongLE(key, i);
            long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as
SPARC, Itanium and PowerPC that don't support unaligned 64 bit memory
access.
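
For illustration, an alignment-independent read could be sketched as below. This is not the fix that went into Elasticsearch; the class name PortableReads is made up for the example, and only the method name readLongLE is borrowed from the grep output above. It assembles the value one byte at a time, so no 8-byte load is ever issued and the offset does not have to be a multiple of 8.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Elasticsearch.
final class PortableReads {
    private PortableReads() {}

    /** Reads 8 bytes starting at offset as a little-endian long, byte by byte. */
    static long readLongLE(byte[] b, int offset) {
        long value = 0;
        // Start with the most significant byte (highest index) and shift it up.
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (b[offset + i] & 0xFFL);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] key = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        // Offset 1 is deliberately not a multiple of 8.
        long viaBytes = readLongLE(key, 1);
        // Cross-check against the standard library's little-endian view.
        long viaBuffer = ByteBuffer.wrap(key, 1, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
        System.out.println(viaBytes == viaBuffer);
    }
}
```

The trade-off is speed: on x86 a single unaligned load is likely faster than eight byte loads, which is presumably why the unsafe path was introduced in the first place.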

Does Elasticsearch have any policy for support of hardware other than x86?
If not, I don't think many people would care but you really ought to
clearly say so on your platform support page. If you do intend to support
non-x86 architectures then you need to be much more careful about the use
of unsafe memory accesses.

Regards,

David

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e57c7412-4878-4cd5-b21f-72b4d39e98f1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Jörg Prante) #2

This is really an issue.

I could work on a pull request for better control of Unsafe usage, together
with a new setting "jvm.use.unsafe" (or something similar) which should be
true by default and auto-detectable, so that ES could also run out of the box
on other JVMs or big-endian platforms.
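
A rough sketch of what such a switch might look like follows. Everything here is an assumption made for illustration: the class and method names are invented, "jvm.use.unsafe" is read as a plain JVM system property, and the auto-detection simply allows unsafe reads only on x86-family values of os.arch.

```java
import java.util.Locale;

// Hypothetical sketch of the proposed setting; not actual Elasticsearch code.
final class UnsafeSwitch {
    /** Setting name floated above, treated here as a system property (assumption). */
    static final String SETTING = "jvm.use.unsafe";

    private UnsafeSwitch() {}

    /** True for os.arch values that denote x86-family CPUs, which tolerate misaligned loads. */
    static boolean toleratesUnalignedAccess(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("x86") || arch.equals("i386")
            || arch.equals("amd64") || arch.equals("x86_64");
    }

    /** An explicit setting wins; otherwise fall back to auto-detection. */
    static boolean useUnsafe(String explicitSetting, String osArch) {
        if (explicitSetting != null) {
            return Boolean.parseBoolean(explicitSetting);
        }
        return toleratesUnalignedAccess(osArch);
    }

    public static void main(String[] args) {
        boolean use = useUnsafe(System.getProperty(SETTING), System.getProperty("os.arch"));
        System.out.println("use unsafe reads: " + use);
    }
}
```

With this shape, starting the JVM with -Djvm.use.unsafe=false would force the safe path on any platform, while an unset property leaves the decision to the architecture check.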

Jörg


(Adrien Grand) #3

Agreed that this is an issue! I opened
https://github.com/elasticsearch/elasticsearch/issues/6962

--
Adrien Grand


(David Roberts-2) #4

Just wanted to say thanks for fixing this so quickly. I can see the code
change is already in the 1.2 branch.

On 22 July 2014 16:35, Adrien Grand adrien.grand@elasticsearch.com wrote:

Agreed that this is an issue! I opened
https://github.com/elasticsearch/elasticsearch/issues/6962


(tony.aponte) #5

Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server to scale out
from a small x86 machine. I get a similar exception when running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message, I get the
error below in the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x0000000709cc09e7

I can run ES using 32-bit Java, but then I have to shrink ES_HEAP_SIZE more
than I want to. Any assistance would be appreciated.
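
As a quick sanity check on the siginfo line in the crash report above, the arithmetic below shows that the reported si_addr is not a multiple of 4. It assumes that si_addr is the operand address of the 4-byte load inside Unsafe_GetInt, which is what BUS_ADRALN suggests; the class and method names are made up for the example.

```java
// Hypothetical helper for checking the alignment of a faulting address.
final class AlignmentCheck {
    private AlignmentCheck() {}

    /** True if address is a multiple of size; size must be a power of two. */
    static boolean isAligned(long address, int size) {
        return (address & (size - 1)) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash report
        System.out.println(isAligned(siAddr, 4)); // prints "false"
        System.out.println(siAddr & 3);           // prints "3"
    }
}
```

The address ends in 0xe7, so it is 3 bytes past a 4-byte boundary, which fits a 4-byte read from an arbitrary offset within a byte array.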

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts wrote:

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build

1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making
increasing use of "unsafe" functions in Java, presumably to speed things
up, and some CPUs are more picky than others about memory alignment. In
particular, x86 will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch shows
that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum
UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:
return UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:
return UnsafeUtils.equals(a.array(), a.arrayOffset(), b.array(),
b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
long k2 = UnsafeUtils.readLongLE(key, i + 8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public enum
UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metrics/cardinality/HyperLogLogPlusPlus.java:
return UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/BytesRefComparisonsBenchmark.java:
return UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the JVM SIGBUS
error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128 method
accepts an arbitrary offset and passes it to an unsafe function with no
check that it's a multiple of 8:

public static Hash128 hash128(byte[] key, int offset, int length, long seed, Hash128 hash) {
    long h1 = seed;
    long h2 = seed;

    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher multiple of 16 that is lower than or equal to length
        final int end = offset + len16;
        for (int i = offset; i < end; i += 16) {
            long k1 = UnsafeUtils.readLongLE(key, i);
            long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on architectures such as
SPARC, Itanium and PowerPC that don't support unaligned 64-bit memory
access.
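
For illustration, here is a minimal sketch of little-endian reads that do not depend on the offset being aligned. The class and method names (SafeReads, readLongLE, readIntLE) are hypothetical and are not the Elasticsearch UnsafeUtils implementation. The value is assembled one byte at a time, so no 4-byte or 8-byte load is ever issued at a misaligned address.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not part of Elasticsearch): alignment-independent little-endian reads. */
final class SafeReads {

    private SafeReads() {}

    /** Builds a little-endian long from 8 single-byte loads, starting at any offset. */
    static long readLongLE(byte[] src, int offset) {
        long value = 0;
        // Walk from the most significant byte (offset + 7) down to the least (offset).
        for (int i = 7; i >= 0; i--) {
            value = (value << 8) | (src[offset + i] & 0xFFL);
        }
        return value;
    }

    /** Builds a little-endian int from 4 single-byte loads, starting at any offset. */
    static int readIntLE(byte[] src, int offset) {
        return (src[offset] & 0xFF)
                | (src[offset + 1] & 0xFF) << 8
                | (src[offset + 2] & 0xFF) << 16
                | (src[offset + 3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        byte[] key = new byte[32];
        for (int i = 0; i < key.length; i++) {
            key[i] = (byte) i;
        }
        int offset = 3; // deliberately not a multiple of 8
        // Cross-check against the standard library's heap ByteBuffer.
        long expected = ByteBuffer.wrap(key).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
        System.out.println(readLongLE(key, offset) == expected);
    }
}
```

A heap ByteBuffer with ByteOrder.LITTLE_ENDIAN, as used in the cross-check, is a standard-library alternative. The byte-wise version trades some speed for portability across architectures.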

Does Elasticsearch have any policy for support of hardware other than
x86? If not, I don't think many people would care but you really ought to
clearly say so on your platform support page. If you do intend to support
non-x86 architectures then you need to be much more careful about the use
of unsafe memory accesses.
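
As a sketch of what "more careful" could look like, a codebase might gate any unaligned fast path on the reported CPU architecture and fall back to byte-wise reads everywhere else. Everything here is an assumption for illustration: the class name, the allow-list of os.arch values, and the use of "sparcv9" as an example of a non-x86 value. It is not how Elasticsearch decides this.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical policy class: decides whether unaligned wide reads may be attempted. */
final class UnalignedAccessPolicy {

    // Assumed allow-list of os.arch values treated as tolerant of unaligned access.
    private static final Set<String> TOLERANT_ARCHS =
            new HashSet<>(Arrays.asList("i386", "x86", "amd64", "x86_64"));

    private UnalignedAccessPolicy() {}

    /** Returns true only for architectures on the allow-list; unknown ones are treated as strict. */
    static boolean allowsUnaligned(String osArch) {
        return osArch != null && TOLERANT_ARCHS.contains(osArch.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println(arch + " -> unaligned access assumed safe: " + allowsUnaligned(arch));
    }
}
```

Defaulting to "strict" for any architecture not on the list means an unrecognised platform takes the slower, safe path rather than risking a SIGBUS.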

Regards,

David

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Adrien Grand) #6

Hi Tony,

Do you have more information in the core dump file? (cf. the "Core dump
written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony.aponte@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server to scale out
from a small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location:

/export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}"
daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7
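
The si_code BUS_ADRALN and the si_addr value are consistent with a misaligned access: the faulting address ends in 0xe7, while the failing frame, Unsafe_GetInt, suggests a 4-byte read. The following small check is only a sketch (the class and method names are hypothetical) showing the arithmetic:

```java
/** Illustrative only: tests an address against the natural alignment of an access size. */
final class AlignmentCheck {

    private AlignmentCheck() {}

    /** accessSizeBytes must be a power of two (1, 2, 4 or 8). */
    static boolean isAligned(long address, int accessSizeBytes) {
        return (address & (accessSizeBytes - 1)) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash report
        System.out.println(isAligned(siAddr, 4)); // false: not 4-byte aligned
        System.out.println(siAddr & 3);           // 3: remainder modulo 4
    }
}
```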

I can run ES using 32-bit Java but have to shrink ES_HEAP_SIZE more than I
want to. Any assistance would be appreciated.

Regards,
Tony

--
Adrien Grand



(tony.aponte) #7

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the process image.
Tony



(Robert Muir-2) #8

How big is it? Maybe I can have it anyway? I pulled two ancient UltraSPARCs
out of my closet to try to debug your issue, but unfortunately they are a
pita to work with (dead NVRAM battery on both, zeroed MAC address, etc.). I'd
still love to get to the bottom of this.


(Jörg Prante) #9

I have some Solaris 10 SPARC V440/V445 servers available and can try to
reproduce this over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir robert.muir@elasticsearch.com
wrote:

How big is it? Maybe i can have it anyway? I pulled two ancient
ultrasparcs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead nvram battery on both, zeroed mac
address, etc.) Id still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony.aponte@iqor.com wrote:

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the process
image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:

Hi Tony,

Do you have more information in the core dump file? (cf. the "Core dump
written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11/ T4-4 SPARC server to scale
out of small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core

or core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}"
daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,
0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7

I can run ES using 32bit java but have to shrink ES_HEAPS_SIZE more
than I want to. Any assistance would be appreciated.

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts wrote:

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting JVM core
dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build

1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode

solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is making
increasing use of "unsafe" functions in Java, presumably to speed things
up, and some CPUs are more picky than others about memory alignment. In
particular, x86 will tolerate misaligned memory access whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of Elasticsearch
shows that the new use of "unsafe" memory access functions is in the
BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: if (id == -1L ||
UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: } else if (UnsafeUtils.equals(key,
get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import org.elasticsearch.common.util.
UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReferenc
e.java: return UnsafeUtils.equals(a.array(),
a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:

(Jörg Prante) #10

I tested a simple "Hello World" document on Elasticsearch 1.3.2 with Oracle
JDK 1.7.0_17 (64-bit Server VM) on SPARC Solaris 10, using default settings.

No issues.

So I would like to know more about the settings in elasticsearch.yml, the
mappings, and the installed plugins.

Jörg
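
A possible reason a trivial document does not reproduce the crash: the fault only occurs when the address being read is not a multiple of the access size, so it depends on the offsets the data happens to land on. The hs_err output reported for ES 1.3.2 in this thread shows si_code=1 (BUS_ADRALN) with si_addr=0x0000000709cc09e7 inside Unsafe_GetInt, which is a 4-byte read. Below is a minimal Java sketch of that check. The names AlignmentCheck and isAligned and the array base address are hypothetical and only for illustration; they are not Elasticsearch or JDK APIs.

```java
// Hypothetical helper for illustration; not part of Elasticsearch or the JDK.
final class AlignmentCheck {

    /** Returns true if address is a multiple of accessSize (accessSize must be a power of two). */
    static boolean isAligned(long address, int accessSize) {
        return (address & (accessSize - 1)) == 0;
    }

    public static void main(String[] args) {
        // si_addr taken from the hs_err output quoted in this thread (Unsafe_GetInt frame).
        long faultAddress = 0x0000000709cc09e7L;
        System.out.println("fault address 4-byte aligned: " + isAligned(faultAddress, 4));

        // ASSUMPTION: the byte[] payload starts at an 8-byte-aligned address.
        // The real base address depends on the JVM and its object layout.
        long assumedArrayBase = 0x1000L;
        for (int offset = 0; offset < 8; offset++) {
            System.out.println("offset " + offset + " -> 8-byte aligned: "
                    + isAligned(assumedArrayBase + offset, 8));
        }
    }
}
```

Under that assumption only offset 0 is safe for a 64-bit read, so a test that never passes a non-zero, non-multiple-of-8 offset into the unsafe reads would not hit the fault.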


(tony.aponte) #11

The core file is as big as my ES_HEAP_SIZE parameter, 30g.

Tony


(tony.aponte) #12

I have no plugins installed (yet) and only changed "es.logger.level" to
DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5", "s6", "s7"] <===== Also sanitized

Thanks,
Tony
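
With no plugins and a near-default configuration, the failing read presumably comes from one of the core call sites listed earlier in the thread (UnsafeUtils.readIntLE / UnsafeUtils.readLongLE). For reference, here is a minimal Java sketch of a little-endian read that does not depend on alignment because it assembles the value one byte at a time. PortableReads is a hypothetical class name, and this is not necessarily the fix Elasticsearch adopted; it only illustrates the alignment-independent alternative to an unsafe wide read.

```java
// Hypothetical illustration; not Elasticsearch code.
final class PortableReads {

    /** Reads a little-endian int starting at any offset, one byte at a time. */
    static int readIntLE(byte[] src, int offset) {
        return (src[offset] & 0xFF)
                | (src[offset + 1] & 0xFF) << 8
                | (src[offset + 2] & 0xFF) << 16
                | (src[offset + 3] & 0xFF) << 24;
    }

    /** Reads a little-endian long starting at any offset by combining two int reads. */
    static long readLongLE(byte[] src, int offset) {
        long low = readIntLE(src, offset) & 0xFFFFFFFFL; // mask to avoid sign extension
        long high = readIntLE(src, offset + 4);
        return low | (high << 32);
    }

    public static void main(String[] args) {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8};
        // Offset 1 is deliberately not a multiple of 8.
        System.out.println(Long.toHexString(readLongLE(data, 1)));
    }
}
```

The trade-off is more instructions per read than a single wide load, which is presumably why the unsafe variant was introduced in the first place.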


(tony.aponte) #13

I was able to trim the heap size and, consequently, the core file down to
about 530m.

Tony



(tony.aponte) #14

I captured a Wireshark trace of the interaction between ES and Logstash
1.4.1. The error occurs even before my data is sent. Can you try to
reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contents of file 'y':
{
  "template" : "logstash-*",
  "settings" : { "index.refresh_interval" : "5s" },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : true},
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
            }
          }
        }
      } ],
      "properties" : {
        "@version": { "type": "string", "index": "not_analyzed" },
        "geoip" : {
          "type" : "object",
          "dynamic": true,
          "path": "full",
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
    }
  }
}



(Jörg Prante) #15

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt
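
To illustrate why a 4-byte Unsafe read at an arbitrary byte offset can fault on SPARC, here is a minimal sketch of an alignment check and of a byte-wise big-endian read that has no alignment requirement. The class and method names (SafeIntRead, readIntBE, isAligned) are made up for this illustration and are not part of Elasticsearch or the LZF library; the address is the si_addr value from the crash report posted earlier in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class SafeIntRead {

    // Assemble a big-endian int one byte at a time. Single-byte loads have
    // no alignment requirement, so any offset works on any CPU.
    static int readIntBE(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
             | ((buf[offset + 1] & 0xFF) << 16)
             | ((buf[offset + 2] & 0xFF) << 8)
             |  (buf[offset + 3] & 0xFF);
    }

    // True when the address is a multiple of the access width
    // (width must be a power of two: 2, 4 or 8).
    static boolean isAligned(long address, int width) {
        return (address & (width - 1)) == 0;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55};

        // Read 4 bytes starting at the odd offset 1.
        int viaBytes = readIntBE(data, 1);

        // Cross-check against the JDK's own absolute big-endian read.
        int viaBuffer = ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN).getInt(1);

        System.out.println(Integer.toHexString(viaBytes)); // 11223344
        System.out.println(viaBytes == viaBuffer);         // true

        // si_addr from the crash report: it ends in 0x...e7, so it is not
        // a multiple of 4 and a 4-byte load from it is misaligned.
        System.out.println(isAligned(0x0000000709cc09e7L, 4)); // false
    }
}
```

The byte-wise version is slower than a single word load, which is presumably why the unsafe variant exists in the first place, but it behaves the same on x86 and SPARC.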

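I have created a gist of the JVM crash file at

https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

There has been a fix in LZF lately
https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

for version 1.0.3 which has been released recently.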

I will build a snapshot ES version with LZF 1.0.3 and see if this works...

Jörg



(Jörg Prante) #16

Still broken with lzf-compress 1.0.3
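
One defensive option while the library is still affected, sketched here purely as an assumption and not as what Elasticsearch or lzf-compress actually does, is to enable an Unsafe-based fast path only on architectures known to tolerate unaligned loads and fall back to byte-wise code everywhere else. The class name, method name and the allow-list below are hypothetical; "sparcv9" is the os.arch value a 64-bit SPARC JVM reports.

```java
import java.util.Locale;

// Hypothetical guard, for illustration only.
final class UnalignedAccessGuard {

    // Assumed allow-list of os.arch values whose CPUs tolerate unaligned
    // loads. Anything not listed (e.g. "sparc", "sparcv9") gets the safe path.
    static boolean toleratesUnalignedAccess(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("x86")
            || arch.equals("i386")
            || arch.equals("amd64")
            || arch.equals("x86_64");
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        String choice = toleratesUnalignedAccess(arch)
                ? "unsafe fast path allowed"
                : "use byte-wise fallback";
        System.out.println(arch + " -> " + choice);
    }
}
```

An allow-list is used rather than a deny-list so that an unknown architecture defaults to the slower but safe path.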

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt

I have created a gist of the JVM crash file at

https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

There was a recent fix in LZF, included in the recently released
version 1.0.3:

https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

I will build a snapshot ES version with LZF 1.0.3 and see if this works...

Jörg




(Jörg Prante) #17

I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request:

https://github.com/elasticsearch/elasticsearch/pull/7466
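
As a rough illustration of the general idea (not the code from the pull request): the fix described above simply uses the safe encoder everywhere. The sketch below shows a variation that only opts into an Unsafe-based implementation on architectures assumed to tolerate unaligned access, and falls back to the safe one otherwise. `EncoderChoice`, `Kind`, `pick` and the architecture list are all hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names come from Elasticsearch or compress-lzf.
final class EncoderChoice {
    enum Kind { SAFE, UNSAFE }

    // Architectures assumed (for this example) to tolerate unaligned word reads.
    private static final Set<String> UNALIGNED_OK =
            new HashSet<String>(Arrays.asList("x86", "i386", "amd64", "x86_64"));

    private EncoderChoice() {}

    // Falls back to SAFE for anything not explicitly listed, e.g. "sparcv9".
    static Kind pick(String osArch) {
        if (osArch == null) {
            return Kind.SAFE;
        }
        return UNALIGNED_OK.contains(osArch.toLowerCase(Locale.ROOT)) ? Kind.UNSAFE : Kind.SAFE;
    }

    public static void main(String[] args) {
        // "os.arch" is a standard JVM system property.
        System.out.println(pick(System.getProperty("os.arch")));
    }
}
```

Defaulting to the safe path means an unknown platform costs some speed rather than a JVM crash.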

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joergprante@gmail.com <joergprante@gmail.com> wrote:

Still broken with lzf-compress 1.0.3

https://gist.github.com/jprante/d2d829b497db4963aea5

Jörg




(Ivan Brusic) #18

Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joergprante@gmail.com <joergprante@gmail.com> wrote:

I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

https://github.com/elasticsearch/elasticsearch/pull/7466

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

Still broken with lzf-compress 1.0.3

https://gist.github.com/jprante/d2d829b497db4963aea5

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt

I have created a gist of the JVM crash file at

https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

There has been a fix in LZF lately
https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

for version 1.0.3 which has been released recently.

I will build a snapshot ES version with LZF 1.0.3 and see if this
works...

Jörg

On Mon, Aug 25, 2014 at 11:30 PM, tony.aponte@iqor.com wrote:

I captured a WireShark trace of the interaction between ES and Logstash
1.4.1. The error occurs even before my data is sent. Can you try to
reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contests of file 'y":
{ "template" : "logstash-", "settings" : {
"index.refresh_interval" : "5s" }, "mappings" : { "default" : {
"_all" : {"enabled" : true}, "dynamic_templates" : [ {
"string_fields" : { "match" : "
", "match_mapping_type"
: "string", "mapping" : { "type" : "string", "index"
: "analyzed", "omit_norms" : true, "fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" :
256} } } } } ], "properties" :
{ "@version": { "type": "string", "index": "not_analyzed" },
"geoip" : { "type" : "object", "dynamic": true,
"path": "full", "properties" : {
"location" : { "type" : "geo_point" } } } } }
}}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:

I have no plugins installed (yet) and only changed "es.logger.level"
to DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains
actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6",
"s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

I tested a simple "Hello World" document on Elasticsearch 1.3.2 with
Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in elasticsearch.yml,
the mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com <
joerg...@gmail.com> wrote:

I have some Solaris 10 Sparc V440/V445 servers available and can try
to reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <
rober...@elasticsearch.com> wrote:

How big is it? Maybe i can have it anyway? I pulled two ancient
ultrasparcs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead nvram battery on both, zeroed mac
address, etc.) Id still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony....@iqor.com wrote:

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the
process image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:

Hi Tony,

Do you have more information in the core dump file? (cf. the
"Core dump written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11/ T4-4 SPARC server to
scale out of small x86 machine. I get a similar exception running ES with
JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime

Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed

mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location:

/export/home/elasticsearch/elasticsearch-1.3.2/core or
core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker
#147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,
0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7

I can run ES using 32bit java but have to shrink ES_HEAPS_SIZE
more than I want to. Any assistance would be appreciated.

Regards,
Tony



(Jörg Prante) #19

All praise should go to the fantastic Elasticsearch team, who did not
hesitate to test the fix immediately and replaced it with a better-working
solution, since the lzf-compress library has weaknesses regarding
thread safety.

Jörg

On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic ivan@brusic.com wrote:

Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joergprante@gmail.com wrote:

I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

https://github.com/elasticsearch/elasticsearch/pull/7466
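
One way to picture "prefer the safe encoder" is a selection step that only allows the Unsafe-based code path on architectures known to tolerate unaligned access. The sketch below is illustrative only and does not reproduce the pull request: the class EncoderChoice, the Kind enum, and the allow-list of architecture names are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decide between a safe and an Unsafe-based code path. */
final class EncoderChoice {
    enum Kind { SAFE, UNSAFE }

    // Assumed allow-list: os.arch values for x86-family JVMs, which tolerate unaligned loads.
    private static final Set<String> UNALIGNED_OK =
            new HashSet<>(Arrays.asList("x86", "i386", "amd64", "x86_64"));

    private EncoderChoice() {}

    /** Returns UNSAFE only for allow-listed architectures; anything unknown gets SAFE. */
    static Kind choose(String osArch) {
        if (osArch == null) {
            return Kind.SAFE;
        }
        return UNALIGNED_OK.contains(osArch.toLowerCase(Locale.ROOT)) ? Kind.UNSAFE : Kind.SAFE;
    }

    /** Convenience wrapper that reads the architecture of the running JVM. */
    static Kind chooseForThisJvm() {
        return choose(System.getProperty("os.arch"));
    }
}
```

Defaulting to SAFE for unknown architectures trades some speed for not crashing the JVM; a 64-bit SPARC JVM reports os.arch as "sparcv9" and would therefore take the safe path.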

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joergprante@gmail.com wrote:

Still broken with lzf-compress 1.0.3

https://gist.github.com/jprante/d2d829b497db4963aea5

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joergprante@gmail.com wrote:

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt,
which in turn uses sun.misc.Unsafe.getInt.
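
For comparison, a 4-byte read can be assembled from single-byte loads, which works at any offset regardless of alignment. This is a minimal sketch, assuming a big-endian int is wanted from a byte[]; the class SafeReads and its methods are hypothetical names for this example and are not the lzf-compress code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: alignment-independent reads from a byte array. */
final class SafeReads {
    private SafeReads() {}

    /** Builds a big-endian int from four single-byte loads starting at offset. */
    static int readIntBE(byte[] buf, int offset) {
        return ((buf[offset] & 0xFF) << 24)
                | ((buf[offset + 1] & 0xFF) << 16)
                | ((buf[offset + 2] & 0xFF) << 8)
                | (buf[offset + 3] & 0xFF);
    }

    /** Same result via the standard ByteBuffer API (absolute read, big-endian order). */
    static int readIntBEViaBuffer(byte[] buf, int offset) {
        return ByteBuffer.wrap(buf).order(ByteOrder.BIG_ENDIAN).getInt(offset);
    }
}
```

Both methods are bounds-checked by the JVM, so a bad offset raises an exception instead of taking down the process.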

I have created a gist of the JVM crash file at

https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

A fix went into LZF recently, for version 1.0.3, which has just been released:

https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

I will build a snapshot ES version with LZF 1.0.3 and see if this
works...

Jörg

On Mon, Aug 25, 2014 at 11:30 PM, tony.aponte@iqor.com wrote:

I captured a WireShark trace of the interaction between ES and
Logstash 1.4.1. The error occurs even before my data is sent. Can you try
to reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contents of file 'y':
{
  "template" : "logstash-*",
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "_all" : {"enabled" : true},
      "dynamic_templates" : [ {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "string", "index" : "analyzed", "omit_norms" : true,
            "fields" : {
              "raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
            }
          }
        }
      } ],
      "properties" : {
        "@version": { "type": "string", "index": "not_analyzed" },
        "geoip" : {
          "type" : "object",
          "dynamic": true,
          "path": "full",
          "properties" : {
            "location" : { "type" : "geo_point" }
          }
        }
      }
    }
  }
}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:

I have no plugins installed (yet) and only changed "es.logger.level"
to DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5", "s6", "s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

I tested a simple "Hello World" document on Elasticsearch 1.3.2 with
Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default settings.

No issues.

So I would like to know more about the settings in
elasticsearch.yml, the mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com wrote:

I have some Solaris 10 Sparc V440/V445 servers available and can
try to reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <rober...@elasticsearch.com> wrote:

How big is it? Maybe I can have it anyway? I pulled two ancient
UltraSPARCs out of my closet to try to debug your issue, but unfortunately
they are a PITA to work with (dead NVRAM battery on both, zeroed MAC
address, etc.). I'd still love to get to the bottom of this.

On Aug 22, 2014 3:59 PM, tony....@iqor.com wrote:

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the
process image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand wrote:

Hi Tony,

Do you have more information in the core dump file? (cf. the
"Core dump written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11 / T4-4 SPARC server
to scale out from a small x86 machine. I get a similar exception running ES
with JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15
Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode solaris-sparc compressed oops)
Problematic frame:
V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location: /export/home/elasticsearch/elasticsearch-1.3.2/core or core.14473

If you would like to submit a bug report, please visit:
http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread "elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN), si_addr=0x0000000709cc09e7

I can run ES using 32-bit Java but have to shrink ES_HEAP_SIZE
more than I want to. Any assistance would be appreciated.

Regards,
Tony
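
The siginfo line in the crash report above gives si_code BUS_ADRALN (invalid address alignment) and si_addr=0x0000000709cc09e7, with Unsafe_GetInt, a 4-byte read, as the failing frame. A small sketch of the arithmetic shows why that address faults; the class AlignmentCheck and the method isAligned are hypothetical names introduced for this example.

```java
/** Hypothetical helper: check whether an address is naturally aligned for an access size. */
final class AlignmentCheck {
    private AlignmentCheck() {}

    /** Returns true if address is a multiple of accessSize, which must be a power of two. */
    static boolean isAligned(long address, int accessSize) {
        if (accessSize <= 0 || Integer.bitCount(accessSize) != 1) {
            throw new IllegalArgumentException("accessSize must be a power of two");
        }
        // For a power of two, the low bits of the address are the remainder.
        return (address & (accessSize - 1)) == 0;
    }

    public static void main(String[] args) {
        long siAddr = 0x0000000709cc09e7L; // si_addr from the crash report
        System.out.println(isAligned(siAddr, 4));     // low two bits are 11, so not 4-byte aligned
        System.out.println(isAligned(siAddr - 3, 4)); // ...09e4 is a multiple of 4
    }
}
```

The address ends in 0x7, so its low two bits are set and a 4-byte load from it is misaligned, which matches the BUS_ADRALN code.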



(tony.aponte) #20

Kudos!

Tony

On Wednesday, August 27, 2014 1:16:11 PM UTC-4, Jörg Prante wrote:

All praise should go to the fantastic Elasticsearch team who did not
hesitate to test the fix immediately and replaced it with a better working
solution, since the lzf-compress software is having weaknesses regarding
threadsafety.

Jörg

On Wed, Aug 27, 2014 at 7:01 PM, Ivan Brusic <iv...@brusic.com
<javascript:>> wrote:

Amazing job. Great work.

--
Ivan

On Tue, Aug 26, 2014 at 12:41 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:

I fixed the issue by setting the safe LZF encoder in LZFCompressor and
opened a pull request

https://github.com/elasticsearch/elasticsearch/pull/7466

Jörg

On Tue, Aug 26, 2014 at 8:17 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:

Still broken with lzf-compress 1.0.3

https://gist.github.com/jprante/d2d829b497db4963aea5

Jörg

On Tue, Aug 26, 2014 at 7:54 PM, joerg...@gmail.com <javascript:> <
joerg...@gmail.com <javascript:>> wrote:

Thanks for the logstash mapping command. I can reproduce it now.

It's the LZF encoder that bails out at
org.elasticsearch.common.compress.lzf.impl.UnsafeChunkEncoderBE._getInt

which uses in turn sun.misc.Unsafe.getInt

I have created a gist of the JVM crash file at

https://gist.github.com/jprante/79f4b4c0b9fd83eb1c9b

There has been a fix in LZF lately
https://github.com/ning/compress/commit/db7f51bddc5b7beb47da77eeeab56882c650bff7

for version 1.0.3 which has been released recently.

I will build a snapshot ES version with LZF 1.0.3 and see if this
works...

Jörg

On Mon, Aug 25, 2014 at 11:30 PM, <tony....@iqor.com <javascript:>>
wrote:

I captured a WireShark trace of the interaction between ES and
Logstash 1.4.1. The error occurs even before my data is sent. Can you try
to reproduce it on your testbed with this message I captured?

curl -XPUT http://amssc103-mgmt-app2:9200/_template/logstash -d @y

Contests of file 'y":
{ "template" : "logstash-", "settings" : {
"index.refresh_interval" : "5s" }, "mappings" : { "default" : {
"_all" : {"enabled" : true}, "dynamic_templates" : [ {
"string_fields" : { "match" : "
", "match_mapping_type"
: "string", "mapping" : { "type" : "string", "index"
: "analyzed", "omit_norms" : true, "fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" :
256} } } } } ], "properties" :
{ "@version": { "type": "string", "index": "not_analyzed" },
"geoip" : { "type" : "object", "dynamic": true,
"path": "full", "properties" : {
"location" : { "type" : "geo_point" } } } } }
}}

On Monday, August 25, 2014 3:53:18 PM UTC-4, tony....@iqor.com wrote:

I have no plugins installed (yet) and only changed "es.logger.level"
to DEBUG in logging.yml.

elasticsearch.yml:
cluster.name: es-AMS1Cluster
node.name: "KYLIE1"
node.rack: amssc2client02
path.data: /export/home/apontet/elasticsearch/data
path.work: /export/home/apontet/elasticsearch/work
path.logs: /export/home/apontet/elasticsearch/logs
network.host: ******** <===== sanitized line; file contains
actual server IP
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["s1", "s2", "s3", "s5" , "s6",
"s7"] <===== Also sanitized

Thanks,
Tony

On Saturday, August 23, 2014 6:29:40 AM UTC-4, Jörg Prante wrote:

I tested a simple "Hello World" document on Elasticsearch 1.3.2
with Oracle JDK 1.7.0_17 64-bit Server VM, Sparc Solaris 10, default
settings.

No issues.

So I would like to know more about the settings in
elasticsearch.yml, the mappings, and the installed plugins.

Jörg

On Sat, Aug 23, 2014 at 11:25 AM, joerg...@gmail.com <
joerg...@gmail.com> wrote:

I have some Solaris 10 Sparc V440/V445 servers available and can
try to reproduce over the weekend.

Jörg

On Sat, Aug 23, 2014 at 4:37 AM, Robert Muir <
rober...@elasticsearch.com> wrote:

How big is it? Maybe i can have it anyway? I pulled two ancient
ultrasparcs out of my closet to try to debug your issue, but unfortunately
they are a pita to work with (dead nvram battery on both, zeroed mac
address, etc.) Id still love to get to the bottom of this.
On Aug 22, 2014 3:59 PM, tony....@iqor.com wrote:

Hi Adrien,
It's a bunch of garbled binary data, basically a dump of the
process image.
Tony

On Thursday, August 21, 2014 6:36:12 PM UTC-4, Adrien Grand
wrote:

Hi Tony,

Do you have more information in the core dump file? (cf. the
"Core dump written" line that you pasted)

On Thu, Aug 21, 2014 at 7:53 PM, tony....@iqor.com wrote:

Hello,
I installed ES 1.3.2 on a spare Solaris 11/ T4-4 SPARC server
to scale out of small x86 machine. I get a similar exception running ES
with JAVA_OPTS=-d64. When Logstash 1.4.1 sends the first message I get the
error below on the ES process:

A fatal error has been detected by the Java Runtime

Environment:

SIGBUS (0xa) at pc=0xffffffff7a9a3d8c, pid=14473, tid=209

JRE version: 7.0_25-b15

Java VM: Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed

mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xba3d8c] Unsafe_GetInt+0x158

Core dump written. Default location:

/export/home/elasticsearch/elasticsearch-1.3.2/core or
core.14473

If you would like to submit a bug report, please visit:

http://bugreport.sun.com/bugreport/crash.jsp

--------------- T H R E A D ---------------

Current thread (0x0000000107078000): JavaThread
"elasticsearch[KYLIE1][http_server_worker][T#17]{New I/O
worker #147}" daemon [_thread_in_vm, id=209, stack(0xffffffff5b800000,
0xffffffff5b840000)]

siginfo:si_signo=SIGBUS: si_errno=0, si_code=1 (BUS_ADRALN),
si_addr=0x0000000709cc09e7

I can run ES using 32bit java but have to shrink ES_HEAPS_SIZE
more than I want to. Any assistance would be appreciated.

Regards,
Tony

On Tuesday, July 22, 2014 5:43:28 AM UTC-4, David Roberts
wrote:

Hello,

After upgrading from Elasticsearch 1.0.1 to 1.2.2 I'm getting
JVM core dumps on Solaris 10 on SPARC.

A fatal error has been detected by the Java Runtime

Environment:

SIGBUS (0xa) at pc=0xffffffff7e452d78, pid=15483, tid=263

JRE version: Java(TM) SE Runtime Environment (7.0_55-b13)

(build 1.7.0_55-b13)

Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed

mode solaris-sparc compressed oops)

Problematic frame:

V [libjvm.so+0xc52d78] Unsafe_GetLong+0x158

I'm pretty sure the problem here is that Elasticsearch is
making increasing use of "unsafe" functions in Java, presumably to speed
things up, and some CPUs are more picky than others about memory
alignment. In particular, x86 will tolerate misaligned memory access
whereas SPARC won't.

Somebody has tried to report this to Oracle in the past and
(understandably) Oracle has said that if you're going to use unsafe
functions you need to understand what you're doing:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021574

A quick grep through the code of the two versions of
Elasticsearch shows that the new use of "unsafe" memory access functions is
in the BytesReference, MurmurHash3 and HyperLogLogPlusPlus classes:

bash-3.2$ git checkout v1.0.1
Checking out files: 100% (2904/2904), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: if (id == -1L ||
UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/search/aggregations/bucket/
BytesRefHash.java: } else if
(UnsafeUtils.equals(key, get(curId, spare))) {
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

bash-3.2$ git checkout v1.2.2
Checking out files: 100% (2220/2220), done.

bash-3.2$ find . -name '*.java' | xargs grep UnsafeUtils
./src/main/java/org/elasticsearch/common/bytes/BytesReference.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/bytes/BytesReferenc
e.java: return UnsafeUtils.equals(a.array(),
a.arrayOffset(), b.array(), b.arrayOffset(), a.length());
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.java:
return UnsafeUtils.readLongLE(key, blockOffset);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.ja
va: long k1 = UnsafeUtils.readLongLE(key, i);
./src/main/java/org/elasticsearch/common/hash/MurmurHash3.ja
va: long k2 = UnsafeUtils.readLongLE(key, i +
8);
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
if (id == -1L || UnsafeUtils.equals(key, get(id, spare))) {
./src/main/java/org/elasticsearch/common/util/BytesRefHash.java:
} else if (UnsafeUtils.equals(key, get(curId, spare))) {
./src/main/java/org/elasticsearch/common/util/UnsafeUtils.java:public
enum UnsafeUtils {
./src/main/java/org/elasticsearch/search/aggregations/metric
s/cardinality/HyperLogLogPlusPlus.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/main/java/org/elasticsearch/search/aggregations/metric
s/cardinality/HyperLogLogPlusPlus.java: return
UnsafeUtils.readIntLE(readSpare.bytes, readSpare.offset);
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java:import
org.elasticsearch.common.util.UnsafeUtils;
./src/test/java/org/elasticsearch/benchmark/common/util/Byte
sRefComparisonsBenchmark.java: return
UnsafeUtils.equals(b1, b2);

Presumably one of these three new uses is what is causing the
JVM SIGBUS error I'm seeing.

A quick look at the MurmurHash3 class shows that the hash128
method accepts an arbitrary offset and passes it to an unsafe function with
no check that it's a multiple of 8:

public static Hash128 hash128(byte[] key, int offset, int 

length, long seed, Hash128 hash) {
long h1 = seed;
long h2 = seed;

    if (length >= 16) {

        final int len16 = length & 0xFFFFFFF0; // higher 

multiple of 16 that is lower than or equal to length
final int end = offset + len16;
for (int i = offset; i < end; i += 16) {
long k1 = UnsafeUtils.readLongLE(key, i);
long k2 = UnsafeUtils.readLongLE(key, i + 8);

This is a recipe for generating JVM core dumps on
architectures such as SPARC, Itanium and PowerPC that don't support
unaligned 64 bit memory access.

Does Elasticsearch have any policy for support of hardware
other than x86? If not, I don't think many people would care but you
really ought to clearly say so on your platform support page. If you do
intend to support non-x86 architectures then you need to be much more
careful about the use of unsafe memory accesses.

Regards,

David

--
You received this message because you are subscribed to the
Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from
it, send an email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/eb7f4c23-b63e-4c2e-87c3-029fc58449fc%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
Adrien Grand
