Elasticsearch process dies when using G1GC


(Christopher J. Bottaro) #1

When I enable G1GC, Elasticsearch won't stay running for more than a few
minutes. When it dies, nothing is output in the logs. All I see is an
entry in syslog:

init: elasticsearch main process (17253) terminated with status 134
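
For reference, an exit status above 128 usually follows the shell convention of 128 plus a signal number, so 134 would correspond to signal 6 (SIGABRT), which is the signal a HotSpot JVM raises when it aborts on a fatal error. In that case the JVM normally also writes an hs_err_pid<pid>.log crash report to its working directory (or to the path given by -XX:ErrorFile), which may explain why nothing shows up in the Elasticsearch logs. A minimal Python sketch of the decoding, assuming a Linux host and a shell-style status; describe_exit_status is a hypothetical helper name:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status.

    Assumption: statuses above 128 encode "128 + signal number",
    which is the usual shell convention on Linux.
    """
    if status > 128:
        signum = status - 128
        try:
            return signal.Signals(signum).name
        except ValueError:
            return f"unknown signal {signum}"
    return f"normal exit code {status}"


if __name__ == "__main__":
    print(describe_exit_status(134))
```

If the result is SIGABRT, looking for an hs_err_pid*.log file next to the Elasticsearch process is a reasonable next step.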

Here are my G1 settings:

JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions"
JAVA_OPTS="$JAVA_OPTS -XX:MaxGCPauseMillis=400"
JAVA_OPTS="$JAVA_OPTS -XX:GCPauseIntervalMillis=8000"

I've also tried 50/100 (pause/interval).

My Java version:

java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

I'm running on EC2, using a cluster of 4 m1.xlarge instances with 5 GB
allocated to Elasticsearch and mlockall set to true.

The workload is distributed bulk indexing. I have 600 processes each in a
loop, submitting bulk index requests of 200 documents at a time.

As a side note, if I use UseParNewGC and UseConcMarkSweepGC, I eventually
get stop-the-world pauses (of ~5s) every few seconds on each node, which
renders the cluster useless.

How can I get G1 to work? Or how can I stop CMS from stopping the world
several times per minute?

Thanks for the help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Jörg Prante) #2

Do not use -XX:+UnlockExperimentalVMOptions or you will trigger bugs.
This is for developers living on bleeding edge.

Jörg

On 04.06.13 02:44, Christopher J. Bottaro wrote:

JAVA_OPTS="$JAVA_OPTS -XX:+UnlockExperimentalVMOptions"


(Zerocoolys) #3

I found that when using G1 GC, Elasticsearch core dumps within a few minutes.
The dump shows the root cause:

Problematic frame:

J org.elasticsearch.common.io.stream.HandlesStreamOutput.writeString(Ljava/lang/String;)V

This problem only happens during indexing.
After changing back to CMS GC, everything works well.


(Christopher J. Bottaro) #4

Exact same result, with or without -XX:+UnlockExperimentalVMOptions.

Thanks for the help.

On Tuesday, June 4, 2013 2:33:28 AM UTC-5, Jörg Prante wrote:

Do not use -XX:+UnlockExperimentalVMOptions or you will trigger bugs.
This is for developers living on bleeding edge.

Jörg


(Jon Shea) #5

Hi Christopher,

I’ve had the best results using ParallelGC. We haven’t profiled the G1 yet,
but I have done fairly extensive profiling of CMS (which I expect behaves
similarly to G1). We’ve found that as we increase query volume, ParallelGC
tends to degrade gradually, and an increasing percentage of
queries get caught by our 2s timeout. In contrast, when we run CMS we
find that the cluster behaves pretty well under increasing load, and then
suddenly, at some critical level, machines start experiencing
multi-second stop-the-world garbage collections and the cluster
totally falls apart. Our CMS cluster would fall apart at about 60-70% of
the traffic that our ParallelGC cluster could handle while still timing out
fewer than 0.5% of requests. Some people disagree with me about this, but I
recommend you give ParallelGC a try. ParallelGC gives us more query
throughput, mostly better query latency (worse at the 99%+ quantile), and
more warning when our cluster is over-stressed and approaching failure.

That said, 600 processes each running bulk inserts could be very
aggressive. I’m not surprised that can knock over your cluster. I normally
backfill batches of 1000 1 KB documents in one single-threaded process, and
I have a dynamically tunable sleep period after each insert so that I can
dial back the insert rate if my cluster starts to look over-stressed.
Are these live updates you’re processing? How many documents do
you need to index per minute? If this index rate is a hard constraint, then
you may need to scale your write capacity by running more shards on more
machines, or by using machines with better IO throughput (i.e., SSDs).
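
The throttled backfill described above can be sketched roughly as follows. This is only an illustration of the idea (one single-threaded loop, one batch per request, and a sleep period that can be retuned while the job runs), not the actual tool. The names backfill, send_bulk, and get_sleep_seconds are hypothetical; send_bulk stands in for whatever client call submits one bulk request.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List


def batches(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Group documents into lists of at most `size` items."""
    batch: List[Dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


def backfill(
    docs: Iterable[Dict],
    send_bulk: Callable[[List[Dict]], None],
    get_sleep_seconds: Callable[[], float],
    batch_size: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send batches one at a time, pausing after each bulk request.

    get_sleep_seconds is called after every batch, so the pause can be
    changed while the job is running (for example by re-reading a file).
    Returns the number of documents sent.
    """
    sent = 0
    for batch in batches(docs, batch_size):
        send_bulk(batch)            # hypothetical: submit one bulk request
        sent += len(batch)
        sleep(get_sleep_seconds())  # dial the insert rate up or down here
    return sent


if __name__ == "__main__":
    # Demo with a stand-in sender that only records batch sizes.
    sizes: List[int] = []
    total = backfill(
        ({"id": i} for i in range(2500)),
        send_bulk=lambda batch: sizes.append(len(batch)),
        get_sleep_seconds=lambda: 0.0,
    )
    print(total, sizes)
```

Reading the sleep period through a callable rather than a constant is what makes it tunable at run time; the sleep parameter is injectable only so the loop can be exercised without real delays.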

-Jon


(Jörg Prante) #6

G1 GC is a totally different beast. See Fig. 6 in
http://blog.mgm-tp.com/2013/03/garbage-collection-tuning/

Jörg

On 04.06.13 21:58, Jon Shea wrote:

We haven’t profiled the G1 yet, but I have done fairly extensive
profiling of CMS (which I expect behaves similarly to G1).


(Otis Gospodnetić) #7

Hi,

For what it's worth, we helped a client the other day deal with ES 0.20.4
and issues around shard recovery and performance. We switched from <I
don't recall which JVM params and collector> to G1 and "things worked
better", both with updates 17 and 21 of Oracle's Java 1.7. There were
certainly no crashes.

Otis

ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html

On Monday, June 3, 2013 8:44:17 PM UTC-4, Christopher J. Bottaro wrote:

When I enable G1GC, Elasticsearch won't stay running for more than a few
minutes. When it dies, nothing is output in the logs. All I see is an
entry in syslog:

init: elasticsearch main process (17253) terminated with status 134


(shadyabhi) #8

Hi Otis,

Sorry to hop in and hijack the thread. Can you provide the exact JAVA_OPTS
that you used for G1? Thanks

On Wed, Jun 5, 2013 at 4:33 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

Hi,

For what it's worth, we helped a client the other day deal with ES 0.20.4
and issues around shard recovery and performance. We switched from <I
don't recall which JVM params and collector> to G1 and "things worked
better", both with updates 17 and 21 of Oracle's Java 1.7. There were
certainly no crashes.

Otis

ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html

--
Regards,
Abhijeet Rastogi (shadyabhi)


(Otis Gospodnetić) #9

Hi,

I don't have access to that server at the moment, but I believe it was -XX:+UseG1GC
-server -Xms... -Xmx... -- i.e., nothing exotic.

Otis

Performance Monitoring - http://sematext.com/spm/index.html
Solr & ElasticSearch Support - http://sematext.com/

On Wednesday, June 5, 2013 12:00:10 PM UTC-4, Abhijeet Rastogi wrote:

Hi Otis,

Sorry to hop in and hijack the thread. Can you provide the exact JAVA_OPTS
that you used for G1? Thanks


(Klaus Brunner) #10

FWIW, I also ran extensive load tests using G1GC (OpenJDK 7uSomething,
64-bit on Linux) a few months ago, and it worked nicely. As expected from
my earlier tests with application servers, throughput dropped a bit but
latency was a lot more stable, i.e. no long pauses. I didn't use hundreds of
threads for indexing, though, just about 50.

Klaus

On Sunday, 9 June 2013 06:46:16 UTC+2, Otis Gospodnetic wrote:

Hi,

I don't have access to that server at the moment, but I believe it was -XX:+UseG1GC
-server -Xms... -Xmx... -- i.e., nothing exotic.

Otis

Performance Monitoring - http://sematext.com/spm/index.html
Solr & ElasticSearch Support - http://sematext.com/


(Dan Everton) #11

On Sunday, June 9, 2013 2:46:16 PM UTC+10, Otis Gospodnetic wrote:

I don't have access to that server at the moment, but I believe it was -XX:+UseG1GC
-server -Xms... -Xmx... -- i.e., nothing exotic.

What heap size? We're trying to use G1GC but a few seconds after startup,
the JVM process running the Elasticsearch instance crashes. The only clue
is that it's crashing at

J org.elasticsearch.common.trove.impl.hash.TObjectHash.insertKey(Ljava/lang/Object;)I

But that's not all that helpful. We're on the latest JDK 7 (update 25 I
believe) from Oracle and with an Xms/Xmx of 16 GB on a 12-core AMD machine.

Searching for bug reports doesn't turn up much apart from this
http://stackoverflow.com/questions/11293384/jvm-crash-with-g1-gc-and-trove-library
which indicates the problem has been around for a long time and isn't ES
specific.

Cheers,

Dan


(Runar Myklebust) #12

I've experienced the same behaviour in our embedded ES installations, with
the same results, and it seems that trove is the reason. That doesn't really
help us and our customers though. I would hope that the ES team would address
this, or at least make an official statement that G1GC is not supported.

On Mon, Jul 22, 2013 at 7:44 AM, Dan Everton dan@iocaine.org wrote:

On Sunday, June 9, 2013 2:46:16 PM UTC+10, Otis Gospodnetic wrote:

I don't have access to that server at the moment, but I believe it was -XX:+UseG1GC
-server -Xms... -Xmx... -- i.e., nothing exotic.

What heap size? We're trying to use G1GC but a few seconds after startup,
the JVM process running the Elasticsearch instance crashes. The only clue
is that it's crashing at

J org.elasticsearch.common.trove.impl.hash.TObjectHash.insertKey(Ljava/lang/Object;)I

But that's not all that helpful. We're on the latest JDK 7 (update 25 I
believe) from Oracle and with an Xms/Xmx of 16 GB on a 12-core AMD machine.

Searching for bug reports doesn't turn up much apart from this
http://stackoverflow.com/questions/11293384/jvm-crash-with-g1-gc-and-trove-library
which indicates the problem has been around for a long time and isn't ES
specific.

Cheers,

Dan

--
Best regards,

Runar Myklebust


(Jörg Prante) #13

My extensive load tests with ES index and query under Java with G1GC are
passing successfully, on both Java 7 and Java 8.

What you found is a bug triggered by trove, not related to Elasticsearch.
Nevertheless I would be glad to help and I wonder what trove jar was used.
From
https://bitbucket.org/robeden/trove/src/c59d0ed735f0824cca851743e070d065509cf219/pom.xml?at=master
I conclude the Maven artifact is in Java 5 class format. Probably a new
trove build or some trove fixes could help to get it to run with G1GC.

Jörg

On Mon, Jul 22, 2013 at 11:01 AM, Runar Myklebust runar.a.m@gmail.com wrote:

I've experienced the same behaviour in our embedded ES installations, with
the same results, and it seems that trove is the reason. That doesn't really
help us and our customers though. I would hope that the ES team would address
this, or at least make an official statement that G1GC is not supported.


(Jörg Prante) #14

On Maven repo, I found this gnu/trove/Version.class: compiled Java class
data, version 49.0 (Java 1.5) built with 1.6.0_31-b04-415-11M3646 (Apple
Inc.)

I have copied over the source for a Java 1.6 (class file format 50)
mavenized build of trove here (using Java 1.7.0_21):

https://github.com/xbib/trove
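
To check which class file format a jar was built for, the version can be read from the first eight bytes of any .class file: a 4-byte magic number 0xCAFEBABE, followed by a 2-byte minor and a 2-byte major version, all big-endian. Major version 49 corresponds to Java 1.5 and 50 to Java 1.6. A small Python sketch; the helper names are hypothetical and the jar path passed to jar_class_version is whatever local copy you have:

```python
import struct
import zipfile

# Major class file versions and the Java release that introduced them.
JAVA_RELEASE_BY_MAJOR = {49: "Java 1.5", 50: "Java 1.6", 51: "Java 1.7", 52: "Java 1.8"}


def class_file_version(data: bytes):
    """Return (major, minor) from the header of a compiled class file."""
    if len(data) < 8:
        raise ValueError("class file header needs at least 8 bytes")
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file (bad magic number)")
    return major, minor


def describe_class_version(data: bytes) -> str:
    """Format the version like '49.0 (Java 1.5)'."""
    major, minor = class_file_version(data)
    release = JAVA_RELEASE_BY_MAJOR.get(major, "unknown release")
    return f"{major}.{minor} ({release})"


def jar_class_version(jar_path, entry: str):
    """Read one class entry out of a jar and return its (major, minor)."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_file_version(jar.read(entry))


if __name__ == "__main__":
    # Header bytes of a class compiled for class file format 49.0.
    print(describe_class_version(bytes.fromhex("cafebabe00000031")))
```

Calling jar_class_version with a local trove jar and the entry gnu/trove/Version.class should report the values quoted above, provided the jar is the same build.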

If you want to try out if this binary works better with ES and G1GC, you
may include trove jar from

<repository>
    <id>github-release-repo</id>
    <url>https://github.com/xbib/maven-repo/raw/master/releases</url>
</repository>

<dependency>
    <groupId>net.sf.trove4j</groupId>
    <artifactId>trove4j</artifactId>
    <version>3.0.3</version>
</dependency>

or by manual download

https://github.com/xbib/maven-repo/tree/master/releases/net/sf/trove4j/trove4j/3.0.3

Best,

Jörg

On Mon, Jul 22, 2013 at 11:35 AM, joergprante@gmail.com <joergprante@gmail.com> wrote:

My extensive load tests with ES index and query under Java with G1GC are
passing successfully, on both Java 7 and Java 8.

What you found is a bug triggered by trove, not related to Elasticsearch.
Nevertheless I would be glad to help and I wonder what trove jar was used.
From
https://bitbucket.org/robeden/trove/src/c59d0ed735f0824cca851743e070d065509cf219/pom.xml?at=master
I conclude the Maven artifact is in Java 5 class format. Probably a new
trove build or some trove fixes could help to get it to run with G1GC.

Jörg


(Dan Everton) #15

On Tuesday, July 23, 2013 7:52:49 AM UTC+10, Jörg Prante wrote:

On Maven repo, I found this gnu/trove/Version.class: compiled Java class
data, version 49.0 (Java 1.5) built with 1.6.0_31-b04-415-11M3646 (Apple
Inc.)

I have copied over the source for a Java 1.6 (class file format 50)
mavenized build of trove here (using Java 1.7.0_21):

I wouldn't expect recompiling the JAR with Java 7 would make any difference
but I'll see if we can get it tested.

Switching to mmapfs from niofs helped a little bit with G1GC enabled. The
nodes survive for a few minutes instead of dying immediately. There's no
load on them during this time so I've no idea what the cause might be. But
at this point it looks like G1GC is completely unusable for us.

Cheers,
Dan


(Jörg Prante) #16

I could reproduce the issue, both on JDK 7 and 8. Trying to build a
reproducible test case.

Jörg

On Wed, Jul 24, 2013 at 1:22 AM, Dan Everton dan@iocaine.org wrote:

On Tuesday, July 23, 2013 7:52:49 AM UTC+10, Jörg Prante wrote:

On Maven repo, I found this gnu/trove/Version.class: compiled Java class
data, version 49.0 (Java 1.5) built with 1.6.0_31-b04-415-11M3646 (Apple
Inc.)

I have copied over the source for a Java 1.6 (class file format 50)
mavenized build of trove here (using Java 1.7.0_21):

I wouldn't expect recompiling the JAR with Java 7 would make any
difference but I'll see if we can get it tested.

Switching to mmapfs from niofs helped a little bit with G1GC enabled. The
nodes survive for a few minutes instead of dying immediately. There's no
load on them during this time so I've no idea what the cause might be. But
at this point it looks like G1GC is completely unusable for us.

Cheers,
Dan


(Saiprasad Mishra) #17

Any update on the exact GC params that work with Elasticsearch on G1? I am
having a similar issue on ES 0.90.0 with G1.

Thanks for your time on this
Cheers
Sai

On Wednesday, July 24, 2013 1:56:41 AM UTC-7, Jörg Prante wrote:

I could reproduce the issue, both on JDK 7 and 8. Trying to build a
reproducible test case.

Jörg


(Ivan Brusic) #18

The primary issue with G1 is the Trove library. Elasticsearch no longer
uses Trove with versions 0.90.6 and higher, so I would suggest upgrading if
you plan on using G1GC.
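
As a rough illustration of that rule, a version string can be compared against 0.90.6 before enabling G1. This sketch assumes plain dotted version numbers such as 0.90.0 or 0.90.6 (any suffix after the third number is ignored); is_trove_free is a hypothetical helper, not part of Elasticsearch:

```python
import re

# First release without Trove, per the statement above.
TROVE_REMOVED_IN = (0, 90, 6)


def parse_version(version: str):
    """Turn '0.90.6' into (0, 90, 6); text after the third number is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_trove_free(version: str) -> bool:
    """True when the given Elasticsearch version is 0.90.6 or newer."""
    return parse_version(version) >= TROVE_REMOVED_IN


if __name__ == "__main__":
    for candidate in ("0.20.4", "0.90.0", "0.90.6"):
        print(candidate, is_trove_free(candidate))
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating 0.90.10 as older than 0.90.6.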

--
Ivan

On Wed, Feb 5, 2014 at 11:19 AM, saiprasad mishra <saiprasadmishra@gmail.com> wrote:

Any update on the exact GC params that work with Elasticsearch on G1? I am
having a similar issue on ES 0.90.0 with G1.

Thanks for your time on this
Cheers
Sai


(Saiprasad Mishra) #19

Thanks, Ivan, for the very quick help here.
We are planning to upgrade, and this is one more reason to consider it.
I will keep you posted on how it goes.

Cheers
Sai

On Wednesday, February 5, 2014 11:35:50 AM UTC-8, Ivan Brusic wrote:

The primary issue with G1 is the Trove library. Elasticsearch no longer
uses Trove with versions 0.90.6 and higher, so I would suggest upgrading if
you plan on using G1GC.

--
Ivan

