High load average running on ES node


(Arjit Gupta) #1

Hi,

I have a 4-node cluster with 32 GB of RAM and 8-core processors, with 5 indexes,
5 primary shards and 2 replicas.
I am using Elasticsearch version 0.90.1.
I have a lot of reads/writes/deletes. Most of the time, under high load, the
load average on one of the nodes goes to 70-80, while on the others it is around 10.
I attached jconsole and am sharing the screenshots. I see a lot of GC cycles
happening.
I was going
through http://jprante.github.io/2012/11/28/Elasticsearch-Java-Virtual-Machine-settings-explained.html
In its section "Avoiding Stop-the-world phases" it says to adjust index.merge.policy.segments_per_tier.
I am using the default merge values as of now.
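
For reference, a minimal sketch of how a non-default value for that setting could be submitted, assuming it can be changed on a live index through the index update-settings endpoint. The host `http://localhost:9200`, the index name `myindex`, the value `5` and the helper name are placeholders for illustration, not recommendations from this thread.

```python
import json
from urllib import request


def build_segments_per_tier_request(base_url, index, segments_per_tier):
    """Build (but do not send) a PUT request for the index settings endpoint.

    base_url and index are placeholders; adjust them to the real cluster.
    """
    body = json.dumps(
        {"index.merge.policy.segments_per_tier": segments_per_tier}
    ).encode("utf-8")
    return request.Request(
        url="%s/%s/_settings" % (base_url, index),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical usage; uncomment the last line to actually send the request.
req = build_segments_per_tier_request("http://localhost:9200", "myindex", 5)
# request.urlopen(req)
```

Whether a lower or higher value helps depends on the workload, so any change is best tried on one index first while watching merge and GC activity.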

Jstack of the ES node under high load:
https://gist.github.com/arjitgupta/8264462

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d5d2c8e5-5eab-4af9-bd06-951c8786ddaa%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(vaidik) #2

I have seen similar problems. I am no Elasticsearch expert yet, but I'd
suggest trying the G1 garbage collector instead of the default CMS
garbage collector. From what I know, CMS was never made for such large JVM
heaps. G1 works better with large JVM heaps: it runs more frequently rather
than after long intervals, which keeps each GC run shorter and helps avoid
long stop-the-world pauses.

I am not sure if this will help. But you can give it a try.

Vaidik Kapoor
vaidikkapoor.info



(Arjit Gupta) #3

I really don't know if G1 is production-ready on Java 6. Are you using it on
Java 6?

Java version on my servers:
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)

Thanks,
Arjit



(vaidik) #4

No, I'm using it with Java 7. Java 7 is the recommended version anyway.
Is it not possible for you to move to Java 7?


(Jörg Prante) #5

The load is not much of a surprise for an 8-core CPU node; I have also
observed loads of 80-100.

This high load, when induced by indexing, can be significantly reduced by
using a high-performance disk I/O subsystem, such as SSDs. The
disks are the slowest part of the system and generate high I/O wait, which
is responsible for the increased CPU load.
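
To check whether I/O wait is what drives the load on a Linux node, something like the following sketch could be used. It assumes the Linux `/proc/stat` layout (user, nice, system, idle, iowait, irq, softirq, steal); the helper names are made up for illustration.

```python
def parse_cpu_line(line):
    """Return up to the first eight counters of the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Order: user, nice, system, idle, iowait, irq, softirq, steal
    return [int(value) for value in fields[1:9]]


def iowait_share(before, after):
    """Fraction of CPU time spent in iowait between two samples of the line."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    total = sum(a) - sum(b)
    if total <= 0:
        return 0.0
    return (a[4] - b[4]) / total


def load_per_core(load_average, cores):
    """Load average normalised by the number of cores."""
    return load_average / cores


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

Taking two `read_cpu_line()` samples a few seconds apart and passing them to `iowait_share` gives the share of time the CPUs sat waiting on disk. For the numbers in this thread, `load_per_core(75, 8)` is 9.375, that is, roughly nine runnable or I/O-blocked tasks per core.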

GC generates high load too; this is mostly related to expensive queries
that use filters or caches. The overall performance of the JVM gets
very poor in that case.

You have several options:

  • rewrite queries or reconfigure ES for efficient cache usage
  • add nodes
  • decrease the heap slightly to smooth the steep edge when stop-the-world
    GC kicks in (whether your ES cluster can work with less heap depends on
    the workload)

G1 GC does not help against query/filter load. It does not decrease CPU
load; in fact, it puts more CPU load on the machines, trading that for
fewer stop-the-world pauses. G1 GC helps to push
the stop-the-world periods under a certain limit so that ES nodes do not
disconnect so easily. It has no steep edge when performing stop-the-world
GC phases.

Please note that G1 GC currently seems safe only with Java 7 or Java 8 and ES
versions that have replaced GNU trove4j with the HPPC library, that is, 0.90.9
or 1.0.0.Beta2.
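
The rule above can be sketched as a small check. This is only an illustration: the function names are hypothetical, and treating every release after 0.90.9 or 1.0.0.Beta2 as having HPPC is an assumption, not something stated here. The `Beta`/`RC` qualifier format is likewise assumed.

```python
import re


def java_major(version):
    """Map a version string such as '1.6.0_26' to its major number (6)."""
    parts = version.split(".")
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    return int(parts[0])


def es_uses_hppc(version):
    """True for 0.90.9+ or 1.0.0.Beta2+ (later releases assumed to qualify)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)(?:\.(Beta|RC)(\d+))?$", version)
    if match is None:
        raise ValueError("unrecognised ES version: %r" % version)
    numbers = tuple(int(part) for part in match.group(1, 2, 3))
    if numbers < (1, 0, 0):
        return numbers >= (0, 90, 9)
    if numbers == (1, 0, 0) and match.group(4) == "Beta":
        return int(match.group(5)) >= 2
    return True


def g1_seems_safe(java_version, es_version):
    """Apply the rule of thumb: Java 7 or 8, and an ES build with HPPC."""
    return java_major(java_version) in (7, 8) and es_uses_hppc(es_version)
```

For the setup described in this thread (Java 1.6.0_26 with ES 0.90.1), `g1_seems_safe("1.6.0_26", "0.90.1")` returns `False`, so both the JVM and ES would need upgrading before trying `-XX:+UseG1GC`.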

Jörg



(Arjit Gupta) #6

Hi Jörg,

Thanks a lot for your detailed reply.
Can you please explain how I can reconfigure ES for efficient cache
usage?

Thanks,
Arjit



(system) #7