Garbage collection log messages, [monitor.jvm ... duration [2.2m]


(Wouter van Atteveldt) #1

I am using elasticsearch to index and query a fairly large document
collection. Most of the data is in a single property "text" of a doctype
"article". The index is sometimes slow, and my log has many messages about
the garbage collection:

For example, the following is right after starting the elasticsearch
process:

[2014-02-07 16:11:36,681][WARN ][monitor.jvm ] [Warwolves]
[gc][young][30][12] duration [1.1m], collections [11]/[3.1m], total
[1.1m]/[1.1m], memory [485.9mb]->[1.9gb]/[15.9gb], all_pools {[young]
[459.7mb]->[442.3mb]/[599mb]}{[survivor] [26.1mb]->[74.8mb]/[74.8mb]}{[old]
[0b]->[1.4gb]/[15.2gb]}
[2014-02-07 16:11:47,451][WARN ][monitor.jvm ] [Warwolves]
[gc][young][34][13] duration [7.4s], collections [1]/[7.7s], total
[7.4s]/[1.2m], memory [2gb]->[1.6gb]/[15.9gb], all_pools {[young]
[594.1mb]->[8.9mb]/[599mb]}{[survivor] [74.8mb]->[74.8mb]/[74.8mb]}{[old]
[1.4gb]->[1.5gb]/[15.2gb]}
[2014-02-07 16:12:06,311][WARN ][monitor.jvm ] [Warwolves]
[gc][young][41][15] duration [3.3s], collections [1]/[3.4s], total
[3.3s]/[1.3m], memory [2.3gb]->[1.9gb]/[15.9gb], all_pools {[young]
[562.1mb]->[8.5mb]/[599mb]}{[survivor] [74.8mb]->[74.8mb]/[74.8mb]}{[old]
[1.7gb]->[1.8gb]/[15.2gb]}
[2014-02-07 16:16:52,440][WARN ][monitor.jvm ] [Warwolves]
[gc][young][42][33] duration [2.2m], collections [18]/[4.7m], total
[2.2m]/[3.5m], memory [1.9gb]->[4.1gb]/[15.9gb], all_pools {[young]
[8.5mb]->[72.5mb]/[599mb]}{[survivor] [74.8mb]->[74.8mb]/[74.8mb]}{[old]
[1.8gb]->[4gb]/[15.2gb]}

IIUC, the last gc took 2.2 minutes, which indeed feels quite long?
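
For anyone reading these lines later, here is a minimal sketch that pulls the numbers out of one monitor.jvm warning. It assumes that `collections [N]/[T]` means N collections within an interval T and that `duration` is their combined time. That reading is an assumption, not something confirmed in this thread. The names `GC_LINE`, `to_seconds` and `summarize_gc` are made up for illustration.

```python
import re

# Matches the part of a monitor.jvm warning that this sketch cares about.
GC_LINE = re.compile(
    r"\[gc\]\[(?P<collector>\w+)\]\[\d+\]\[\d+\] "
    r"duration \[(?P<duration>[^\]]+)\], "
    r"collections \[(?P<count>\d+)\]/\[(?P<interval>[^\]]+)\]"
)

# Time suffixes seen in the log above (s, m), plus ms and h as assumptions.
UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def to_seconds(value):
    """Convert a value such as '2.2m' or '7.4s' to seconds."""
    match = re.fullmatch(r"([\d.]+)(ms|s|m|h)", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def summarize_gc(line):
    """Summarise one GC warning; returns None if the line does not match."""
    text = " ".join(line.split())  # undo the e-mail line wrapping
    match = GC_LINE.search(text)
    if match is None:
        return None
    gc_seconds = to_seconds(match["duration"])
    interval_seconds = to_seconds(match["interval"])
    count = int(match["count"])
    return {
        "collector": match["collector"],
        "collections": count,
        "gc_seconds": gc_seconds,
        # Average time per collection, under the assumption stated above.
        "avg_pause_seconds": gc_seconds / count if count else 0.0,
        # Share of the reporting interval spent in GC.
        "share_of_interval": gc_seconds / interval_seconds,
    }


last = summarize_gc(
    "[2014-02-07 16:16:52,440][WARN ][monitor.jvm ] [Warwolves] "
    "[gc][young][42][33] duration [2.2m], collections [18]/[4.7m], total "
    "[2.2m]/[3.5m], memory [1.9gb]->[4.1gb]/[15.9gb]"
)
print(last)
```

Under that reading, the last message would describe 18 young collections adding up to about 132 seconds (roughly 7 seconds each) rather than one single 2.2-minute pause.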

Index size as reported by head:
size: 107G (107G)
docs: 44,832,514 (51,560,620)

I start elastic using ES_HEAP_SIZE=16g and (redundantly?) with arguments
-Xms2G -Xmx16G. The machine is a virtual guest with 48GB memory, and
elastic is running alongside an nginx+uwsgi+django stack.

Except for some logging thresholds, config is unchanged from download. I am
using version 0.90.10

Are the gc messages indicative of a problem? Should I change the
configuration?

Thanks,

-- Wouter

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bce5f34d-e1af-4059-8e65-b5302ee13ce0%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

Wouter,

Yes it is possible that you have memory pressure. I'd probably:

  1. Set bootstrap.mlockall: true in the elasticsearch.yml file

  2. Once you're up and running (or when these GC pauses start to happen),
    check the node stats to see what you have in memory:

curl "http://localhost:9200/_nodes/stats/jvm?pretty"

That will give you a rough idea if you might need to bump that ES_HEAP_SIZE
up some more (up to 1/2 of your available RAM or 30GB whichever is smaller).

  3. If you're reaching the limits of RAM on a single node, then it might be
    time to add more nodes to distribute those shards out horizontally.
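
A rough sketch of how the node stats from step 2 could be turned into a heap-usage percentage per node. The field names `heap_used_in_bytes`, `heap_max_in_bytes` and `heap_committed_in_bytes` are assumptions about the response layout and may differ between Elasticsearch versions (0.90 in particular). The helper names `heap_usage` and `fetch_stats` are hypothetical.

```python
import json
from urllib.request import urlopen


def heap_usage(stats):
    """Return {node_name: percent of heap in use} from a parsed stats response."""
    usage = {}
    for node_id, node in stats.get("nodes", {}).items():
        mem = node["jvm"]["mem"]
        used = mem["heap_used_in_bytes"]
        # Assumed fields: fall back to the committed size if no max is reported.
        limit = mem.get("heap_max_in_bytes") or mem["heap_committed_in_bytes"]
        usage[node.get("name", node_id)] = 100.0 * used / limit
    return usage


def fetch_stats(url="http://localhost:9200/_nodes/stats/jvm"):
    """Fetch node stats from a running node (same URL as the curl call above)."""
    with urlopen(url) as response:
        return json.load(response)


# Hand-written sample in the assumed layout, so the sketch runs without a server.
sample = {
    "nodes": {
        "abc123": {
            "name": "Warwolves",
            "jvm": {
                "mem": {
                    "heap_used_in_bytes": 12 * 1024**3,
                    "heap_max_in_bytes": 16 * 1024**3,
                }
            },
        }
    }
}
print(heap_usage(sample))
```

Against a live node, `heap_usage(fetch_stats())` would give the same kind of summary. If the percentage stays high right after collections, that would point towards the memory pressure described above.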



(Wouter van Atteveldt) #3

Dear Binh Ly,

Thanks for your reply and sorry for not responding earlier. We've moved
over our elasticsearch to SSD and I had hoped that that might help with the
performance issues, but no luck.

It seems that whenever elastic is freshly started it performs pretty well,
but after a couple days it just becomes really slow and seems to be having
memory issues.

On Friday, February 7, 2014 6:43:08 PM UTC+1, Binh Ly wrote:
Yes it is possible that you have memory pressure. I'd probably:

  1. Set bootstrap.mlockall: true in the elasticsearch.yml file
  2. Once you're up and running (or when these GC pauses start to happen),
    check the node stats to see what you have in memory:
    curl "http://localhost:9200/_nodes/stats/jvm?pretty"
    That will give you a rough idea if you might need to bump that
    ES_HEAP_SIZE up some more (up to 1/2 of your available RAM or 30GB
    whichever is smaller).

The server has 40G heap size (increased from 30) on a virtual machine with
56G in total, and mlockall is true. The machine is not swapping (it only
runs elastic and nginx/uwsgi). We are still using 0.90.10 on the production
server, I can switch that over to 1.x to see if it helps.

stats.json and the relevant config and log files are posted at
https://gist.github.com/vanatteveldt/10717100

  3. If you're reaching the limits of RAM on a single node, then it might
    be time to add more nodes to distribute those shards out horizontally.

Yeah I guess that would be the ultimate remedy, but I don't really have
budget at the moment to add servers.

Thanks for any help,

Wouter



(Jörg Prante) #4

This is not Elasticsearch related. If you use a 40g heap of such extreme
size, you must expect that garbage collection must run for minutes, on
every JVM I know.

Jörg



(Wouter van Atteveldt-2) #5

On Tue, Apr 15, 2014 at 2:00 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

This is not Elasticsearch related. If you use a 40g heap of such extreme
size, you must expect that garbage collection must run for minutes, on
every JVM I know.

Right, but it is actually advised to give elastic a lot of heap, right? The
whole index is around 140G, so I would have thought that all frequently
used parts should get loaded in memory, but it still starts running slow
after a while.

Any ideas?

-- Wouter



(Nik Everett) #6

On Tue, Apr 15, 2014 at 9:42 AM, Wouter van Atteveldt <
wouter@vanatteveldt.com> wrote:

On Tue, Apr 15, 2014 at 2:00 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

This is not Elasticsearch related. If you use a 40g heap of such extreme
size, you must expect that garbage collection must run for minutes, on
every JVM I know.

Right, but it is actually advised to give elastic a lot of heap, right?
The whole index is around 140G, so I would have thought that all frequently
used parts should get loaded in memory, but it still starts running slow
after a while.

Any ideas?

Go with 30GB. 30GB is magic because much over that the JVM can't do
pointer compression, so there is a range of heap sizes where the extra
memory buys you little. You can learn more by clicking links in this:

Beyond that, you may want to look at what is actually happening when
collections are done. This article is about Cassandra but it seems pretty
on the ball:
http://tech.shift.com/post/74311817513/cassandra-tuning-the-jvm-for-read-heavy-workloads

Beyond that, scale out.
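
As a small illustration of the sizing rule mentioned in this thread (half of the available RAM, capped at about 30GB), here is a sketch. The function name `suggested_heap_gb` and the default cap are assumptions for illustration. It ignores whatever other processes on the machine need.

```python
def suggested_heap_gb(ram_gb, cap_gb=30.0):
    """Half of the machine's RAM, but never more than the cap."""
    return min(ram_gb / 2.0, cap_gb)


# RAM sizes mentioned in this thread: 48G and 56G guests, and a 64G host.
for ram in (48, 56, 64):
    print(f"{ram}G RAM -> {suggested_heap_gb(ram):.0f}G heap")
```

For the 56G virtual machine this rule of thumb lands at 28G rather than the 40G currently configured.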

Nik



(Jörg Prante) #7

The advice to give "a lot of heap" means: give as much heap as the JVM is
able to process efficiently. There is an upper limit due to the current
state of JVM engineering. You will not find JVMs that can efficiently
manage heaps larger than 32G (except rare, expensive commercial JVM
products). By efficient I mean GC stalls under a second. There is heavy
engineering going on, known as the Shenandoah project, to tackle heaps over
100G with millisecond GC pauses:
http://openjdk.java.net/jeps/189

Index size alone does not determine the choice of heap size. You need a
large heap if you want filters and aggregations/facets cached.

Example: I have 350G of index files. On my 3x64G RAM nodes I have assigned
3x16G heap and I do not cache filters, due to the nature of my queries. The
other ~48G on each node I left to the OS for file system buffers (direct
I/O is the key to fast systems). If I assigned 32G to heap, GC times would
be unacceptably high, and the system would become sluggish after some days,
as you described. It is not a matter of heap size, but of carefully
balancing the JVM's management abilities against the operating system's I/O
power. The challenge is that different ES workload patterns require
different balances.
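
The arithmetic behind this example can be sketched as follows. The function `memory_budget` and its field names are made up for illustration, and the sketch assumes that everything outside the heap is available to the OS file system cache. That is optimistic, since the OS and other processes also need memory.

```python
def memory_budget(ram_gb, heap_gb, nodes, index_gb):
    """Split RAM into JVM heap and OS cache; estimate how much index fits in cache."""
    os_cache_per_node_gb = ram_gb - heap_gb
    cluster_cache_gb = os_cache_per_node_gb * nodes
    return {
        "os_cache_per_node_gb": os_cache_per_node_gb,
        "cluster_cache_gb": cluster_cache_gb,
        # Upper bound on the share of index files the OS could keep cached.
        "index_share_cacheable": min(1.0, cluster_cache_gb / index_gb),
    }


# The three-node setup described above: 64G RAM, 16G heap, 350G of index files.
three_nodes = memory_budget(ram_gb=64, heap_gb=16, nodes=3, index_gb=350)
# The single node from earlier in the thread: 56G RAM, 40G heap, ~140G index.
single_node = memory_budget(ram_gb=56, heap_gb=40, nodes=1, index_gb=140)
print(three_nodes)
print(single_node)
```

With these numbers the three-node setup leaves 48G per node for the OS, enough to cache roughly 40% of the index files, while the single node with a 40G heap leaves 16G, a little over 10% of its index.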

Jörg



(Wouter van Atteveldt-2) #8

Thanks for the explanation, that really helps.

Does that mean that on a virtual host with 64GB memory it might make sense
to make two virtual servers each running a node? I had expected that
multiple nodes on a single host would not help, but I guess if the VM is
the limitation it might?

I have a read-heavy workload, with good use of facets/aggregations and also
some really complex queries (>1000 terms), but most of them limited to
subsets of <10k or 100k documents (out of 50M). Any recommendations would
be much appreciated!

-- Wouter


