Hello everyone,
we are using ES as the backend of an online service, and occasionally we are hit by a big garbage collection, which stops the node completely and causes all sorts of problems. The nodes have plenty of memory, I think. During the GC it looks like this.
This might happen once a day, usually during a period of heavy indexing, and sometimes it doesn't. We tried decreasing the heap size, but it does not have much of an effect: it makes the GC take a bit less time, but makes it happen a bit more often.
The data is actually fairly small, about 30G in total, but the documents and queries are very complex. This is a 5-node cluster; the nodes have 32G of RAM with 22G assigned to the ES heap.
I know the manual says we should not touch the JVM GC settings but I feel
we might have to. Does anyone have any idea how to prevent these garbage
collections from ever happening?
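A first step before changing anything is usually to make the pauses visible: turn on GC logging so each collection shows up with a timestamp, duration, and cause. A minimal sketch for a Java 7/8-era JVM (the log path and the `ES_JAVA_OPTS` hook are assumptions; adjust them to your install):

```shell
# Hypothetical sketch: enable verbose GC logging for the Elasticsearch JVM.
# These are pre-Java 9 HotSpot flags; the log path is an assumption.
export ES_JAVA_OPTS="$ES_JAVA_OPTS \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:/var/log/elasticsearch/gc.log"
```

With this in place, the log shows whether the long stops are old-generation (CMS/full) collections and how much heap each one actually reclaims.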
You could use G1 GC for nicer behavior with regard to application stop times, but before tinkering with GC, it would be better to check whether you have set up caching, and whether it is possible to clear caches or reconfigure ES.
Jörg
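For anyone who wants to try the G1 suggestion, switching collectors is just a JVM flag. A sketch, assuming the stock startup script picks up `ES_JAVA_OPTS` and that you replace (not mix with) the default CMS flags:

```shell
# Hypothetical sketch: run the Elasticsearch JVM with G1 instead of the default CMS.
# MaxGCPauseMillis is a soft pause-time target, not a guarantee.
export ES_JAVA_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=200"
```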
Hi Jörg, thanks for your reply.
What do you mean by whether we have set up caching? We do not have any special caching configuration; we use the defaults. How do you suggest we reconfigure ES? That is what I am trying to find out.
All best,
Michal
You said you have very complex documents and queries, and a 22 GB heap. Without knowing more about your queries and filters, it is hard to comment. There is default query/filter caching in some cases.
Jörg
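One quick experiment that follows from this: clear the caches on the cluster and watch whether heap usage drops afterwards. A sketch against the 1.x-era REST API (host and port are assumptions, and this needs a running node, so treat it as a CLI fragment rather than a script):

```shell
# Hypothetical sketch: clear the filter and field data caches (ES 1.x API),
# then check how much heap the caches hold per node.
curl -XPOST 'http://localhost:9200/_cache/clear?filter=true&field_data=true'
curl 'http://localhost:9200/_nodes/stats/indices/filter_cache,fielddata?pretty'
```

If heap stays high after clearing, the pressure is coming from somewhere other than the caches.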
+1 for using G1GC. In addition, I would suggest not trying to fine-tune GC settings. If you have stop-the-world old-generation GCs taking 20+ seconds, you have a more fundamental issue at play. I speak from experience on that: we had similar issues, and no amount of JVM/GC tuning could mask the fact that we simply didn't have enough memory.
If you aren't already doing so, look at the amount of heap used by the filter and field caches. Are you capping them? If you aren't, expensive queries could saturate your entire heap. Along the same lines, keep tabs on your evictions. ES provides granular metrics, so you can look at both filter and field cache evictions.
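The caps Chris mentions are, in the 1.x settings, `indices.fielddata.cache.size` and `indices.cache.filter.size` in elasticsearch.yml. To keep tabs on evictions, the counters live in the nodes-stats response; a minimal sketch of pulling them out (the response shape follows the 1.x stats format; the sample numbers below are made up for illustration):

```python
# Hypothetical sketch: extract filter/field cache evictions from an ES 1.x
# _nodes/stats response, so rising eviction counts are easy to spot over time.

def cache_evictions(nodes_stats):
    """Return {node_name: (filter_cache_evictions, fielddata_evictions)}."""
    out = {}
    for node in nodes_stats["nodes"].values():
        indices = node["indices"]
        out[node["name"]] = (
            indices["filter_cache"]["evictions"],
            indices["fielddata"]["evictions"],
        )
    return out

# Illustrative fixture, not real cluster output:
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "indices": {
                "filter_cache": {"memory_size_in_bytes": 104857600, "evictions": 42},
                "fielddata": {"memory_size_in_bytes": 734003200, "evictions": 0},
            },
        }
    }
}

print(cache_evictions(sample))  # {'node-1': (42, 0)}
```

A steadily climbing eviction count means the cap is being hit and queries are churning the cache, which shows up as GC pressure.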
Field and filter caches are not the problem, I think; they occupy only a minority of the memory. The garbage collection in fact frees up a lot of memory, so I think the problem is that the standard GC, which is supposed to run continuously, cannot keep up. I will give G1 a try, though I have seen in several places that it's not recommended because it's not stable enough.
Michal
I'm interested in knowing more about G1 GC stability in Java 8, so I can apply fixes to my production cluster, which has been running stable for months with G1 GC.
All I know of are sporadic failures of the Lucene 5 codec (which is under development and not released in ES) and a rare failure of a random JUnit test on http://jenkins.elasticsearch.org (maybe a double free pointer), but these do not seem to have been escalated to the OpenJDK issue tracker, so I cannot verify whether the cause is G1 GC or not.