Long GC Pauses

Vahid · December 3, 2013, 3:52pm

Hi all,

There are 50 indexes, each with 3 primary shards and 1 replica. Some
threads run every 15 minutes to search and index a few documents (each
thread processes at most 10 docs).

After some days, ES gets into long GC pauses and eventually ends up with a
split-brain problem.

From bigdesk we can see that more than 60% of the heap is not used.

The only thing we think could be a problem is the many manual refresh
calls, but I'm not sure.

4 GB is assigned to the ES process; Xms and Xmx are set to the same value.

Total system memory is 12GB

3 other Java processes use almost 4 GB.

ES version: 0.90.3

Java version: 1.7.0_25

ES JVM configuration:

JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"

JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"

JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"

JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

JAVA_OPTS="$JAVA_OPTS -XX:+UseCondCardMark"

JAVA_OPTS="$JAVA_OPTS -XX:+UseTLAB"

JAVA_OPTS="$JAVA_OPTS -XX:+CMSClassUnloadingEnabled"

JAVA_OPTS="$JAVA_OPTS -XX:MaxGCPauseMillis=10000"

ES configuration:

index.cache.filter.max_size: 10

index.store.throttle.type: merge

index.compound_format: false

index.cache.field.expire: 1m

index.merge.policy.merge_factor: 30

index.cache.filter.expire: 1m

index.refresh_interval: -1

index.number_of_replicas: 1

index.version.created: 200599

index.store.throttle.max_bytes_per_sec: 5mb

index.number_of_shards: 3

index.translog.flush_threshold_period: 60s

index.merge.policy.use_compound_file: false

index.store.compress.stored: true

index.cache.field.type: resident

index.indices.memory.index_buffer_size: 20%

bootstrap.mlockall is not configured yet, but I think there is no problem
with memory swapping atm.

Can someone help?

Thanks in advance,

Vahid

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/51a19985-4f36-44af-b4dd-b2fb27556717%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

radu_gheorghe · December 3, 2013, 4:13pm

Hi Vahid,

I can't say what your problem is (maybe someone else has an insight - all
your settings look fine to me), but here are some "leads":

- it would be interesting to know if switching to the G1 garbage collector
  would help
- maybe upgrading your JVM would help, even though yours is pretty fresh
- it would be interesting to see how your memory pool and garbage
  collection is doing over time. SPM for Elasticsearch
  (http://sematext.com/spm/elasticsearch-performance-monitoring/) can
  help you with that, and there's a free plan that should be good enough
  for diagnostics. With this information, you'll probably be able to tune
  your GC settings for shorter pauses (maybe share some graphs here and I'm
  sure someone will give you useful hints)

Best regards,
Radu

--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

spinscale · December 4, 2013, 7:56am

Hey,

If your total system memory is 12GB, with 4GB of heap and 4GB for the other
Java processes, there are only 4GB left for the file system cache. This is
pretty easy to fill if you are doing quite a few searches on that machine.
So this makes setting bootstrap.mlockall crucial. If a garbage collection
has to go to disk/swap in order to collect garbage, I am not surprised it
is very slow.
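
The arithmetic above can be written out as a quick sanity check. This is
only a sketch using the round numbers quoted in this thread; the function
name is made up for illustration, and it ignores JVM off-heap overhead and
the operating system's own needs, so the real headroom is smaller.

```python
# Rough memory budget for the host described in this thread.
# All figures are the approximate ones quoted in the posts.
TOTAL_RAM_GB = 12    # total system memory
ES_HEAP_GB = 4       # Xms = Xmx for the Elasticsearch process
OTHER_JVMS_GB = 4    # "almost 4 GB" used by the 3 other Java processes


def page_cache_budget_gb(total_gb, es_heap_gb, other_gb):
    """Upper bound on the RAM left for the OS file system cache.

    Hypothetical helper: it only subtracts the known consumers, so the
    result is an optimistic ceiling, not a measurement.
    """
    return total_gb - es_heap_gb - other_gb


left = page_cache_budget_gb(TOTAL_RAM_GB, ES_HEAP_GB, OTHER_JVMS_GB)
print(f"At most {left} GB left for the file system cache")
```

If the indexes on disk are much larger than that ceiling, searches will keep
evicting cached pages, which is the pressure described above.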

Are there any specific reasons you set all these additional JVM options?
There is no dynamic JVM language involved, so why run with
CMSClassUnloadingEnabled? Can you try with the defaults first, before
tuning, in order to eliminate those as a source of these problems? The same
goes for pause times and TLAB thread allocation. I don't see a special
setup here that would make the standard settings a bad choice.

Also, upgrading your JVM should be postponed, as newer versions have
problems with Lucene which are not fixed yet; I would stay with the
current one. I would not recommend using G1, but you are free to try.
There are people reporting speedups, but there are at least as many people
reporting JVM crashes.

Last, using nodes stats and graphing the output might make sense here. Take
a close look at fielddata, or maybe a part of the heap space other than the
oldgen pool is under pressure. See

--Alex
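
One way to act on the nodes stats suggestion without installing anything is
a small script that reduces the stats JSON to a few numbers per node, which
can then be logged and graphed. The sketch below is only an illustration:
the URL and the JSON field names (jvm.mem.heap_used_in_bytes,
jvm.gc.collectors.*.collection_time_in_millis) are assumptions about the
response layout and may differ between Elasticsearch versions, and SAMPLE
is made-up data, not output from the cluster in this thread.

```python
import json
import urllib.request

# Assumed endpoint; adjust host, port and path to your cluster and version.
STATS_URL = "http://localhost:9200/_nodes/stats?jvm=true"


def fetch_stats(url=STATS_URL):
    """Fetch node stats as a dict (plain HTTP GET, no authentication)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_jvm(stats):
    """Return one row per node: heap used and cumulative GC time per collector."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        jvm = node.get("jvm", {})
        heap_used = jvm.get("mem", {}).get("heap_used_in_bytes", 0)
        collectors = jvm.get("gc", {}).get("collectors", {})
        gc_millis = {
            name: c.get("collection_time_in_millis", 0)
            for name, c in collectors.items()
        }
        rows.append({
            "node": node.get("name", node_id),
            "heap_used_mb": heap_used // (1024 * 1024),
            "gc_millis": gc_millis,
        })
    return rows


# Made-up response fragment, shaped like the assumed layout above.
SAMPLE = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "jvm": {
                "mem": {"heap_used_in_bytes": 1073741824},
                "gc": {"collectors": {
                    "ParNew": {"collection_time_in_millis": 1200},
                    "ConcurrentMarkSweep": {"collection_time_in_millis": 45000},
                }},
            },
        }
    }
}

for row in summarize_jvm(SAMPLE):
    print(row)
```

Against a live cluster, summarize_jvm(fetch_stats()) would be called on a
timer instead. The GC times are cumulative, so the difference between two
samples is the time spent collecting during that interval.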

Vahid · December 4, 2013, 10:44am

Hi,

Many thanks Radu and Alex for your replies,

At the moment I'm not allowed to install any application on the customer's
system, so using SPM is not an option for me.
I've created a screenshot of one of the nodes; maybe it gives you more info.

Best regards,
Vahid

Vahid · December 4, 2013, 10:51am

On this cluster (the one the graphs are from), bootstrap.mlockall=true is
configured, and the top command shows that 0 swap is used.
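
The swap figure read from top can also be checked from a script, which
makes it easier to log over several days. This is a minimal sketch assuming
a Linux host where /proc/meminfo has SwapTotal and SwapFree lines; the
helper name and the SAMPLE text are made up for illustration.

```python
def swap_used_kb(meminfo_text):
    """Return swap in use (kB) from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.strip():
            # The first token after the colon is the numeric value.
            values[key.strip()] = int(rest.split()[0])
    return values.get("SwapTotal", 0) - values.get("SwapFree", 0)


# Made-up excerpt in the /proc/meminfo format.
SAMPLE = (
    "MemTotal:       12288000 kB\n"
    "SwapTotal:       2097148 kB\n"
    "SwapFree:        2097148 kB\n"
)

print(f"swap used: {swap_used_kb(SAMPLE)} kB")
```

On the real host the text would come from open("/proc/meminfo").read().
Like the number in top, this is system-wide, so it covers the other Java
processes as well as Elasticsearch.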
