GC overhead while indexing data using Logstash

System config:
RAM: 16 GB
Hard drive: 1 TB

I am trying to index around 2.5 crore (25 million) documents using a multi-pipeline configuration in Logstash.

I am running nine pipeline IDs, which create 45 indices on a single node with 5 shards each.
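For reference, the multi-pipeline setup is driven by pipelines.yml, roughly like this (the pipeline IDs and config paths below are placeholders, not my real ones):

    # pipelines.yml (sketch)
    - pipeline.id: pipeline_1
      path.config: "/etc/logstash/conf.d/pipeline_1.conf"
    - pipeline.id: pipeline_2
      path.config: "/etc/logstash/conf.d/pipeline_2.conf"
    # ... and so on up to pipeline_9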

I am facing a GC overhead issue every time.

Elasticsearch version 6.6.1 with a 6 GB Java heap.
Logstash version 6.4 with a 2 GB Java heap.
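Both heaps are set in the respective config/jvm.options files, roughly:

    # Elasticsearch jvm.options
    -Xms6g
    -Xmx6g

    # Logstash jvm.options
    -Xms2g
    -Xmx2g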

After running for 10 hours and indexing around 1.4 crore (14 million) documents, I hit the GC overhead issue.

Logstash keeps waiting for some time and then terminates with an error.

Please suggest how to overcome this issue and what optimal configuration I should keep to avoid this kind of problem.

What number of shards per index is recommended?

Hi Rajesh,

This sounds like you have too many shards for the configured heap and index size. Having 5 shards per index on a single node is too many; a single shard can easily handle an index of roughly up to 50 GB. Each GB of heap should hold no more than 20 shards, ideally fewer. With 45 indices × 5 shards = 225 shards on a 6 GB heap, you are well above that ~120-shard guideline.

You can find details on sharding at https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster, and on heap configuration at https://www.elastic.co/blog/a-heap-of-trouble.
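For example, you could default new indices to a single primary shard with an index template; the template name and index pattern below are only placeholders, and on a single node you may also want zero replicas, since replica shards could not be allocated anyway:

    PUT _template/single_shard_default
    {
      "index_patterns": ["*"],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0
      }
    }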

Furthermore, it sounds like you are running Logstash and Elasticsearch on the same host. Ideally you want them on separate machines so they do not compete for resources; all our recommendations assume each service runs exclusively on its host.

Thanks for the reply.

I will implement the suggestions.

Please suggest how to reduce the number of shards without losing data, as my data is on a production server.

Is there a relation between the Logstash heap and the Elasticsearch heap?

Suppose Logstash is configured with a larger heap than Elasticsearch; would that cause an issue with Elasticsearch performance?

I have already gone through the heap blog, but it left me a bit confused. When I configure Elasticsearch with a 6 GB heap I still face the GC overhead issue, even though the blog recommends staying below roughly 26 GB on a 64-bit system with a 64-bit JVM (i.e. below the compressed oops cutoff).

Please suggest a solution.

My Logstash stops indexing and freezes, stuck waiting.

This is my Elasticsearch GC log:

2019-03-08T17:26:58.442+0530: 243.065: [GC (Allocation Failure) 2019-03-08T17:26:58.442+0530: 243.065: [ParNew
Desired survivor size 17891328 bytes, new threshold 4 (max 6)
- age   1:   12015200 bytes,   12015200 total
- age   2:    1522784 bytes,   13537984 total
- age   3:    1825904 bytes,   15363888 total
- age   4:    5056304 bytes,   20420192 total
: 298431K->26680K(314560K), 0.0133668 secs] 622321K->350570K(1013632K), 0.0137152 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2019-03-08T17:26:58.442+0530: 243.079: Total time for which application threads were stopped: 0.0149138 seconds, Stopping threads took: 0.0001347 seconds
2019-03-08T17:27:05.505+0530: 250.127: Total time for which application threads were stopped: 0.0007369 seconds, Stopping threads took: 0.0002275 seconds
2019-03-08T17:27:07.536+0530: 252.158: Total time for which application threads were stopped: 0.0006658 seconds, Stopping threads took: 0.0002669 seconds
2019-03-08T17:27:08.271+0530: 252.906: [GC (Allocation Failure) 2019-03-08T17:27:08.271+0530: 252.906: [ParNew
Desired survivor size 17891328 bytes, new threshold 1 (max 6)
- age   1:   20196320 bytes,   20196320 total
- age   2:    6142344 bytes,   26338664 total
- age   3:    1480720 bytes,   27819384 total
- age   4:     930296 bytes,   28749680 total
: 306296K->34944K(314560K), 0.0264000 secs] 630186K->364259K(1013632K), 0.0267273 secs] [Times: user=0.38 sys=0.02, real=0.03 secs] 
2019-03-08T17:27:08.304+0530: 252.933: Total time for which application threads were stopped: 0.0276465 seconds, Stopping threads took: 0.0001388 seconds
2019-03-08T17:27:18.499+0530: 263.134: [GC (Allocation Failure) 2019-03-08T17:27:18.499+0530: 263.134: [ParNew
Desired survivor size 17891328 bytes, new threshold 1 (max 6)
- age   1:   23220392 bytes,   23220392 total
: 314560K->34944K(314560K), 0.0403785 secs] 643875K->384231K(1013632K), 0.0407082 secs] [Times: user=0.25 sys=0.00, real=0.05 secs] 
2019-03-08T17:27:18.546+0530: 263.175: Total time for which application threads were stopped: 0.0416168 seconds, Stopping threads took: 0.0001401 seconds
2019-03-08T17:27:20.546+0530: 265.168: Total time for which application threads were stopped: 0.0005934 seconds, Stopping threads took: 0.0002224 seconds
2019-03-08T17:27:21.546+0530: 266.168: Total time for which application threads were stopped: 0.0006728 seconds, Stopping threads took: 0.0002245 seconds
2019-03-08T17:27:22.593+0530: 267.215: Total time for which application threads were stopped: 0.0319508 seconds, Stopping threads took: 0.0313531 seconds
2019-03-08T17:27:23.609+0530: 268.231: Total time for which application threads were stopped: 0.0007938 seconds, Stopping threads took: 0.0004373 seconds
2019-03-08T17:27:24.484+0530: 269.111: [GC (Allocation Failure) 2019-03-08T17:27:24.484+0530: 269.111: [ParNew
Desired survivor size 17891328 bytes, new threshold 6 (max 6)
- age   1:    8828208 bytes,    8828208 total
: 314560K->21526K(314560K), 0.0290089 secs] 663847K->389215K(1013632K), 0.0293343 secs] [Times: user=0.19 sys=0.02, real=0.03 secs]

Please suggest how to reduce the number of shards without losing data, as my data is on a production server.

That's where the Shrink API comes in handy
(https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-shrink-index.html and https://www.elastic.co/blog/resizing-elasticsearch-shards-for-fun-and-profit)
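As a rough sketch (the index and node names are placeholders): make the source index read-only and ensure a copy of every shard sits on one node, then shrink it into a new single-shard index. Once you have verified the new index, you can delete the original and point an alias at the shrunken one.

    PUT /my_index/_settings
    {
      "index.routing.allocation.require._name": "my_node",
      "index.blocks.write": true
    }

    POST /my_index/_shrink/my_index_shrunk
    {
      "settings": {
        "index.number_of_shards": 1
      }
    }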

Suppose Logstash is configured with a larger heap than Elasticsearch; would that cause an issue with Elasticsearch performance?

No, those are not directly related; it depends on the actual usage.

You say you used 6 GB from the beginning; this does not seem sufficient, especially with the high number of shards you are creating.
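You can keep an eye on how full the heap actually gets while indexing, for example:

    GET _cat/nodes?v&h=name,heap.percent,heap.current,heap.max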

I have now narrowed my shards down to 1 shard per index, and upgraded my RAM to 64 GB.

I am still facing one issue while indexing the data, with either a single pipeline or multiple pipelines: after indexing around 15% of the data, Logstash goes into a waiting state and keeps waiting, and then a pipeline-terminated message flashes on the cmd window. I cannot understand why I still face the same issue after upgrading the system; previously I thought it might be a RAM problem, but what should I do now?

I tried both the default and a custom config, but I see the same issue.

I have 64 GB of RAM; how much memory (heap) should I allocate?

I am running Elasticsearch and Logstash on the same machine.

Please advise me; I am in big trouble.

Hey Rajesh, sorry for the late reply, I've been off for a while.

The recommendations for JVM heap configuration in Elasticsearch generally assume the service has the node to itself. You can find an in-depth discussion of the parameters and configuration at https://www.elastic.co/blog/a-heap-of-trouble. Be sure to stay at around 31 GB at most so you do not lose compressed object pointers.
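As a rough sketch only: if Elasticsearch had the 64 GB machine to itself, setting the heap in config/jvm.options to around half the RAM, and below the compressed-oops cutoff, would be a reasonable starting point. The exact figure is an assumption you need to verify for your workload, and it should be lower if Logstash shares the host:

    -Xms26g
    -Xmx26g

At startup Elasticsearch logs whether compressed ordinary object pointers are in use, so you can confirm you are still below the cutoff.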

Regarding the Logstash performance issues, we have guidance on how to investigate further at https://www.elastic.co/guide/en/logstash/current/performance-troubleshooting.html and https://www.elastic.co/guide/en/logstash/master/tuning-logstash.html.
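The main knobs to experiment with live in logstash.yml; the values below are illustrative only (workers defaults to the number of CPU cores, and 125 and 50 are the 6.x defaults for batch size and delay):

    pipeline.workers: 4
    pipeline.batch.size: 125
    pipeline.batch.delay: 50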

What you describe sounds a bit like either Logstash or Elasticsearch running out of memory and garbage collecting.
There are some considerations around this as well in https://discuss.elastic.co/t/logstash-heap-size-vs-elasticsearch-heap-size/133662.

Ideally you would separate the two services and configure each node individually, following the best practices in the linked documentation.

All the best
