Hi All,
I am running elastic 5.5 and my set up is having 3 client node, [data: false, master:false] and I am sending bulk request of 15000 from Logstash to client nodes.
I have:
3 Master
3 Client
10 Data
1 Logstash
I am getting these GC logs and then the client node is not responding and hence my cronjob is restarting the elastic process:
[2017-10-31T10:18:06,109][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][348] overhead, spent [619ms] collecting in the last [1s]
[2017-10-31T10:18:07,225][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][349] overhead, spent [877ms] collecting in the last [1.1s]
[2017-10-31T10:18:08,527][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][350] overhead, spent [1.1s] collecting in the last [1.3s]
[2017-10-31T10:18:09,577][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][351] overhead, spent [864ms] collecting in the last [1s]
[2017-10-31T10:18:10,748][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][352] overhead, spent [965ms] collecting in the last [1.1s]
[2017-10-31T10:18:11,749][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][353] overhead, spent [845ms] collecting in the last [1s]
[2017-10-31T10:18:12,750][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][354] overhead, spent [403ms] collecting in the last [1s]
[2017-10-31T10:18:20,922][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][362] overhead, spent [360ms] collecting in the last [1.1s]
[2017-10-31T10:18:22,934][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][364] overhead, spent [255ms] collecting in the last [1s]
[2017-10-31T10:18:23,935][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][365] overhead, spent [401ms] collecting in the last [1s]
[2017-10-31T10:18:24,936][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][366] overhead, spent [323ms] collecting in the last [1s]
[2017-10-31T10:18:36,107][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][old][370][4] duration [7.2s], collections [1]/[8s], total [7.2s]/[7.3s], memory [14.2gb]->[13.9gb]/[14.9gb], all_pools {[young] [25.9mb]->[5.9mb]/[399.4mb]}{[survivor] [49.8mb]->[0b]/[49.8mb]}{[old] [14.1gb]->[13.9gb]/[14.5gb]}
[2017-10-31T10:18:36,108][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][370] overhead, spent [7.2s] collecting in the last [8s]
[2017-10-31T10:18:46,756][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][old][375][6] duration [6s], collections [1]/[6.6s], total [6s]/[13.4s], memory [14.6gb]->[14.5gb]/[14.9gb], all_pools {[young] [286.8mb]->[32.7mb]/[399.4mb]}{[survivor] [49.8mb]->[0b]/[49.8mb]}{[old] [14.3gb]->[14.5gb]/[14.5gb]}
[2017-10-31T10:18:46,758][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][375] overhead, spent [6.2s] collecting in the last [6.6s]
[2017-10-31T10:18:52,991][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][old][376][7] duration [6s], collections [1]/[6.2s], total [6s]/[19.4s], memory [14.5gb]->[14.8gb]/[14.9gb], all_pools {[young] [32.7mb]->[389.4mb]/[399.4mb]}{[survivor] [0b]->[0b]/[49.8mb]}{[old] [14.5gb]->[14.5gb]/[14.5gb]}
[2017-10-31T10:18:52,991][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][376] overhead, spent [6s] collecting in the last [6.2s]
[2017-10-31T10:19:01,204][INFO ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][old][377][8] duration [8.1s], collections [1]/[8.2s], total [8.1s]/[27.6s], memory [14.8gb]->[14.9gb]/[14.9gb], all_pools {[young] [389.4mb]->[399.4mb]/[399.4mb]}{[survivor] [0b]->[44.1mb]/[49.8mb]}{[old] [14.5gb]->[14.5gb]/[14.5gb]}
[2017-10-31T10:19:01,204][WARN ][o.e.m.j.JvmGcMonitorService] [client-esinjest1] [gc][377] overhead, spent [8.1s] collecting in the last [8.2s]
Here is my heap details from jmap -heap
Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 16106127360 (15360.0MB)
   NewSize                  = 523436032 (499.1875MB)
   MaxNewSize               = 523436032 (499.1875MB)
   OldSize                  = 15582691328 (14860.8125MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 471138304 (449.3125MB)
   used     = 355704016 (339.2257843017578MB)
   free     = 115434288 (110.08671569824219MB)
   75.49885309261545% used
Eden Space:
   capacity = 418840576 (399.4375MB)
   used     = 303406288 (289.3507843017578MB)
   free     = 115434288 (110.08671569824219MB)
   72.43956421261345% used
From Space:
   capacity = 52297728 (49.875MB)
   used     = 52297728 (49.875MB)
   free     = 0 (0.0MB)
   100.0% used
To Space:
   capacity = 52297728 (49.875MB)
   used     = 0 (0.0MB)
   free     = 52297728 (49.875MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 15582691328 (14860.8125MB)
   used     = 9416535272 (8980.307838439941MB)
   free     = 6166156056 (5880.504661560059MB)
   60.4294538972209% used
16820 interned Strings occupying 2487936 bytes.