Garbage collection

After about 24 to 36 hours the heap fills up on the 7 nodes in our
Elasticsearch cluster. The old generation is full, and collection times
spike up to 25+ second pauses.

Do y'all have any JVM tuning recommendations?

For example:

[2013-10-14 10:19:05,058][WARN ][monitor.jvm ] [node4]
[gc][ConcurrentMarkSweep][248210][3] duration [25.5s], collections
[1]/[26.3s], total [25.5s]/[25.7s], memory [13gb]->[12.9gb]/[15.9gb],
all_pools {[Code Cache] [11.9mb]->[11.9mb]/[48mb]}{[Par Eden Space]
[125.7mb]->[3.4mb]/[382.7mb]}{[Par Survivor Space] [47.8mb]->[0b]/[47.8mb]}{[CMS
Old Gen] [12.8gb]->[12.8gb]/[15.5gb]}{[CMS Perm Gen]
[41.8mb]->[41.3mb]/[84mb]}

[2013-10-17 07:24:53,816][INFO ][monitor.jvm ] [node6]
[gc][ConcurrentMarkSweep][90318][48] duration [6s], collections [1]/[6.3s],
total [6s]/[5.1m], memory [15.7gb]->[15.4gb]/[15.9gb], all_pools {[Code
Cache] [12.2mb]->[12.2mb]/[48mb]}{[Par Eden Space]
[227.7mb]->[294kb]/[382.7mb]}{[Par Survivor Space] [47.8mb]->[0b]/[47.8mb]}{[CMS
Old Gen] [15.5gb]->[15.4gb]/[15.5gb]}{[CMS Perm Gen]
[41.1mb]->[41.1mb]/[84mb]}

Java settings:

-Xms16384m -Xmx16384m -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError

Heap details:

  • JVM version is 17.0-b16
  • using parallel threads in the new generation.
  • using thread-local object allocation.
  • Concurrent Mark-Sweep GC
  • Heap Configuration:
  • MinHeapFreeRatio = 40
  • MaxHeapFreeRatio = 70
  • MaxHeapSize = 17179869184 (16384.0MB)
  • NewSize = 21757952 (20.75MB)
  • MaxNewSize = 501612544 (478.375MB)
  • OldSize = 65404928 (62.375MB)
  • NewRatio = 7
  • SurvivorRatio = 8
  • PermSize = 21757952 (20.75MB)
  • MaxPermSize = 88080384 (84.0MB)
  • Heap Usage:
  • New Generation (Eden + 1 Survivor Space):
  • capacity = 451477504 (430.5625MB)
  • used = 429000296 (409.1265640258789MB)
  • free = 22477208 (21.435935974121094MB)
  • 95.02141129937672% used
  • Eden Space:
  • capacity = 401342464 (382.75MB)
  • used = 390832008 (372.72644805908203MB)
  • free = 10510456 (10.023551940917969MB)
  • 97.38117519505735% used
  • From Space:
  • capacity = 50135040 (47.8125MB)
  • used = 38168288 (36.400115966796875MB)
  • free = 11966752 (11.412384033203125MB)
  • 76.130961499183% used
  • To Space:
  • capacity = 50135040 (47.8125MB)
  • used = 0 (0.0MB)
  • free = 50135040 (47.8125MB)
  • 0.0% used
  • concurrent mark-sweep generation:
  • capacity = 16678256640 (15905.625MB)
  • used = 16592262064 (15823.614181518555MB)
  • free = 85994576 (82.01081848144531MB)
  • 99.48439109760575% used
  • Perm Generation:
  • capacity = 74588160 (71.1328125MB)
  • used = 43132472 (41.13433074951172MB)
  • free = 31455688 (29.99848175048828MB)
  • 57.827505062465676% used
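(The heap breakdown above is standard jmap heap output; assuming the Sun JDK,
it was presumably collected with something like the following, where <es-pid>
is the Elasticsearch process id:)

jmap -heap <es-pid>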

  • JVM stats:

Heap Used: 15.6gb
Heap Committed: 15.9gb
Non Heap Used: 53.4mb
Non Heap Committed: 83.6mb
JVM Uptime: 25 hours, 24 minutes, 13 seconds
Thread Count/Peak: 1744 / 1751
GC Count: 22343
GC Time: 17 minutes, 55 seconds and 739 milliseconds
Java Version: 1.6.0_21
JVM Vendor: Sun Microsystems Inc.
JVM: Java HotSpot(TM) 64-Bit Server VM

  • Indices stats:

Documents: 3220688140
Documents Deleted: 0
Store Size: 1528.2gb
Index Req Total: 337592403
Delete Req Total: 0
Get Req Total: 0
Get(Exists) Total: 0
Get(Missing) Total: 0
Query Total: 52895
Fetch Total: 4136
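(For reference, the JVM and indices numbers above come from the stats APIs;
depending on the ES version, something like the following should return them:)

curl 'http://localhost:9200/_nodes/stats?pretty'   # per-node JVM, GC and heap stats
curl 'http://localhost:9200/_stats?pretty'         # cluster-wide indices stats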


You said that you are using a 1.7 JVM, right?
But we can see:

Java Version: 1.6.0_21

So what is the real version you are using?

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs


Sorry for any confusion, I am using Java version 1.6.0_21.


You wrote:

JVM version is 17.0-b16

That's why I was confused.
Could you update to Java 7?

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 18 Oct 2013, at 17:37, shift brian.malia@etrade.com wrote:

JVM version is 17.0-b16



Hi Brian

You want to disable swapping, either by turning swap off altogether or by
using mlockall.
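For example, something along these lines (treat it as a sketch and adjust
for your distro):

sudo swapoff -a          # disable swap on the running box
                         # (also remove/comment the swap entries in /etc/fstab)

# or, if you keep swap around, lock the ES heap in RAM instead:
# elasticsearch.yml:
#   bootstrap.mlockall: true
# and allow the ES user to lock memory before starting the node:
ulimit -l unlimited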

clint


I already use mlockall; it is set to true:

bootstrap.mlockall: true
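(Note that bootstrap.mlockall: true can fail silently if the process is not
allowed to lock memory; a rough way to double-check, assuming a Linux box,
with the exact warning text and log path depending on the version/install:)

ulimit -l unlimited                                      # must apply to the ES user before startup
grep -i "unable to lock" /var/log/elasticsearch/*.log    # ES warns at startup if mlockall failed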


Hi,

Some ideas:

  • try another GC (G1?) with the latest version of Java 7
  • share your ES settings
  • I wonder how heap growth correlates with growth in some other ES
    metrics. Elasticsearch monitoring will tell us/you.
  • 3.2B docs with just a 16GB heap? How many servers, how much RAM?
  • Re the swap stuff Clinton mentioned, there is also swappiness, which you
    can set to 0 for less aggressive swapping if you don't want to turn swap
    off completely (see the example below)
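A sketch of what I mean, assuming a Linux box:

sudo sysctl vm.swappiness=0                                # takes effect immediately
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf    # persists across reboots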

Otis

Solr & Elasticsearch Support -- http://sematext.com/
Performance Monitoring -- Sematext


Hi

  • try another GC (G1?) with the latest version of Java 7

Please don't use G1 garbage collection - we see crashes with it.

  • Re the swap stuff Clinton mentioned, there is also swappiness you can set to 0
    for less aggressive swapping if you don't want to turn it off completely

You want swap turned off completely. Even small amounts of swap affect
garbage collection badly. Memory usage of ES is controlled by the heap
size, so there is no need for swap.

Clint


Upgraded Elasticsearch to 0.90.7 and Java to 1.7u45; performance is much
better and the heap is not running out of memory.

However, nodes are still being removed from the cluster due to long garbage
collections (1.5m). I have increased the ping timeout until this can be
addressed.
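(The fault-detection ping settings look roughly like this in elasticsearch.yml;
the values here are only illustrative:)

discovery.zen.fd.ping_timeout: 60s   # default 30s (matches the "maximum [30s] timeout" in the log below)
discovery.zen.fd.ping_retries: 6     # default 3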

The cluster setup:

  • 7 nodes
  • 16G heap size
  • hourly indices (13 TB total)
    -- sizes range from 33.7gb (94,108,626 docs) to 48.2gb (131,902,014 docs)
    -- 1 replica

Relevant settings:

bootstrap.mlockall: true
index.indices.memory.index_buffer_size : 50%
index.replication : async
index.refresh_interval : 180s
index.routing.allocation.total_shards_per_node : 2
index.translog.flush_threshold_ops : 50000
index.store.compress.stored : true
index.number_of_shards : 4
index.cache.field.type: node
indices.cache.filter.size: 40%
index.fielddata.cache: node
indices.fielddata.cache.size: 40%

For example:

[2013-11-21 10:44:38,818][INFO ][cluster.service ] [node_master]
removed {[node_client][egfpsadf][inet[node_client]]{master=true},}, reason:
zen-disco-node_failed([node_client][egfpsadf][inet[node_client]]{master=true}),
reason failed to ping, tried [3] times, each with maximum [30s] timeout

[2013-11-21 10:44:39,494][WARN ][monitor.jvm ] [node_client]
[gc][ConcurrentMarkSweep][92071][58] duration [1.5m], collections
[1]/[1.5m], total [1.5m]/[1.9m], memory [12.2gb]->[10.9gb]/[15.8gb],
all_pools {[Code Cache] [15.5mb]->[15.5mb]/[48mb]}{[Par Eden Space]
[1.4gb]->[4.1mb]/[1.4gb]}{[Par Survivor Space]
[191.3mb]->[0b]/[191.3mb]}{[CMS Old Gen] [10.6gb]->[10.9gb]/[14.1gb]}{[CMS
Perm Gen] [35.2mb]->[34.7mb]/[82mb]}

[2013-11-21 10:44:39,533][INFO ][discovery.zen ] [node_client]
master_left [[node_master][AAAzzzBBBZZA][inet[/node_ip]]{master=true}],
reason [do not exists on master, act as master failure]


That's a long time for GC. Check how many threads are being used for GC
and consider increasing the number. You can see them if you do a thread dump.

Could you benefit from DocValues? See "Doc values integration" by jpountz,
pull request #3829 in elastic/elasticsearch on GitHub.

Check the DocValues-related slides in the "Solr for Analytics" presentation
from Sematext; one of them compares JVM heap usage with and without
DocValues.
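Roughly, once you are on a release that includes that doc values work, the
mapping for a heap-heavy field would look something like this (1.x-style
syntax, and "my_date_field" is just a placeholder):

"my_date_field" : {
  "type" : "date",
  "doc_values" : true
}

That stores the column-stride field data on disk at index time instead of
building it on the JVM heap.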

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


The load you put on the garbage collector depends on your indexing and your
queries, and on how they use caching. You cannot expect tuning advice without
giving us a chance to know more details about the indexing and the queries
stressing the heap.

A first impression:

  • index.indices.memory.index_buffer_size : 50% --> why only 50%?

  • indices.cache.filter.size: 40% --> should be increased

  • indices.fielddata.cache.size: 40% --> should be increased

Although you have set up 16G for your heap, collecting objects from 12G down
to 10G takes 1.5 minutes. This looks like you are putting very large amounts
of data on the heap constantly.

This can be improved by running GC more frequently and by adjusting maximum
heap usage. The standard GC config in ES starts GC before the heap is full,
to reduce the maximum GC time.
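For example, with the CMS setup already in use, lowering the initiating
occupancy makes the concurrent collection start earlier (the value here is
only illustrative):

-XX:CMSInitiatingOccupancyFraction=65 -XX:+UseCMSInitiatingOccupancyOnly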

You have several options:

  • reduce stress on the heap by improving your indexing and queries and
    configuring your caching right

  • do not reduce the stress, but run smaller heaps (your 16GB heap is very
    large if you want quick GC from the standard GC algorithm)

  • you can also add more nodes, so you have more CPU and smaller heaps to
    distribute the GC stress

If you really want large heaps, you could experiment with G1 GC, because the
cause of the G1 crashes (the GNU Trove library) has been removed in the
0.90.7 release.
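If you try that, the flags would roughly be the following, replacing the
ParNew/CMS flags shown earlier in the thread (the pause target is illustrative):

-XX:+UseG1GC -XX:MaxGCPauseMillis=200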

As said before, such extra-long GC times can also be caused by outside events
that have nothing to do with ES or the JVM, like slow I/O when swapping or a
RAID rebuild, just to complete the list of reasons for slow GC.

Jörg


I am using Logstash to store log messages in Elasticsearch and Kibana 3 to
query the log messages.

The analyzer for the message field is nonword.

I can increase index_buffer_size, filter.size and cache.size -- do you
recommend any particular values? Won't this increase the heap usage?

For example, @timestamp is usually the largest consumer:

"total" : {

  "fielddata" : {

    "memory_size" : "11.4gb",

    "memory_size_in_bytes" : 12340496114,

    "evictions" : 172,

    "fields" : {

      "message" : {

        "memory_size" : "0b",

        "memory_size_in_bytes" : 0

      },

      "@fields.log" : {

        "memory_size" : "403mb",

        "memory_size_in_bytes" : 422602936

      },

      "@fields.deployment" : {

        "memory_size" : "277.7mb",

        "memory_size_in_bytes" : 291245354

      },

      "@timestamp" : {

        "memory_size" : "10.8gb",

        "memory_size_in_bytes" : 11626647824

      }

    }

  }

}

}
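(For reference, this kind of per-field breakdown comes from the fielddata
stats; on 1.x the call is roughly the following, and the 0.90 syntax may
differ slightly:)

curl 'http://localhost:9200/_stats/fielddata?fields=*&pretty'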
