Hi folks,
A similar issue happened on my side. With the default Logstash JVM settings (-Xms256m -Xmx1g), the memory leak occurred on every PROD server. For instance, one of my web servers has 32 GB of memory in total, and the java.exe process had grown to occupy 25 GB of it after 2 months; in the end I had to mitigate the issue by killing the java process manually.
I looked into this problem today and tried to fix it by changing the LS_MIN_MEM/LS_MAX_MEM parameters as below; however, it does not seem to work, and the Windows memory is still leaking.
I have no idea how to resolve it at this point. I guess my only option for now is to create a scheduled task that restarts the Logstash service every day or week (see the sketch at the end of this post).
C:\logstash\bin\setup.bat
if "%LS_MIN_MEM%" == "" (
set LS_MIN_MEM=400m
)
if "%LS_MAX_MEM%" == "" (
set LS_MAX_MEM=400m
)
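Since setup.bat only applies those defaults when the variables are empty, the same limits could presumably also be set as machine-level environment variables instead of editing the file. A minimal sketch from an elevated prompt (whether the service wrapper actually passes these through to setup.bat is an assumption about how Logstash was installed as a service):
setx LS_MIN_MEM 400m /M
setx LS_MAX_MEM 400m /M
rem restart the Logstash service afterwards so the new values take effect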
With both Xms and Xmx set to 400m, I would expect the working set to stay within 400 MB * 1024 = 409,600 KB, but both Working Set and Private Working Set grow far beyond that.
The GC logging is as follows:
Java HotSpot(TM) 64-Bit Server VM (25.45-b02) for windows-amd64 JRE (1.8.0_45-b15), built on Apr 30 2015 12:40:44 by "java_re" with MS VC++ 10.0 (VS2010)
Memory: 4k page, physical 33541208k(28359468k free), swap 37733676k(32562904k free)
CommandLine flags: -XX:CMSInitiatingOccupancyFraction=75 -XX:+CMSParallelRemarkEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:InitialHeapSize=419430400 -XX:InitialTenuringThreshold=1 -XX:MaxHeapSize=419430400 -XX:MaxNewSize=139812864 -XX:MaxTenuringThreshold=1 -XX:NewSize=139812864 -XX:OldPLABSize=16 -XX:OldSize=279617536 -XX:+PrintClassHistogram -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:SurvivorRatio=8 -XX:ThreadStackSize=2048 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:-UseLargePagesIndividualAllocation -XX:+UseParNewGC
0.705: Total time for which application threads were stopped: 0.0001910 seconds, Stopping threads took: 0.0000241 seconds
...
4.182: Total time for which application threads were stopped: 0.0001699 seconds, Stopping threads took: 0.0000342 seconds
4.264: [GC (CMS Initial Mark) [1 CMS-initial-mark: 0K(273088K)] 52455K(395968K), 0.0343662 secs] [Times: user=0.09 sys=0.00, real=0.03 secs]
4.299: Total time for which application threads were stopped: 0.0351564 seconds, Stopping threads took: 0.0001477 seconds
4.299: [CMS-concurrent-mark-start]
4.300: [CMS-concurrent-mark: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
4.300: [CMS-concurrent-preclean-start]
4.302: [CMS-concurrent-preclean: 0.002/0.002 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
4.302: [CMS-concurrent-abortable-preclean-start]
4.375: Total time for which application threads were stopped: 0.0005524 seconds, Stopping threads took: 0.0000842 seconds
...
4299.401: Total time for which application threads were stopped: 0.0194531 seconds, Stopping threads took: 0.0000322 seconds
4308.326: [GC (Allocation Failure) 4308.326: [ParNew
Desired survivor size 6979584 bytes, new threshold 1 (max 1)
- age 1: 320960 bytes, 320960 total
: 109771K->469K(122880K), 0.0115409 secs] 191521K->82289K(395968K), 0.0116299 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
4308.338: Total time for which application threads were stopped: 0.0120255 seconds, Stopping threads took: 0.0000568 seconds
4317.235: [GC (Allocation Failure) 4317.235: [ParNew
Desired survivor size 6979584 bytes, new threshold 1 (max 1)
- age 1: 376192 bytes, 376192 total
: 109717K->422K(122880K), 0.0114861 secs] 191537K->82310K(395968K), 0.0115837 secs] [Times: user=0.09 sys=0.00, real=0.02 secs]
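For what it's worth, the GC log above shows the heap dropping back to roughly 82 MB after each ParNew collection, so the growth appears to be outside the Java heap. A diagnostic sketch to confirm that, assuming a full JDK is available on the box (jcmd does not ship with a plain JRE) and that -XX:NativeMemoryTracking=summary can be appended to the JVM options in setup.bat:
rem find the Logstash java.exe PID and its current working set
tasklist /FI "IMAGENAME eq java.exe" /FO TABLE
rem ask the JVM how much memory it has reserved/committed per category
rem (replace <pid> with the PID reported by tasklist)
jcmd <pid> VM.native_memory summary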
OS: Windows Server 2003 and Windows Server 2008 R2
Logstash Version: logstash-1.5.4
JDK: jre1.8.0_60
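For reference, the restart workaround I mentioned, roughly sketched (the service name "logstash", the script path, and the 03:00 schedule are assumptions; adjust them to however the service was installed, e.g. via NSSM):
rem C:\logstash\restart-logstash.bat  (hypothetical helper script)
net stop logstash
net start logstash

rem register it as a daily task from an elevated prompt
schtasks /Create /TN "RestartLogstash" /TR "C:\logstash\restart-logstash.bat" /SC DAILY /ST 03:00 /RU SYSTEM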