Logstash 7.5 GCs too frequently to work

That feels a lot like a bug. Your jmap output shows a lot of memory used by org.logstash.FieldReference. Does it seem remotely possible that your JSON could have 17 million different field names? I would expect that to exceed Elasticsearch's mapping limits (index.mapping.total_fields.limit defaults to 1000).

That said, when I run this configuration

input { generator { count => 3 lines => [ '' ] } }
filter {
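    # each event sets a field name that has never been seen before (field1, field2, ...)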
    ruby { init => '@n = 0' code => '@n += 1; event.set("[@metadata][json]", "{ \"field#{@n}\": 0 }")' }
    json { source => "[@metadata][json]" }
}
output { stdout { codec => dots } }
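
For anyone reproducing this, configuration reloading is what keeps the JVM alive for inspection once the generator pipeline finishes; I am assuming an invocation along these lines (the config file name is a placeholder):

bin/logstash -f repro.conf --config.reload.automatic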

then once Logstash has started and the generator pipeline has terminated (the process stays up because configuration reloading is enabled) I see

tenured generation total 194360K, used 116616K [0x00000000e5ea0000, 0x00000000f1c6e000, 0x0000000100000000)
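
For reference, that line is the JVM's own heap summary for the tenured generation. On JDK 9 or newer the same summary can be printed on demand with (pid being the Logstash Java process):

jcmd <pid> GC.heap_info

The exact generation names in the output depend on which garbage collector is in use.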

That is the baseline to get Logstash up and running. If I change the count from 3 to 300000, that changes to

tenured generation total 217760K, used 206162K [0x00000000e5ea0000, 0x00000000f3348000, 0x0000000100000000)

So tenured usage has grown by roughly 90 MB, which works out to something on the order of 300 bytes per new field name. If I change 300000 to 300001 (forcing a pipeline reload) that changes to

tenured generation total 335940K, used 202347K [0x00000000e5ea0000, 0x00000000fa6b1000, 0x0000000100000000)

so re-using field names the process has already seen does not significantly grow the tenured usage (the reloaded pipeline re-creates almost exactly the same set of names). If I change 300001 to 600001 then I get

tenured generation total 340276K, used 277891K [0x00000000e5ea0000, 0x00000000faaed000, 0x0000000100000000)

so again roughly 75 MB more tenured usage for 300,000 new field names. I don't know why Logstash wants to remember that it has, in the past, seen a field name, but it feels like a performance optimization.
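To picture what that optimization might look like: the heap dump below points at a ConcurrentHashMap, which is consistent with a process-wide parse cache keyed by field-name string and never evicted. The following is only a sketch of that pattern under my assumptions, not Logstash's actual code; the class and method names are invented:

import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a global map from every distinct field-name string ever
// parsed to its parsed form. Repeat lookups are cheap cache hits, but
// every new name adds an entry that is never removed, so distinct names
// accumulate in tenured space for the life of the process.
final class FieldRefCacheSketch {
    private static final ConcurrentHashMap<String, String[]> CACHE = new ConcurrentHashMap<>();

    static String[] parse(String reference) {
        // computeIfAbsent parses a name on first sight and caches the result
        return CACHE.computeIfAbsent(reference, r ->
                Arrays.stream(r.split("[\\[\\]]+"))
                      .filter(s -> !s.isEmpty())
                      .toArray(String[]::new));
    }
}

A cache like that would explain both observations above: reloading the pipeline with names the process has already seen costs nothing (pure cache hits), while each batch of brand-new names permanently grows the map.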

If I change the count to 500000 then I am able to get a heap dump, and MAT (Eclipse Memory Analyzer) tells me that over half the heap is used by leak suspect #1:

The class org.logstash.FieldReference, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xe5ea4330, occupies 121,394,832 (50.39%) bytes. The memory is accumulated in one instance of java.util.concurrent.ConcurrentHashMap$Node[], loaded by <system class loader>, which occupies 119,199,664 (49.48%) bytes.
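
For anyone who wants to repeat the analysis: a live heap dump can be captured with something like the following and then opened in MAT (again, pid is the Logstash Java process):

jmap -dump:live,format=b,file=logstash.hprof <pid>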


And after spending a couple of hours finding all that out, I see that there is an open issue for this, and a PR (that still needs work) to fix it.
