Multiline codec and csv filter performance

Hi,
Could someone please explain to me how the multiline codec works (with a pattern) together with the CSV filter?

In my case I have a problem handling 26,473,906 lines for one of the patterns. Even though I cranked the configuration up to a 32 GB heap and 48 workers, my Logstash does not handle this event.

Current config:

- pipeline.id: test_db
  path.config: "/usr/share/logstash/pipeline/pipeline_test_db.yml"
  pipeline.workers: 48
  pipeline.batch.size: 2000
  pipeline.batch.delay: 50
  pipeline.ordered: auto
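For reference, the number of in-flight events implied by this config is workers × batch size, and with multiline events that each aggregate millions of lines, even a single event can dominate the heap. A quick back-of-envelope calculation (the average line size is an illustrative assumption, not measured from your data):

```python
workers = 48
batch_size = 2000
in_flight_events = workers * batch_size  # events Logstash may hold in memory at once

# If one multiline event were to aggregate ~26.5 million lines
# of ~80 bytes each (assumed average), that single event alone is:
lines = 26_473_906
avg_line_bytes = 80  # illustrative assumption
event_bytes = lines * avg_line_bytes

print(in_flight_events)     # 96000
print(event_bytes / 2**30)  # ~1.97 GiB for one event
```

This is only a sketch, but it suggests that raising the heap or the worker count cannot compensate for a single event of that size.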

As I understand it, the multiline codec sends the accumulated bulk of data when it finds the line that ends a specific pattern.
How do I make it slice the data and send it in parts, freeing up memory? Is this possible?
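For context, the multiline codec buffers lines into a single event until its pattern condition is met, so everything between two matches lives in memory as one event. A minimal sketch of such an input (the path and pattern here are illustrative, not taken from your pipeline):

```
input {
  file {
    path => "/tmp/example.log"
    codec => multiline {
      # Any line that does NOT start with "# " is appended to the
      # event opened by the previous "# " line.
      pattern => "^# "
      negate => true
      what => "previous"
    }
  }
}
```

Note that the codec also has max_lines and max_bytes settings that cap how much a single event may buffer; events exceeding them are flushed and tagged, which is the closest the codec gets to "sending in parts".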

You need to share your configuration: the contents of the /usr/share/logstash/pipeline/pipeline_test_db.yml file.

But if you have a multiline event where the start and end patterns are 26 million lines apart, I doubt that this will work in Logstash. You need to find another way to deal with a file this big; you will probably need some pre-processing to slice it into smaller files.
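Such pre-processing can be done before Logstash ever sees the file. A minimal sketch in Python that splits a huge file into fixed-size chunks (the chunk size and the ".partNNN" naming are illustrative choices, not anything Logstash requires):

```python
def split_file(src, chunk_lines=1_000_000):
    """Split src into files of at most chunk_lines lines each.

    Writes src.part001, src.part002, ... and returns the number
    of parts written, so Logstash never has to buffer millions
    of lines into a single event.
    """
    part = 0
    out = None
    written = 0
    with open(src) as f:
        for line in f:
            # Open a new part file at the start and whenever the
            # current part reaches the line limit.
            if out is None or written >= chunk_lines:
                if out:
                    out.close()
                part += 1
                out = open(f"{src}.part{part:03d}", "w")
                written = 0
            out.write(line)
            written += 1
    if out:
        out.close()
    return part
```

Each part file can then be picked up by the file input individually, keeping any multiline buffering bounded by the chunk size.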

Hi @leandrojmp, this is the pipeline mentioned.
It was crazy, but I even tried a 65 GB heap size and it still ended up failing with an out-of-memory error, of course.


warning: thread "[npdb]>worker38" terminated with exception (report_on_exception is true):
warning: thread "[npdb]>worker26" terminated with exception (report_on_exception is true):


java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
        at java.util.ArrayList.<init>(java/util/ArrayList.java:154)
        at org.logstash.common.LsQueueUtils.drain(org/logstash/common/LsQueueUtils.java:78)
        at org.logstash.ext.JrubyMemoryReadClientExt.readBatch(org/logstash/ext/JrubyMemoryReadClientExt.java:83)
        at org.logstash.execution.WorkerLoop.run(org/logstash/execution/WorkerLoop.java:82)
        at jdk.internal.reflect.GeneratedMethodAccessor62.invoke(jdk/internal/reflect/GeneratedMethodAccessor62)
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(jdk/internal/reflect/DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(java/lang/reflect/Method.java:566)
        at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(org/jruby/javasupport/JavaMethod.java:441)
        at org.jruby.javasupport.JavaMethod.invokeDirect(org/jruby/javasupport/JavaMethod.java:305)
        at java.lang.invoke.LambdaForm$DMH/0x00007f167d5c9440.invokeVirtual(java/lang/invoke/LambdaForm$DMH)
        at java.lang.invoke.LambdaForm$MH/0x00007f167d5c7440.invoke(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.LambdaForm$MH/0x00007f167d5c5c40.reinvoke(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.LambdaForm$MH/0x00007f167d5c5440.guard(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.LambdaForm$MH/0x00007f167d5c5c40.reinvoke(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.LambdaForm$MH/0x00007f167d5c5440.guard(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.Invokers$Holder.linkToCallSite(java/lang/invoke/Invokers$Holder)
        at usr.share.logstash.logstash_minus_core.lib.logstash.java_pipeline.start_workers(/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:300)
        at java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(java/lang/invoke/DirectMethodHandle$Holder)
        at java.lang.invoke.LambdaForm$MH/0x00007f163ab0a440.invoke(java/lang/invoke/LambdaForm$MH)
        at java.lang.invoke.Invokers$Holder.invokeExact_MT(java/lang/invoke/Invokers$Holder)
        at org.jruby.RubyProc.call(org/jruby/RubyProc.java:318)
        at java.lang.Thread.run(java/lang/Thread.java:829)
warning: thread "Ruby-0-Thread-134: :1" terminated with exception (report_on_exception is true):
warning: thread "Ruby-0-Thread-136: :1" terminated with exception (report_on_exception is true):
warning: thread "Ruby-0-Thread-147: :1" terminated with exception (report_on_exception is true):

The file being processed is structured as shown below, so you can see that it has a huge number of rows in the DNs section:

Line         1: # snapshot,66760085,20220721044503
Line         2: # Network Entities
Line      3440: # Imsis
Line      3441: # DNs
Line 26477347: # DN Blocks
Line 26479335: # Numbers
Line 26798461: # Number Blocks
Line 26798462: # 20220721051007
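Given that the snapshot has clear section markers (lines starting with "# "), one option is to pre-split it at those markers before feeding it to Logstash, so each section becomes its own file. A sketch under that assumption (the ".sectionNNN" naming is illustrative):

```python
def split_by_sections(src):
    """Write each '# ...'-delimited section of src to its own file.

    Assumes every section begins with a line starting with '# ',
    as in the snapshot listing above. Returns the list of section
    files written.
    """
    files = []
    out = None
    with open(src) as f:
        for line in f:
            if line.startswith("# "):
                # A marker line opens a new section file.
                if out:
                    out.close()
                name = f"{src}.section{len(files):03d}"
                files.append(name)
                out = open(name, "w")
            if out:
                out.write(line)
    if out:
        out.close()
    return files
```

The 26-million-line DNs section would still be one large file, so it may need a further line-count split, but the other sections become trivially small.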

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.