Logstash 5 not working with multiline filter


#1

Hello,

I have a working configuration with Logstash 2.4.0 like this:

input {
  file {
    path => "/path/*"
    type => "type"
    start_position => "beginning"
  }
}
filter {
  multiline {
    pattern => "^<group"
    negate => "true"
    what => "previous"
  }
[...]

After migrating to Logstash 5, I've noticed that the pipeline crashes because of the multiline {} filter.
I can't use the multiline codec because of its 500-line limit.

Is this a known issue? What are the alternatives?
Can Elasticsearch 5 run with Logstash 2.4.0? My tests suggest it can, but a confirmation would be appreciated :slight_smile:


(Magnus Bäck) #2

I can't use the multiline codec because of its 500-line limit.

But that can be increased with the max_lines option, right?
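Something like this is what I have in mind (a sketch based on your 2.4 config; the 20000 is just a placeholder, size it to your longest document):

input {
  file {
    path => "/path/*"
    type => "type"
    start_position => "beginning"
    codec => multiline {
      pattern => "^<group"
      negate => "true"
      what => "previous"
      max_lines => 20000    # default is 500
    }
  }
}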

Can Elasticsearch 5 run with Logstash 2.4.0?

Yes: https://www.elastic.co/support/matrix#show_compatibility


#3

Hello @magnusbaeck,

Yes, I've tried the multiline codec approach with a larger max_lines value.
However, Logstash crashed with the following error:

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1.hprof ...
Unable to create java_pid1.hprof: Permission denied
The signal INT is in use by the JVM and will not work correctly on this platform
Error: Your application used more memory than the safety cap of 1G.
Specify -J-Xmx####m to increase it (#### = cap size in MB).
Specify -w for full OutOfMemoryError stack trace

First things first: I'm running Logstash in a Docker container.

Question 1: in your opinion, was the OutOfMemoryError caused by a memory leak in the multiline codec?
Question 2: is there something I can change to deal with this problem?
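
If raising the cap is the way to go, my understanding (not verified yet) is that with Logstash 5 the heap is set in config/jvm.options, which I could override inside my container, e.g.:

# config/jvm.options (Logstash 5): raise the heap cap from the default 1g
-Xms2g
-Xmx2g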

Thank you


(Magnus Bäck) #4

in your opinion, was the OutOfMemoryError caused by a memory leak in the multiline codec?

I'd rather suspect a bad configuration that causes Logstash to buffer lines in memory while waiting for that last line that'll "release" the buffered event, but that line never comes.
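
If that is what's happening, a possible safety valve (assuming I remember the codec options correctly) is auto_flush_interval, which flushes whatever has been buffered after a period with no new lines:

codec => multiline {
  pattern => "^<group"
  negate => "true"
  what => "previous"
  max_lines => 20000
  auto_flush_interval => 5   # emit a pending event after 5 seconds of silence
}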


#5

I don't think so, because the same set of logs was processed correctly with the multiline filter under Logstash 2.4.0.
What would you suggest for analyzing this further?


(Magnus Bäck) #6

Is it emitting events correctly?


#7

I've analyzed the /var/log/logstash/logstash_plain.log file.
First of all, since I was on my testing machine, I enabled the DEBUG logging level.
This caused the .log file to grow fairly big, about 1.5 GB. I don't know if it could have been the cause of the issue here. BTW, is there a way to logrotate these logs? I would like to keep debugging while avoiding these huge files.
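(My guess is that the config/log4j2.properties file shipped with Logstash 5 could be given a size-based rolling policy, roughly like the untested sketch below, assuming a rolling file appender is already defined there; the appender name may differ.)

# config/log4j2.properties (untested sketch): roll the plain log at 100 MB, keep 5 files
appender.rolling.policies.type = Policies
appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.rolling.policies.size.size = 100MB
appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.max = 5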

However, from logstash_plain.log, AFAICT everything went fine for a number of files. After the discovery of the files I see the pattern:

[2016-11-06T17:17:34,469][DEBUG][logstash.inputs.file     ] Received line
[2016-11-06T17:17:34,469][DEBUG][logstash.codecs.multiline] Multiline

And then periodically

[2016-11-06T17:16:54,711][DEBUG][logstash.instrument.collector] Collector: Sending snapshot to observers

Which I think is the normal behaviour since Elasticsearch correctly received all events.

At a certain point, about 15 minutes before the pipeline stopped, I see the first ERROR message:

[2016-11-06T18:06:51,369][ERROR][logstash.instrument.periodicpoller.jvm] PeriodicPoller: exception

This is followed by a number of the same ERRORs and then, at the very end of the file:

[2016-11-06T18:20:10,297][ERROR][logstash.pipeline ] Exception in pipelineworker, the pipeline stopped processing new events, please check your filter configuration and restart Logstash.

Thanks again!


#8

I'm periodically getting the same error with logging level = INFO as well:

[2016-11-07T18:21:46,606][ERROR][logstash.instrument.periodicpoller.jvm] PeriodicPoller: exception {:poller=>#<LogStash::Instrument::PeriodicPoller::JVM:0x3faabf8c @task=#<Concurrent::TimerTask:0x2306e195 @observers=#<Concurrent::Collection::CopyOnNotifyObserverSet:0x3d2811a3 @observers={#<LogStash::Instrument::PeriodicPoller::JVM:0x3faabf8c ...>=>:update}>, @timeout_interval=60.0, @running=#<Concurrent::AtomicBoolean:0x73644e85>, @StoppedEvent=#<Concurrent::Event:0x677a0335 @set=false, @iteration=0>, @execution_interval=1.0, @do_nothing_on_deref=true, @run_now=nil, @freeze_on_deref=nil, @executor=#<Concurrent::SafeTaskExecutor:0x1aa2a370 @task=#<Proc:0x24569427@/usr/share/logstash/logstash-core/lib/logstash/instrument/periodic_poller/base.rb:52>, @exception_class=StandardError>, @StopEvent=#<Concurrent::Event:0x7517f64e @set=false, @iteration=0>, @value=nil, @copy_on_deref=nil, @dup_on_deref=nil>, @peak_threads=22, @peak_open_fds=3574, @metric=#<LogStash::Instrument::Metric:0x42c32072 @collector=#<LogStash::Instrument::Collector:0x1c4e64b5 @agent=nil, @metric_store=#<LogStash::Instrument::MetricStore:0x4d75b279 @store=#<Concurrent::Map:0x5d92e05c @default_proc=nil>, @structured_lookup_mutex=#<Mutex:0x177dc826>, @fast_lookup=#<Concurrent::Map:0x4ce6ba0b @default_proc=nil>>, @observer_state=true, @snapshot_task=#<Concurrent::TimerTask:0x2ce02f94 @observers=#<Concurrent::Collection::CopyOnNotifyObserverSet:0x66baaa54 @observers={#<LogStash::Instrument::Collector:0x1c4e64b5 ...>=>:update}>, @timeout_interval=600.0, @running=#<Concurrent::AtomicBoolean:0x6630a55b>, @StoppedEvent=#<Concurrent::Event:0x4b01aa86 @set=false, @iteration=0>, @execution_interval=1.0, @do_nothing_on_deref=true, @run_now=nil, @freeze_on_deref=nil, @executor=#<Concurrent::SafeTaskExecutor:0x785a41b8 @task=#<Proc:0x1c53b964@/usr/share/logstash/logstash-core/lib/logstash/instrument/collector.rb:87>, @exception_class=StandardError>, @StopEvent=#<Concurrent::Event:0x7f2acdc8 @set=false, @iteration=0>, @value=false, @copy_on_deref=nil, @dup_on_deref=nil>>>, @options={:polling_interval=>1, :polling_timeout=>60}>, :result=>nil, :exception=>#<Concurrent::TimeoutError: Concurrent::TimeoutError>, :executed_at=>2016-11-07 18:21:46 +0000}


(Magnus Bäck) #9

What did you set max_lines to? If you have a file output it's easy to see if it stops in the same place each time and, if so, if there's something wrong with the input that follows.
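
Something along these lines next to your elasticsearch output would do (the path is just an example):

output {
  file {
    path => "/tmp/logstash-multiline-debug.log"
  }
}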


#10

My XML files are like this:

<level1 property1a="x" property1b="x">
  <level2 property2="y" property2b="x">
    <level3 property3="x" property2b="x">
    </level3>
  </level2>
</level1>

A <level1> tag can span over 50000 rows. My max_lines during the last Logstash runs was 10000.

If needed, I could set the multiline pattern to work on level2 (which is significantly smaller, about 100 rows), but then I would lose the property1 values.


(Magnus Bäck) #11

10k lines shouldn't cause the heap to balloon so there's something fishy here. Have you verified that the XML parser can cope with XML documents containing >50k elements? I'd be more worried about that.


#12

Hi @magnusbaeck, how can I check whether the XML parser can cope with XML documents containing >50k elements?

Thank you :blush:


(Magnus Bäck) #13

I'm not sure how to go about it. I'd perhaps try writing a small Ruby script that uses the same XML parser to load one of your files. That way you could easily get it to slurp a large file in one go and check its memory characteristics.
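
A rough sketch (I'm assuming the xml filter's xpath support uses Nokogiri under the hood, so that's the parser worth exercising; run it under JRuby if you want to mimic Logstash's environment):

# xml_load_test.rb: parse one big file in one go and watch the process's memory (RSS) while it runs
require 'nokogiri'

doc = File.open(ARGV[0]) { |f| Nokogiri::XML(f) }
puts "parsed #{doc.xpath('//*').size} elements"
sleep 60   # keep the process alive so its memory usage can be inspected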


#14

Hi @magnusbaeck, I hope I'll be able to do what you're suggesting, and I really thank you for your time.

Please note, though, that my Logstash configuration also involves a split filter. I don't know at this point whether this matters.

The XML is like my previous example, and in my Logstash configuration I'm splitting one of the levels (the huge one) like this:

filter {
  xml {
    source => "message"
    store_xml => "false"
    remove_namespaces => "true"
    xpath => ["level1/level2/level3", "level3"]
  }
  mutate { remove_field => ["message"] }
  # Split because I want only one level3 item per event.
  if [level3][1] { split { field => "[level3]" } }
  xml {
    source => "level3"
    store_xml => "false"
    [...] # the rest of the story

Is the split filter safe in this context (it could generate thousands of events)?
Otherwise, the only viable option I see would be to reduce the length of my XML files via a script outside Logstash and then parse them.


(Magnus Bäck) #15

I don't know the code well enough to be able to say anything authoritative. You'll have to try it out. Over and out.


#16

Indeed :smiley:

May I thank you again for your time!


(system) #17

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.