I would like to ask a question about the multiline function of Logstash.
How does it work?
When does it "compile" the lines into an event?
I noticed that during execution the events stay separate, and only when it is done executing, i.e. when it reaches the end of the file, does it compile what you specified into a single event. Is that right?
I am asking this because when I run my configuration locally (which is faster than on the server), it immediately shows the lines compiled into one event.
So what I am asking is: when does multiline compile the lines into one singular event?
I suspect it is once there are no more updates to the file. If not, why?
It's hard to understand what you're asking. The multiline filter/codec will publish events containing multiple lines as soon as it can, i.e. as soon as it has read the first line of the next event (the exact behavior depends on the multiline configuration). When Logstash reaches the end of the file things get trickier: there is no next event to wait for, so Logstash could end up waiting forever. If you enable the codec's auto_flush_interval option, it'll send whatever it has collected if nothing happens for a while.
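For example, with a timestamp-anchored pattern, an input section using the codec might look roughly like this (the path and pattern here are illustrative, not taken from your setup):

```conf
input {
  file {
    # Illustrative path, adjust to your log file
    path => "/var/log/app/trace.log"
    codec => multiline {
      # Any line that does NOT start with a "[d/m/yy ..." style timestamp
      # is appended to the previous event
      pattern => "^\[\d+/\d+/\d+"
      negate => true
      what => "previous"
      # Flush a pending event if nothing new arrives for 2 seconds,
      # so the last event in the file isn't held back forever
      auto_flush_interval => 2
    }
  }
}
```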
As in my pattern above: if a line has a timestamp pattern then it starts a new event, otherwise it belongs to the same event. So new events are delimited by the datestamps.
Your suggestion of auto_flush_interval sounds right; I think that is what I need, but how do I implement it?
I placed it in the multiline filter but I keep getting an error when compiling, and I also tried it in the codec in the output part of the configuration.
I noticed that when I run it locally it displays correctly because my local copy of Logstash is version 2.3.2, while our server's version is 2.0.0.
How do I use the multiline filter if I am going to use LS 2.0.0?
The periodic_flush option seems to be available in LS 2.0, so it's surprising that it doesn't work. The multiline filter is being deprecated, so I suggest you switch to the codec.
Why would you run an old version of Logstash? Note that you can upgrade plugins separately from the rest of Logstash.
I have applied the codec and it works. However, there is a problem, though it is not an error.
The trigger is a series of XML tags. These tags are numerous, around 3,000, so an Out of Memory error is inevitable if I do not limit them.
What happens is that once an event is tagged multiline_codec_max_lines_reached, every subsequent event is also tagged multiline_codec_max_lines_reached.
When the XML tags end, the next line starts with a datestamp, yet it is still treated as part of the same event, so it keeps reaching max_lines. For example:
[7/12/16 9:32:11:830 GMT] 000001f8 traceLogServi I
TraceLogMessage : 20160531-162719: The process to generate the SSC output file has started...
[7/12/16 9:32:11:998 GMT] 0000c6c5 BCSSCTriggerP I
Started checking the scheduled configuration to trigger the SSC output file generation...
[7/12/16 9:32:11:998 GMT] 0000c6c5 BCSSCTriggerP I
Setting the next schedule to null
The XML example I have is too long to post here, and I see that only image files can be uploaded. Do you have an email address I can send the .txt file with the XML example to? Or is there a way to get it to you here on the forum?
Okay, I don't understand what the problem is. If I raise max_lines sufficiently, Logstash happily joins all lines into a single message. Here's your codec configuration but with max_lines added:
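A sketch of what that might look like, with the pattern and limit filled in illustratively:

```conf
codec => multiline {
  # New events start with a timestamp; everything else is
  # joined to the previous event (illustrative pattern)
  pattern => "^\[\d+/\d+/\d+"
  negate => true
  what => "previous"
  # High enough to cover the longest multi-line dump (illustrative value)
  max_lines => 20000
}
```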
Yes, it does look good. The thing is, my XML tags run to about 136,905 lines.
I used Notepad++ to count the lines.
So is it safe to declare 150,000 as max_lines? With that volume I think an Out of Memory error is inevitable. I would like to set it to 1,000, but once an event exceeds 1,000 lines, all subsequent lines get lumped into a single event even though they should not be.
You have a 150k line XML file in your logs? If you want to parse that with Logstash it should be possible but you may have to bump its heap size. It depends on the memory characteristics of the XML parser and how long each line is.
But on your end, when your data exceeds max_lines, does the next event get tagged multiline_codec_max_lines_reached even though it shouldn't? Because that is what is happening on my end.
For the example above I set max_lines to 10. This log excerpt comes after my XML event. The thing is, it was treated as exceeding max_lines even though it clearly does not: it matches the pattern and should have been segregated into separate events according to the timestamp pattern.
So once an event exceeds max_lines, every subsequent event WILL exceed max_lines, even though it should not.
That's not what I'm seeing with Logstash 2.3.4. If I run cat data data data | /opt/logstash/bin/logstash -f test.config I'm getting two messages to stdout, both tagged multi_tagged but not multiline_codec_max_lines_reached.
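For reference, a test.config along these lines reproduces that setup (the pattern and the multi_tagged tag name are assumptions of this sketch):

```conf
input {
  stdin {
    codec => multiline {
      # A line starting with a timestamp begins a new event (assumed pattern)
      pattern => "^\[\d+/\d+/\d+"
      negate => true
      what => "previous"
      max_lines => 10
      # Tag joined events so they're easy to spot (assumed tag name)
      multiline_tag => "multi_tagged"
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
```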