Hey Everyone,
I am having an issue when I output to a syslog server using the CEF codec from the logs we receive from Filebeat. Multiple individual CEF logs end up combined into one log, so sometimes you can get 3 different logs in one CEF packet. This has been confirmed by running tcpdump on the server receiving the logs. Is there a way to break these up? Below are the separate CEF logs being treated as one individual log event.
The interestingMatches bit is needed because the parentheses used for alternation in (CEF:0|$) also make it a capture group, so there are two capture groups for every match and we only want the first one: (CEF:0.*?)
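To illustrate the two-capture-group point, here is a minimal standalone sketch in plain Ruby, outside of Logstash. The sample string and the variable names (`message`, `matches`, `interesting_matches`) are made up for illustration and are not taken from the original filter.

```ruby
# Hypothetical sample: three CEF events concatenated into one string.
message = "CEF:0|Vendor|Product|1.0|100|first|5| " \
          "CEF:0|Vendor|Product|1.0|101|second|5| " \
          "CEF:0|Vendor|Product|1.0|102|third|5|"

# With capture groups in the pattern, scan returns one array per match,
# holding every group: [text of the event, text the lookahead matched].
matches = message.scan(/(CEF:0.*?)(?=(CEF:0|$))/)

# Keep only the first group of each match and trim the separating space.
interesting_matches = matches.map { |groups| groups[0].strip }

puts interesting_matches
```

Each entry of `interesting_matches` is then one CEF event on its own, which is the part worth keeping.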
I don't think your output codec will do what you want but that's a separate question.
.scan searches the string for occurrences of the regular expression. The regexp matches "CEF:0" plus non-greedy additional text up until it finds an occurrence of either "CEF:0" or $ (the end of the string). The non-greedy part is important:
/(CEF:0.*?)(?=(CEF:0|$))/
will capture four matches. It saves the match the first time the lookahead hits.
/(CEF:0.*)(?=(CEF:0|$))/
will capture the entire string in one match. It does not stop and save the match until the last time the lookahead hits (at end of string).
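The difference between the two patterns can be checked in plain Ruby. This is only a sketch: the `sample` string is a made-up stand-in with four concatenated events, not the poster's real data.

```ruby
# Hypothetical stand-in for a message holding four concatenated CEF events.
sample = "CEF:0|a CEF:0|b CEF:0|c CEF:0|d"

# Non-greedy: each match stops at the first position where the lookahead succeeds.
lazy = sample.scan(/(CEF:0.*?)(?=(CEF:0|$))/)

# Greedy: .* runs to the end of the string, where the lookahead succeeds on $.
greedy = sample.scan(/(CEF:0.*)(?=(CEF:0|$))/)

puts lazy.length    # 4
puts greedy.length  # 1
```

With the greedy pattern the single match is the whole input, so nothing gets split.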
I am so sorry. I forgot to mention that my filter is taking a non-CEF event and converting it to CEF format, and that's where the trouble is. It seems to put a bunch of them together.
The log files look like the below, and my filter follows.
You have not set a delimiter, and the code defaults to not using one (the default value for that field is an empty string). If whatever you are sending data to expects delimited messages then you will need to set it. And given that TCP is a stream-based protocol, it is pretty much certain the receiver expects a delimiter.
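As a rough illustration of why the delimiter matters on a stream, the plain-Ruby sketch below mimics what a line-oriented receiver would see. It does not use Logstash at all; the event strings are invented, and the newline is just an assumed delimiter that the receiver might expect.

```ruby
# Hypothetical encoded events, as they might be written one after another.
events = [
  "CEF:0|Vendor|Product|1.0|100|first|5|",
  "CEF:0|Vendor|Product|1.0|101|second|5|"
]

# Empty delimiter: the bytes on the stream simply run together.
undelimited = events.join("")

# Assumed newline delimiter appended after each event.
delimited = events.map { |event| event + "\n" }.join

# A receiver that splits the stream on newlines:
puts undelimited.lines.length  # 1 (both events look like a single message)
puts delimited.lines.length    # 2 (one message per event)
```

Without a delimiter the receiver has no boundary to split on, which matches what tcpdump showed.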