Skip first few lines in file

tweetybird · December 30, 2015, 10:24pm

Is it possible to skip (i.e. not send to logstash) lines in a log file?

The log I'm working with has 4 lines at the top which I would like to ignore and not send to logstash. Is this possible?

Joshua_Rich · December 30, 2015, 11:26pm

You could match the lines with a regexp, then use the drop filter. So say the lines are comments starting with #, then this filter snippet will remove them:

if [message] =~ /^#/ {
  drop { }
}

tweetybird · December 31, 2015, 12:08am

Hi Joshua,

Thanks for the tip.

Unfortunately each line is different. Will I need 4 different ifs or can I just use an OR and put in all my conditions?

In terms of the syntax, does it matter if this goes before or after the "if" grok?

I have:
filter { if [type] == "mylogtype" { grok {....} } }

P.S. How do you get that pretty looking code in your post?

Joshua_Rich · December 31, 2015, 12:17am

Hey @tweetybird,

Yep, sounds like you'll need a few OR's, so something like:

if [message] =~ /^match1/ or [message] =~ /^match2/ ...

This should go before your grok, because all filters and conditional blocks are evaluated in the order they appear.

To get the pretty formatting, use three ` on a single line at the beginning and end of your code block.

tweetybird · December 31, 2015, 1:10am

Hi Joshua,

That seems to have done the trick. The very first line in the file is still getting picked up, but that seems to be related to an encoding problem, as it starts with some UTF-16 marker.

In the logstash logs I see the first character on the first line is showing as <U+FEFF>

From Google it seems like that might be utf-16be, but when I tried that, filebeat didn't seem to work correctly (this is on Windows).

What I have now in the filebeat.yml is utf-16le. Before it was set to plain and logstash logs had all kinds of \0000\0004 type things in the logs for the message.

I also tried utf-16bom-be as shown in the comments of the yml but that didn't seem to work either.

Is there a quick fix that I'm missing for what seems to be a minor encoding problem?

[edited for clarity]

tweetybird · December 31, 2015, 1:32am

Just to clarify a little better:

with utf-16be, it detects the log files but doesn't detect any changes in them (filebeat logs show zero changes)

with utf-16be-bom, I see the following in the filebeat log:
ERR Error initializing harvester: unknown encoding('utf-16be-bom'

If I leave the encoding at plain, the lines are shipped to logstash but most of them have a tags field that says _grokparsefailure

When I open the file in notepad++, the little icon at the bottom says UCS-2 Little Endian, which is why I tried utf-16le, and except for the first character of the first line, everything seems to work.

Would be nice to get that first line figured out...
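
A minimal Python sketch of the symptoms described above, assuming the file begins with a UTF-16 little-endian BOM, as the notepad++ status suggests. The sample line is hypothetical (the thread does not show the real content), and Python's codecs stand in for filebeat's decoder purely for illustration. It shows why a single-byte reading produces NUL characters, why decoding as utf-16le leaves U+FEFF at the start of the first line, and why a pattern anchored with ^ would then miss that line.

```python
import codecs
import re

# Hypothetical first line of the log; the real content is not shown in the thread.
line = "# header line\r\n"

# Bytes as a Windows tool would typically write them: BOM, then UTF-16LE text.
raw = codecs.BOM_UTF16_LE + line.encode("utf-16-le")

# Read with a single-byte encoding: every ASCII character is followed by a NUL.
as_plain = raw.decode("latin-1")

# Read as utf-16-le: the text is correct, but the BOM survives as U+FEFF,
# because this codec treats the byte order as already known.
as_le = raw.decode("utf-16-le")

print(repr(as_plain[:8]))
print(repr(as_le[:3]))

# A pattern anchored at the start of the line does not match while U+FEFF is there.
print(re.match(r"^#", as_le) is None)

# Once the marker is stripped, the same pattern matches.
print(re.match(r"^#", as_le.lstrip("\ufeff")) is not None)
```

If this is what is happening, the first line reaches logstash with an invisible extra character in front, so a condition such as /^#/ would not fire for it even though it works for every other line.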

Joshua_Rich · December 31, 2015, 5:12am

Hmm, it's possible you are hitting this bug. Fix is merged and coming in filebeat 1.1.

ruflin · December 31, 2015, 2:25pm

In the next version of filebeat (1.1) you could do the same directly on the filebeat side with exclude_lines: https://github.com/elastic/beats/pull/430

steffens · January 4, 2016, 9:12am

The general problem with utf-16 is the byte order: big endian or little endian. That's why the BOM (byte order mark) was introduced, so that processors can detect the endianness. If no BOM is given, the default endianness is supposed to be big endian, but Microsoft decided otherwise, so with utf-16 generated on Windows systems you will mostly have to deal with utf-16le. Unfortunately the BOM is a little tricky to read if the file stays empty for some seconds after creation, but with 1.1 we introduced the encodings utf-16be-bom, utf-16le-bom and utf-16-bom.
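
As a rough illustration of BOM-based endianness detection, here is a small Python sketch. The helper name `sniff_utf16` and the sample text are made up for this example, and this is not how filebeat implements its `-bom` encodings. It only shows the idea: look at the first two bytes, and accept that nothing can be decided while the file is still empty.

```python
import codecs
from typing import Optional


def sniff_utf16(raw: bytes) -> Optional[str]:
    """Return the UTF-16 byte order implied by a leading BOM, or None if absent."""
    if raw.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if raw.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None  # no BOM yet, e.g. the file is still empty


text = "line one\n"  # hypothetical log content
le = codecs.BOM_UTF16_LE + text.encode("utf-16-le")
be = codecs.BOM_UTF16_BE + text.encode("utf-16-be")

print(sniff_utf16(le))
print(sniff_utf16(be))
print(sniff_utf16(b""))

# Python's generic "utf-16" codec does the same detection itself:
# it reads the BOM, picks the byte order, and drops the marker from the result.
print(le.decode("utf-16") == text)
print(be.decode("utf-16") == text)
```

The empty-input case is the tricky part mentioned above: a reader that opens the file before the first bytes are written has no marker to inspect yet.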

tweetybird · January 4, 2016, 3:36pm

Sounds like the best thing to do is wait for v1.1 and see if it helps. It would be easier to simply tell filebeat to ignore the first x lines, but I guess the upcoming regex solution works too.

ruflin · January 4, 2016, 3:38pm

The 1.1.0 snapshots are already available here and should be quite stable: https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/
