Differentiate between sections within a single log file


#1

Hey,

I have a log file which looks like this. It's basically a tab-separated text file with a few lines of metadata text. From the metadata I only want to extract "TIME". For the tab-separated lines, I want to ignore the data under "DOTS HEARD FROM BUT NOT IN CONFIGURATION" and "REPEATERS". I am using the csv filter with a tab as the separator, but I am not able to distinguish between the sections.


#2

I would do something very similar to the solution I proposed for the other format you had...

    if [message] =~ /^(\s*$|:::::)/ {
            drop {}
    } else if [message] =~ /^TIME/ {
            # parse it and stash it in a ruby class variable
    } else if [message] =~ "        .*      .*      " {
            csv {separator => "     " autodetect_column_names => true }
            # and append the metadata
    } else {
            drop {}
    }
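
The "parse it and stash it" step could look roughly like the plain-Ruby sketch below, which runs outside Logstash. It assumes the metadata line has the form `KEY::value`; the sample line, the `METADATA` hash, and the `stash_metadata` helper are all hypothetical names for illustration. Inside a ruby filter the hash would be a class variable instead, and since the stashed value has to be seen by later lines, this approach presumably relies on events being processed in order (a single pipeline worker).

```ruby
# Plain-Ruby sketch of parsing a metadata line and stashing it for later rows.
# Assumption: metadata lines look like "KEY::value"; the sample below is made up.
METADATA = {}

# Stores KEY => value in `store` and returns true, or returns false when the
# line is not a metadata line (so a non-matching line cannot raise on nil).
def stash_metadata(line, store)
  m = line.strip.match(/\A([A-Za-z0-9]+)::(.*)\z/)
  return false if m.nil?
  store[m[1]] = m[2]
  true
end

stash_metadata("TIME::2019-01-01 00:00:00\n", METADATA)  # hypothetical line
```

The nil check is the part worth keeping: a line that starts with TIME but does not contain `::` would otherwise raise an exception in the filter.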

#3

Hey @Badger, I did try a similar solution before posting this question. The issue I am facing is that there will be 3 column header lines matching the condition " .* .* ". I want to drop the other two because they have no data rows under them. I want to parse only the ones which have data rows under the column headers. Hope that explains my concern.


#4

I can only suggest an ad-hoc solution like dropping lines when one of the csv fields is text rather than numeric.
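
As a rough illustration of the idea in plain Ruby (not Logstash config): treat a row as a header when a field that should be numeric does not parse as a number. The `numeric?` helper and the sample rows are hypothetical.

```ruby
# Returns true when a CSV field parses as a number, false for text such as a
# column name. Float() raises ArgumentError for text and TypeError for nil.
def numeric?(value)
  Float(value)
  true
rescue ArgumentError, TypeError
  false
end

rows = [
  ["1", "12", "345"],         # hypothetical data row
  ["rpId", "dotId", "%LOSS"]  # hypothetical header row
]

# Keep only rows whose third field is numeric, i.e. drop the header rows.
data_rows = rows.select { |fields| numeric?(fields[2]) }
```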


#5

Any idea how to do that in Logstash? I am not able to drop the lines within the csv filter.


#6

Let them go through the csv filter, and then drop them. Something like

if [count] =~ /[a-z]/ { drop {} }

should discard the lines starting with rpId and dotId


#7

Does this look correct? It isn't dropping the lines starting with rpId and dotId.

filter {
    if [message] =~ /^\s*$/ {
        drop {}
    } else if [message] =~ /^TIME/ {
        mutate { strip => [ "message" ] }
        ruby {
            init => '
                @@metadata = {}
            '
            code => '
                msg = event.get("message")
                matches = msg.scan(/^([A-Za-z0-9]+)::(.*)/)
                m = matches[0]
                @@metadata[m[0]] = m[1]
            '
        }
        drop {}
    } else if [message] =~ "	.*	.*	" {
        csv {
            separator => "	"
            columns => ["laneID","dotID","count","occ","reboots","batLvl","stuckHi","dwnTime","blips","channel","mode","t_slot","rssiAvg","rssiStd","lqiAvg","lqiStd","latAvg","latMed","spdAvg","spdMed","sdifAvg","sdif95","confidence","discover"]
        }
        if [count] =~ /[a-z]/ { drop {} }
        ruby {
            code => '
                event.set("metadata", @@metadata)
            '
        }
    }
}

Am I missing something here?


#8

%LOSS does not match /[a-z]/ because it is all upper case. Use

if [count] =~ /[a-zA-Z]/ { drop {} }

plus you will need a final branch to process lines that do not contain tabs

} else { drop {} }
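
To see the difference between the two patterns, here is a quick check in plain Ruby. "%LOSS" is the value from the log; "42" is a made-up numeric count, and `drop?` is a hypothetical helper that mirrors the conditional.

```ruby
LOWER_ONLY = /[a-z]/     # pattern used in the original conditional
ANY_LETTER = /[a-zA-Z]/  # corrected pattern

# True when the field contains a character matched by the pattern,
# which is the case in which the conditional would drop the event.
def drop?(count, pattern)
  !!(count =~ pattern)
end

drop?("%LOSS", LOWER_ONLY)  # false: no lower-case letter, header survives
drop?("%LOSS", ANY_LETTER)  # true: header is dropped
drop?("42", ANY_LETTER)     # false: data row is kept
```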

