Differentiate between sections within a single log file

Abhilash_B · February 14, 2019, 7:30pm

Hey,

I have a log file which looks like this. It's basically a tab-separated text file with a few lines of metadata text. I want to extract only "TIME" as a metadata field. For the other tab-separated lines, I want to ignore the data under "DOTS HEARD FROM BUT NOT IN CONFIGURATION" and "REPEATERS". I am using the csv filter with a tab as the separator, but I am not able to distinguish between the sections.

Badger · February 14, 2019, 8:21pm

I would do something very similar to the solution I proposed for the other format you had...

    if [message] =~ /^(\s*$|:::::)/ {
            drop {}
    } else if [message] =~ /^TIME/ {
            # parse it and stash it in a ruby class variable
    } else if [message] =~ "        .*      .*      " {
            csv {separator => "     " autodetect_column_names => true }
            # and append the metadata
    } else {
            drop {}
    }
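
The branching above can be sketched in plain Ruby to see which lines would reach which branch. This is only an illustration outside Logstash: `classify` and the sample lines are hypothetical, and the tab pattern assumes the runs of spaces in the quoted condition were originally literal tab characters.

```ruby
# Hypothetical helper mirroring the if / else if chain above.
# Returns a symbol naming the branch a line would take.
def classify(line)
  if line =~ /^(\s*$|:::::)/
    :drop       # blank lines and ":::::" separator lines
  elsif line =~ /^TIME/
    :metadata   # would be parsed and stashed for later rows
  elsif line =~ /\t.*\t.*\t/
    :csv        # at least three tabs: treat as a tab-separated row
  else
    :drop       # anything else, e.g. a section title without tabs
  end
end

# Made-up sample lines; the real log format is not shown in this thread.
samples = ["", ":::::::::", "TIME::12:00", "1\t2\t3\t4", "REPEATERS"]
samples.each { |line| puts "#{line.inspect} -> #{classify(line)}" }
```

Note that the order of the branches matters: a line made only of tabs is whitespace, so it is caught by the first branch before the tab test is reached.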

Abhilash_B · February 15, 2019, 6:47am

Hey @Badger, I did try a similar solution before posting this question. The issue I am facing is, there will be 3 column header lines matching the condition " .* .* ". I want to drop the other two because they have no data rows under them. I want to parse only the ones which have data rows under the column headers. Hope you got my concern.

Badger · February 15, 2019, 2:38pm

I can only suggest an ad-hoc solution like dropping lines when one of the csv fields is text rather than numeric.
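
One way to express "text rather than numeric" is sketched below in plain Ruby. `numeric?` is a hypothetical helper and the sample values are made up; the sketch assumes data rows hold plain integers or decimals in the checked column, and it needs Ruby 2.6 or later for the `exception:` keyword.

```ruby
# Hypothetical helper: true when the value parses as a number.
# Float(..., exception: false) returns nil instead of raising on bad input.
def numeric?(value)
  !Float(value.to_s.strip, exception: false).nil?
end

puts numeric?("42")     # a data row value
puts numeric?("3.5")    # a decimal data row value
puts numeric?("count")  # a repeated column header
puts numeric?("")       # an empty field
```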

Abhilash_B · February 15, 2019, 6:45pm

Any idea how to do the same in Logstash? I am not able to drop the lines within the csv filter.

Badger · February 15, 2019, 7:40pm

Let them go through the csv filter, and then drop them. Something like

if [count] =~ /[a-z]/ { drop {} }

should discard the lines starting with rpId and dotId

Abhilash_B · February 16, 2019, 7:26pm

Does this look correct? It isn't dropping the lines starting with rpId and dotId.

filter{
    if [message] =~ /^\s*$/ {
        drop {}
    } else if [message] =~ /^TIME/ {
        mutate { strip => [ "message" ] }
        ruby {
            init => '
                @@metadata = {}
            '
            code => '
                msg = event.get("message")
                matches = msg.scan(/^([A-Za-z0-9]+)::(.*)/)
                m = matches[0]
                @@metadata[m[0]] = m[1]
            '
        }
        drop {}
    } else if [message] =~ "	.*	.*	" {
        csv {
            separator => "	"
            columns => ["laneID","dotID","count","occ","reboots","batLvl","stuckHi","dwnTime","blips","channel","mode","t_slot","rssiAvg","rssiStd","lqiAvg","lqiStd","latAvg","latMed","spdAvg","spdMed","sdifAvg","sdif95","confidence","discover"]
        }
        if [count] =~ /[a-z]/ { drop {} }
        ruby {
            code => '
                event.set("metadata", @@metadata)
            '
        }
    }
}

Am I missing something here?

Badger · February 16, 2019, 7:34pm

%LOSS does not match /[a-z]/, because the pattern is case-sensitive and %LOSS contains only uppercase letters. Use

if [count] =~ /[a-zA-Z]/ { drop {} }

plus you will need a final branch to process lines that do not contain tabs

} else { drop {} }
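
The case issue is easy to reproduce in plain Ruby, which should behave the same way as the Logstash conditional for this simple character class. The values `%LOSS` and `dotId` come from this thread; `header_row?` and the numeric sample are hypothetical.

```ruby
lower_only = /[a-z]/     # the original pattern: lowercase letters only
any_letter = /[a-zA-Z]/  # the corrected pattern: letters of either case

# Hypothetical helper: a csv field that contains any letter is header text.
def header_row?(field)
  field.match?(/[a-zA-Z]/)
end

puts "%LOSS".match?(lower_only)  # no lowercase letter, so this header slips through
puts "%LOSS".match?(any_letter)  # matched by the corrected pattern
puts header_row?("dotId")        # a header value
puts header_row?("17")           # a made-up numeric data value
```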

