Match complete line after some regex pattern

I have lines in a document like this:

*** Begin time:
Sat Jun 26 21:11:14 AEST 2019

I want to assign a new field begin_time = Sat Jun 26 21:11:14 AEST 2019

This is what I tried:

if x =~ /(\bBegin time:\s+(.*))/
                    begin_time = x

This returns an empty string.

if x =~ /(\b?.:?Begin time:\s+(\S+.*))/
                    begin_time = x

This also returns an empty string. Trying it out in Rubular, it works (matches the next line's content): \bBegin time:\s+(.*)
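Since the timestamp sits on the line after the marker, matching the marker line itself can never capture it. One way to grab the following line while iterating is Ruby's each_cons(2); a minimal, self-contained sketch (the sample text is taken from the post above):

```ruby
# The date is on the line AFTER "*** Begin time:", so matching the
# marker line itself captures nothing useful.
text = <<~LOG
  *** Begin time:
  Sat Jun 26 21:11:14 AEST 2019
LOG

begin_time = ""
text.lines(chomp: true).each_cons(2) do |line, next_line|
  # When the current line is the marker, the value is the next line.
  begin_time = next_line if line =~ /\bBegin time:/
end

puts begin_time  # => Sat Jun 26 21:11:14 AEST 2019
```

Each iteration sees a line together with the line after it, so the value can be read from next_line whenever the marker matches.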

Hi,

You can use grok for that:

grok {
  match => {
    "x" => "Begin time:\s+%{GREEDYDATA:begin_time}$"
  }
}

But you can't create and set an event field the way you are trying to: a plain assignment only sets a local variable. You need event.set.

Cad.
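To illustrate the point about assignment vs. event.set: inside a ruby filter, a plain variable assignment is local to the script and never lands on the event; only event.set creates the field. A minimal sketch, using a hash-backed FakeEvent stand-in (hypothetical; the real event object is supplied by Logstash at runtime):

```ruby
# Stand-in for the Logstash event API: get/set over a plain hash.
# (The real event object is provided to the ruby filter at runtime.)
class FakeEvent
  def initialize(data)
    @data = data
  end

  def get(field)
    @data[field]
  end

  def set(field, value)
    @data[field] = value
  end
end

event = FakeEvent.new("message" => "*** Begin time:\nSat Jun 26 21:11:14 AEST 2019")

line = event.get("message").lines(chomp: true).last
begin_time = line             # local variable: invisible outside the script
event.set("begin_time", line) # this is what actually creates the field

puts event.get("begin_time")  # => Sat Jun 26 21:11:14 AEST 2019
```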

Unfortunately, this block does not work (I keep getting errors). This is the "expanded" code:

filter {
        ruby {
          code => '
            lines = event.get("message").lines(chomp: true)
            begin_time = ""
            end_time = ""
            lines.each { |x|
                if x =~ /(Begin time)/
                    begin_time = x
                elsif x =~ /(End time)/
                    end_time = x
...

In the same file (log) I have begin and end time timestamps, and also a couple of other variables. All of them work fine when the value is on the same line, but I cannot manage to map a value from the next line, as in the example.

[2021-09-17T14:34:16,392][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of #, => at line 118, column 8 (byte 4435) after filter {\n\truby {\n..

This is a pipeline configuration error: something is missing, maybe a curly bracket or a double quote was not closed.

It says where the error is: :message=>"Expected one of #, => at line 118, column 8".

I know, but it only appears when I add the grok to the filter; otherwise it does not.

This is the code; I don't really see anything wrong with it:

input {
    file {
        path => "/etc/logstash/files/*"
        codec => multiline {
            pattern => "^$"
            negate => true
            what => "next"
            max_lines => 14000
            auto_flush_interval => 5
        }
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}

filter {
    ruby {
        code => '
            lines = event.get("message").lines(chomp: true)
            begin_time = ""
            end_time = ""
            logData = ""
            user = ""
            lines.each { |x|
                if x =~ /(Begin time)/
                    begin_time = x
                elsif x =~ /(End time)/
                    end_time = x
                    #end_time = end_time * ""
                elsif x =~ /^(USER)/
                    user = x.scan(/(?:.?USER=)(.*)/)[0]
                    user = user * ""
                else
                    unless x =~ /^(\s|sending|total|rsl|sent|\[)/
                        logData += x + ","
                    end
                end
            }
            event.set("begin_time", begin_time)
            event.set("end_time", end_time)
            event.set("logData", logData)
            event.set("user", user)
        '
    }

    # I PUT GROK HERE AND GET AN ERROR
    # grok {
    #   match {
    #     "x" => "Begin time:\s+%{GREEDYDATA:begin_time}$"
    #   }
    # }

    mutate {
        remove_field => ["tags", "message", "@version", "@timestamp", "host", "path"]
    }
}

output {
    elasticsearch {
        hosts => ["XXXX"]
        index => "log_logs"
    }
    stdout {
        codec => rubydebug
    }
}

I managed to do something and grok is now working, but since my file has multiple lines after "Begin time", it saves all the rest of the document under the begin_time field. How can I make it store only that one (next) line where the date is?

Thank you, now it looks like:

"begin_time" : """
Sat Jun 26 16:56:16 AEST 2021
rm: cannot remove '/scrXXXX/003': No such file or directory
IDS=4947802324992
ENV=BATCH
LD_LIBRARY_PATH=/apps/ncl/6.6.2/l
...

That's not a line, it is two lines. You use a multiline codec to combine multiple lines into a single [message] field, but then you are using

lines = event.get("message").lines(chomp: true)

in order to process the lines one at a time. You could try something like

message = event.get("message")
mdata = message.match(/Begin Time:[^\n]*\n([^\n]*)\n/)
if mdata
    event.set("begin_time", mdata[0])
end

I'm using chomp: true because I'm extracting 30 more fields which are on the same line (field: value, e.g.). Only begin_time and end_time have their value on the next line.

I added a new message variable and tried out this match, and the result is:

"start_time" : """
Starting time:
Sat Jun 26 16:56:16 AEST 2021
"""

Looks like it should be event.set("begin_time", mdata[1])
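Right, mdata[0] is the entire matched text (the marker line plus the captured line, newlines included), while mdata[1] is the first capture group. A quick stand-alone illustration with the sample from above:

```ruby
# MatchData indexing: [0] is the whole match, [1] is the first capture group.
message = "Starting time:\nSat Jun 26 16:56:16 AEST 2021\nmore lines\n"

mdata = message.match(/Starting time:[^\n]*\n([^\n]*)\n/)

puts mdata[0]  # whole match, including the marker line and newlines
puts mdata[1]  # => Sat Jun 26 16:56:16 AEST 2021
```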

Thanks

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.