Replace @timestamp with actual timestamp from log file with two different formats

Hi,
I am trying to index my mail-relay log files into Elasticsearch. Each log entry is indexed into a field named message. The @timestamp field shows the time the entry was indexed, not the timestamp from the log entry itself. The structure of the data varies from line to line.

Sample lines from the source log files:

2021-04-21 15:00:03 104.47.58.138 OutboundConnectionCommand SMTPSVC1 LWMAILVU1 - 25 BDAT - 10695+LAST 0 0 4 0 1969 SMTP - - - -
10.67.200.249 - example.domain.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@domain.com> SMTP" 250 41

I have tried the dissect filter, and it works fine for line 1 above. It fails on the second line, however, because the timestamp sits in the middle of the message.

Dissect filter that works for line 1:

    input {
      beats {
        port => 5044
      }
    }

    filter {
      if "beats_input_codec_plain_applied" in [tags] {
        mutate {
          remove_tag => ["beats_input_codec_plain_applied"]
        }
      }
      if "mailrelayUAT" in [tags] {
        dissect { mapping => { "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
      }
    }

Can I still use the dissect filter for this, or do I need to switch to grok?

I have also tried grok, as shown below.

Grok filter:

    input {
      beats {
        port => 5044
      }
    }

    filter {
      grok {
        match => [ "message", "(?<sourcestamp>(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3})" ]
      }
      date {
        match => [ "sourcestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
        target => "@timestamp"
        timezone => "America/Chicago"
      }
    }

Any help would be appreciated. Thank you!

I would suggest using grok filters to determine which format you have, then using a dissect filter for each. Something like:

grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601}" }
    add_tag => [ "startsWithTimestamp" ]
}
if "startsWithTimestamp" in [tags] {
    dissect { mapping => { "message" => "..." } }
}
grok {
    match => { "message" => "^%{IPV4}" }
    add_tag => [ "startsWithIp" ]
}
if "startsWithIp" in [tags] {
    dissect { mapping => { "message" => "..." } }
}
mutate { remove_tag => [ "_grokparsefailure" ] }

The _grokparsefailure will always get added for one or the other, so it conveys zero information, and you lose nothing by removing it.
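If you would rather not add the tag in the first place, the grok filter has a tag_on_failure option. The following is only a sketch of the same two checks with that option set to an empty list. I have not run it against your pipeline, and the dissect branches stay as above. One difference from the mutate approach: it only silences these two grok filters, so a _grokparsefailure added by any other grok in your pipeline is left alone.

grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601}" }
    add_tag => [ "startsWithTimestamp" ]
    # An empty list means no failure tag is added when the pattern does not match
    tag_on_failure => []
}
grok {
    match => { "message" => "^%{IPV4}" }
    add_tag => [ "startsWithIp" ]
    tag_on_failure => []
}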

Thanks @Badger for the reply.

Log line startsWithTimestamp:

2021-04-21 15:00:03 104.47.58.138 OutboundConnectionCommand SMTPSVC1 LWMAILVU1 - 25 BDAT - 10695+LAST 0 0 4 0 1969 SMTP - - - -

Log line startsWithIp:

10.67.200.249 - example.domain.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@domain.com> SMTP" 250 41

I have adjusted my filter like below:

filter {
    if "beats_input_codec_plain_applied" in [tags] {
        mutate {
            remove_tag => ["beats_input_codec_plain_applied"]
        }
    }
    grok {
        match => { "message" => "^%{TIMESTAMP_ISO8601}" }
        add_tag => [ "startsWithTimestamp" ]
    }
    if "startsWithTimestamp" in [tags] {
        dissect { mapping => { "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
    }
    grok {
        match => { "message" => "^%{IPV4}" }
        add_tag => [ "startsWithIp" ]
    }
    if "startsWithIp" in [tags] {
        dissect { mapping => { "message" => "%{ip} %{?-} %{hostname} %{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
    }
    mutate { remove_tag => [ "_grokparsefailure" ] }
}

Does this look right?

Also, please let me know whether I can do the same thing using only a grok filter, as I don't want to disturb the existing grok logic for the other log formats.

Yes. If the only thing you want to extract is the timestamp, it would be something like:

grok {
    pattern_definitions => { "OTHER_TIMESTAMP" => "%{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{ISO8601_TIMEZONE}" }
    match => { "message" => "(^%{TIMESTAMP_ISO8601:[@metadata][timestamp]}|\[%{OTHER_TIMESTAMP:[@metadata][timestamp]})" }
}
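Whichever alternative matches, the value ends up in [@metadata][timestamp], so a single date filter can list both formats. This is a sketch rather than something I have tested against your data. The two format strings are my reading of your sample lines, and the timezone is copied from your earlier config on the assumption that lines without an offset are in that zone.

date {
    # Formats are tried in order until one parses the value
    match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss", "dd/MMM/yyyy:HH:mm:ss Z" ]
    # Assumption: only needed for values that carry no offset of their own
    timezone => "America/Chicago"
}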

10.67.200.249 - name.company.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@company.com> SMTP" 250 41

I am using the dissect pattern below for the log line above:

%{ip} %{?-} %{hostname} %{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}

I am able to print the values, but my timestamp comes out wrapped in square brackets. Is there a way to remove them?

Attached output from dissect-tester.

Use the brackets as delimiters in the dissect pattern:

%{ip} %{?-} %{hostname} [%{[@metadata][timestamp]}] %{}
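With the brackets consumed as delimiters, the captured value is 22/Mar/2021:22:00:00 -0600, which will not parse with the yyyy-MM-dd HH:mm:ss pattern in your date filter. Here is a rough sketch of the startsWithIp branch. I am assuming the format string dd/MMM/yyyy:HH:mm:ss Z fits your lines, so check it against real data before relying on it.

if "startsWithIp" in [tags] {
    # The literal [ and ] in the mapping are treated as delimiters, not captured
    dissect { mapping => { "message" => "%{ip} %{?-} %{hostname} [%{[@metadata][timestamp]}] %{}" } }
    # Assumed format for values such as 22/Mar/2021:22:00:00 -0600
    date { match => [ "[@metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ] }
}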
