Replace @timestamp with actual timestamp from log file with two different formats

kommineni24 · April 26, 2021, 2:02pm

Hi,
I am trying to index my mail-relay log files into Elasticsearch. Every log entry is indexed into a field named message. The @timestamp field shows the time the entry was indexed, not the timestamp from the log entry. The structure of the data varies from line to line.

Below are sample lines from the source log file:

2021-04-21 15:00:03 104.47.58.138 OutboundConnectionCommand SMTPSVC1 LWMAILVU1 - 25 BDAT - 10695+LAST 0 0 4 0 1969 SMTP - - - -
10.67.200.249 - example.domain.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@domain.com> SMTP" 250 41

I have tried the dissect filter, and it works for line 1 above. However, it fails to parse the second line because the timestamp is in the middle of the message.

Dissect filter working for line 1:

input {
    beats {
        port => 5044
    }
}

filter {
    if "beats_input_codec_plain_applied" in [tags] {
        mutate {
            remove_tag => ["beats_input_codec_plain_applied"]
        }
    }
    if "mailrelayUAT" in [tags] {
        dissect { mapping => { "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
    }
}

Can I still use the dissect filter to achieve this, or do I need to look into the grok filter?

I have also tried grok, as below.

Grok filter:

input {
    beats {
        port => 5044
    }
}

filter {
    grok {
        match => [ "message", "(?<sourcestamp>(\d){4}-(\d){2}-(\d){2} (\d){2}:(\d){2}:(\d){2},(\d){3})" ]
    }
    date {
        match => [ "sourcestamp" , "yyyy-MM-dd HH:mm:ss,SSS" ]
        target => "@timestamp"
        timezone => "America/Chicago"
    }
}

Any help would be appreciated. Thank you!

Badger · April 26, 2021, 4:40pm

I would suggest using grok filters to determine which format you have, then use a dissect filter for each. Something like

grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601}" }
    add_tag => [ "startsWithTimestamp" ]
}
if "startsWithTimestamp" in [tags] {
    dissect { mapping => { "message" => "..." } }
}
grok {
    match => { "message" => "^%{IPV4}" }
    add_tag => [ "startsWithIp" ]
}
if "startsWithIp" in [tags] {
    dissect { mapping => { "message" => "..." } }
}
mutate { remove_tag => [ "_grokparsefailure" ] }

The _grokparsefailure will always get added for one or the other, so it conveys zero information, and you lose nothing by removing it.
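
As a variation on the filters above, grok's tag_on_failure option can be set to an empty array so that the failure tag is never added in the first place, which would make the final mutate unnecessary. A minimal sketch, assuming the same two tag names and leaving the dissect blocks out:

    grok {
        match => { "message" => "^%{TIMESTAMP_ISO8601}" }
        add_tag => [ "startsWithTimestamp" ]
        # An empty list suppresses the default _grokparsefailure tag
        tag_on_failure => []
    }
    grok {
        match => { "message" => "^%{IPV4}" }
        add_tag => [ "startsWithIp" ]
        tag_on_failure => []
    }

The trade-off is that a line matching neither pattern then carries no tag at all, so it may be worth checking separately for events that have neither startsWithTimestamp nor startsWithIp.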

kommineni24 · April 26, 2021, 5:25pm

Thanks @Badger for the reply.

Log line startsWithTimestamp:

2021-04-21 15:00:03 104.47.58.138 OutboundConnectionCommand SMTPSVC1 LWMAILVU1 - 25 BDAT - 10695+LAST 0 0 4 0 1969 SMTP - - - -

Log line startsWithIp:

10.67.200.249 - example.domain.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@domain.com> SMTP" 250 41

I have adjusted my filter like below:

filter {
    if "beats_input_codec_plain_applied" in [tags] {
        mutate {
            remove_tag => ["beats_input_codec_plain_applied"]
        }
    }
    grok {
        match => { "message" => "^%{TIMESTAMP_ISO8601}" }
        add_tag => [ "startsWithTimestamp" ]
    }
    if "startsWithTimestamp" in [tags] {
        dissect { mapping => { "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
    }
    grok {
        match => { "message" => "^%{IPV4}" }
        add_tag => [ "startsWithIp" ]
    }
    if "startsWithIp" in [tags] {
        dissect { mapping => { "message" => "%{ip} %{?-} %{hostname} %{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}" } }
        date { match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss" ] }
    }
    mutate { remove_tag => [ "_grokparsefailure" ] }
}

Does this look good?

Also, please let me know if I can do the same using only a grok filter, as I don't want to disturb the existing grok logic for other log formats.

Badger · April 26, 2021, 5:42pm

Yes. If the only thing you want to extract is the timestamp it would be something like

grok {
    pattern_definitions => { "OTHER_TIMESTAMP" => "%{MONTHDAY}/%{MONTH}/%{YEAR}:%{TIME} %{ISO8601_TIMEZONE}" }
    match => { "message" => "(^%{TIMESTAMP_ISO8601:[@metadata][timestamp]}|\[%{OTHER_TIMESTAMP:[@metadata][timestamp]})" }
}
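
Once grok has captured the timestamp, a date filter still has to parse it into @timestamp. The date filter accepts several formats for one field and tries them in order, so a sketch along these lines could cover both log formats. The two format strings are inferred from the sample lines in the original post, and the timezone value is an assumption carried over from the grok attempt there; it is meant for timestamps that carry no offset of their own.

    # First format:  2021-04-21 15:00:03
    # Second format: 22/Mar/2021:22:00:00 -0600
    date {
        match => [ "[@metadata][timestamp]", "yyyy-MM-dd HH:mm:ss", "dd/MMM/yyyy:HH:mm:ss Z" ]
        # Assumption: the servers log local time in this zone
        timezone => "America/Chicago"
    }

Because the field lives under [@metadata], it is not sent to Elasticsearch, so no cleanup step is needed after the date filter runs.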

kommineni24 · April 26, 2021, 8:19pm

10.67.200.249 - name.company.com [22/Mar/2021:22:00:00 -0600] "RCPT -? TO:<carlo.jimenez@company.com> SMTP" 250 41

I am using the below dissect pattern for logline above

%{ip} %{?-} %{hostname} %{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{}

I am able to print the values, but my timestamp comes out wrapped in brackets. Is there a way to remove the brackets?

Attached output from dissect-tester.

Badger · April 26, 2021, 8:25pm

Use them as delimiters for dissect

%{ip} %{?-} %{hostname} [%{[@metadata][timestamp]}] %{}
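
With the brackets consumed as delimiters, the captured value for the sample line would be 22/Mar/2021:22:00:00 -0600, which the yyyy-MM-dd HH:mm:ss pattern used earlier in the thread would not parse. A sketch of the startsWithIp branch with a date pattern inferred from that sample line (the pattern is an assumption, not something confirmed in the thread):

    if "startsWithIp" in [tags] {
        dissect { mapping => { "message" => "%{ip} %{?-} %{hostname} [%{[@metadata][timestamp]}] %{}" } }
        # dd/MMM/yyyy:HH:mm:ss Z is inferred from 22/Mar/2021:22:00:00 -0600
        date { match => [ "[@metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss Z" ] }
    }

If the date filter cannot parse the value, it tags the event with _dateparsefailure by default, which is a quick way to spot a pattern mismatch.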

system · May 24, 2021, 8:26pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
