Generate @timestamp in-logstash-by-concatenating-date-from-filename-and-time-from-logs


(Shaminder) #1

Hi,

I have log file name 'log.20170410.123146.txt' and time in log entry. I want to merge date from filename and time from logs to generate @timestamp in logstash.

     grok {
             match => ["message", "^(?<timestamp_tmp>\d{4}\d{2}\d{2} \d{2}\:\d{2}\:\d{2})"]
             match => ["path", "^\/var\/logs\/sitecore\/log\.(?<date_tmp>\d{4}\d{2}\d{2})"]
      }
 mutate {
            add_field => {"timestamp" => "%{date_tmp} %{timestamp_tmp}"
        }
    grok {
            match => ["filename", "(?[%{YEAR}%{MONTHNUM}%{MONTHDAY}])"]
            }
 date {
            match => ["temptimestamp" , "[yyyyMMdd][HH:mm:ss]"]
            target => "filetimestamp"
      }
 mutate {
           remove_field => ["timestamp_tmp", "date_tmp"]
       }

Still not getting the desired results. @timestamp coming after parsing:
@timestamp: %{YEAR}%{MONTHNUM}%{MONTHDAY}


(Magnus Bäck) #2

You have several issues here.

  • I suspect you'll have to split your first grok filter into two filters to make sure both are processed. I don't know exactly how it behaves when you have two match options. Verify that you get both the timestamp_tmp and date_tmp fields.
  • What's the point of your second grok filter? You've already captured the date and time. Also, it's not even capturing anything. Remove.
  • Your date filter parses the temptimestamp field but you have no such field. You should be parsing timestamp instead.
  • Your date filter's pattern doesn't match what's in timestamp.
  • If you want to update the @timestamp field you need to drop target => "filetimestamp in your date filter.

(Shaminder) #3

Hi Magnus,

Thanks for your reply. I have modified the configuration as per your suggestions. The updated Configuration is:

filter {

 if [type] == "sitecore_log" {
        grok {
         patterns_dir => ["/etc/logstash/conf.d/patterns"]
         match => [ "message","%{INT:pid} %{TIME:time} %{LOGLEVEL:loglevel} %{GREEDYDATA:sitecore_log}" ]
             }
        grok {
        match => ["path", "(%{YEAR:year}%{MONTHNUM:month}%{MONTHDAY:day}\.%{HOUR:hour}%{MINUTE:minute}%{SECOND:second}\.txt$)"]
        add_field => ["timestamp", "%{year}/%{month}%/{day} %{TIME}"]
         }
        mutate {
        remove_field => ["year", "month", "day", "hour", "minute", "second"]
                }
        date {
        match => ["timestamp", "yyyy/MM/dd HH:mm:ss"]
        target => "@timestamp"
        }
}
}

Log has been parsed with tag:
"tags": [
"beats_input_codec_plain_applied",
"_grokparsefailure"

And its taking the @timestamp of logstash instead of taking it from filename.
"@timestamp": "2017-04-20T08:10:14.973Z",


(Magnus Bäck) #4

One of your grok filters are broken. Please show an example event, preferably by copying/pasting from the JSON tab in Kibana.


(Shaminder) #5

Hi Magnus,

Here is the event:

{
  "_index": "filebeat-2017.04.20",
  "_type": "sitecore_log",
  "_id": "AVuKyjufwNDRIOvARZcT",
  "_score": null,
  "_source": {
    "message": "6508 12:31:48 INFO  Heartbeat - Worker thread started",
    "@version": "1",
    "@timestamp": "2017-04-20T09:54:51.536Z",
    "source": "/app/logs/sitecore/log.20170410.123146.txt",
    "offset": 54,
    "type": "sitecore_log",
    "input_type": "log",
    "beat": {
      "name": "delvmplssmdo3.sapient.com",
      "hostname": "delvmplssmdo3.sapient.com",
      "version": "5.3.0"
    },
    "host": "delvmplssmdo3.sapient.com",
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure"
    ],
    "pid": "6508",
    "time": "12:31:48",
    "loglevel": "INFO",
    "sitecore_log": " Heartbeat - Worker thread started"
  },
  "fields": {
    "@timestamp": [
      1492682091536
    ]
  },
  "sort": [
    1492682091536
  ]
}

(Magnus Bäck) #6

Your second grok filter parses the path field but there is no such field. You should parse the source field.


(Shaminder) #7

Hi Magnus,

Thanks for pointing this out. I am learning grok now. Using source field i am able to get the date.

But now i am having issue while parsing the log.
Here is the log:
Failed action. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2017.04.10", :_type=>"sitecore_log", :_routing=>nil}, #<LogStash::Event:0x6f8ede2c @metadata_accessors=#<LogStash::Util::Accessors:0x68a1217 @store={"type"=>"sitecore_log", "beat"=>"filebeat"}, @lut={"[type]"=>[{"type"=>"sitecore_log", "beat"=>"filebeat"}, "type"], "[beat]"=>[{"type"=>"sitecore_log", "beat"=>"filebeat"}, "beat"]}>, @cancelled=false, @data={"message"=>"6508 12:31:48 INFO Heartbeat - Worker thread started", "@version"=>"1", "@timestamp"=>"2017-04-10T07:01:48.000Z", "beat"=>{"version"=>"5.3.0", "name"=>"delvmplssmdo3.sapient.com", "hostname"=>"delvmplssmdo3.sapient.com"}, "source"=>"/app/logs/sitecore/log.20170410.123146.txt", "offset"=>54, "type"=>"sitecore_log", "input_type"=>"log", "host"=>"delvmplssmdo3.sapient.com", "tags"=>["beats_input_codec_plain_applied"], "pid"=>"6508", "time"=>"12:31:48", "loglevel"=>"INFO", "sitecore_log"=>" Heartbeat - Worker thread started", "timestamp"=>"2017/04/10 12:31:48"}, @metadata={"type"=>"sitecore_log", "beat"=>"filebeat"}, @accessors=#<LogStash::Util::Accessors:0x6a81d87e @store={"message"=>"6508 12:31:48 INFO Heartbeat - Worker thread started", "@version"=>"1", "@timestamp"=>"2017-04-10T07:01:48.000Z", "beat"=>{"version"=>"5.3.0", "name"=>"delvmplssmdo3.sapient.com", "hostname"=>"delvmplssmdo3.sapient.com"}, "source"=>"/app/logs/sitecore/log.20170410.123146.txt", "offset"=>54, "type"=>"sitecore_log", "input_type"=>"log", "host"=>"delvmplssmdo3.sapient.com", "tags"=>["beats_input_codec_plain_applied"], "pid"=>"6508", "time"=>"12:31:48", "loglevel"=>"INFO", "sitecore_log"=>" Heartbeat - Worker thread started", "timestamp"=>"2017/04/10 12:31:48"}, @lut={"@timestamp"=>[{"message"=>"6508 12:31:48 INFO Heartbeat - Worker thread started", "@version"=>"1", "@timestamp"=>"2017-04-10T07:01:48.000Z", "beat"=>{"version"=>"5.3.0", "name"=>"delvmplssmdo3.sapient.com", "hostname"=>"delvmplssmdo3.sapient.com"}, "source"=>"/app/logs/sitecore/log.20170410.123146.txt", "offset"=>54, "type"=>"sitecore_log", "input_type"=>"log", "host"=>"delvmplssmdo3.sapient.com", "tags"=>["beats_input_codec_plain_applied"], "pid"=>"6508", "time"=>"12:31:48", "loglevel"=>"INFO", "sitecore_log"=>" Heartbeat - Worker thread started", "timestamp"=>"2017/04/10 12:31:48"}, "@timestamp"]}>>], :response=>{"index"=>{"_index"=>"filebeat-2017.04.10", "_type"=>"sitecore_log", "_id"=>"AVuK8QzJwNDRIOvARbFQ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [timestamp]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2017/04/10 12:31:48\" is malformed at \"/04/10 12:31:48\""}}}}, :level=>:warn, :file=>"logstash/outputs/elasticsearch/common.rb", :line=>"119", :method=>"submit"}


(Shaminder) #8

Hi Magnus,

Thanks for all your help.

I am able to parse the date and concatenate timestamp from logs now.

Cheers :slight_smile:


(Shaminder) #9

Hi Magnus,

I am facing issue when parsing multiple files to fetch timestamp from filename.

My filename is in this format:
MyCustomLogFileAppender_07.05.2017.log
MyCustomLogFileAppender_08.05.2017.log
MyCustomLogFileAppender_08.05.2017.log.1

Filebeat config:

-
  paths:
    - /app/logs/sitecore/MyCustomLogFileAppender_*
  input_type: log
  document_type: sitecore_log
  multiline:
  pattern: '^[[:space:]]+|^[[:space:]]+at+|^Exception:|^Source:|^Nested+|^Message:|^D:'
  negate: false
  match: after

Below is my logstash config:

if [type] == "sitecore_log" {
multiline {
patterns_dir => "/etc/logstash/conf.d/patterns"
pattern => "^%{INT:pid} %{TIME:time}"
negate => true
what => previous
max_age => 10
}
grok {
patterns_dir => "/etc/logstash/conf.d/patterns"
match => [ "message", "(%{INT:pid}|%{WORD:heartbeat}) %{TIME:time} %{LOGLEVEL:loglevel} (%{GREEDY_DATA:sitecore_log}|%{GREEDY_DATA:rewrite_log})" ]
}
grok {
match => ["source", "(_%{MONTHDAY:day}.%{MONTHNUM:month}.%{YEAR:year})"]
add_field => ["timestamp", "%{day}-%{month}-%{year} %{time}"]
}
mutate {
remove_field => ["day", "month", "year"]
}
date {
match => ["timestamp", "dd-MM-yyy HH:mm:ss.SSSZ", "dd-MM-yyyy HH:mm:ss"]
target => "@timestamp"
}

I am getting multiple timestamp in kibana:

{
"_index": "filebeat-2017.05.11",
"_type": "sitecore_log",
"_id": "AVv3hNwFqdh28cPIV0fK",
"_score": null,
"_source": {
"message": "3512 00:04:24 DEBUG [BucketLinkManager] - Method started :5/9/2017 12:04:24 AM\n 924 09:46:09 DEBUG BucketItemResolver Start",
"@version": "1",
"@timestamp": "2017-05-11T12:38:21.131Z",
"offset": [
2645811,
2256146
],
"type": "sitecore_log",
"input_type": "log",
"beat": {
"name": "delvmplssmdo3.sapient.com",
"hostname": "delvmplssmdo3.sapient.com",
"version": "5.3.0"
},
"source": [
"/app/logs/sitecore/MyCustomLogFileAppender_09.05.2017.log.3",
"/app/logs/sitecore/MyCustomLogFileAppender_08.05.2017.log.4"
],
"host": "delvmplssmdo3.sapient.com",
"tags": [
"beats_input_codec_plain_applied",
"multiline",
"_dateparsefailure"
],
"pid": "3512",
"time": "00:04:24",
"loglevel": "DEBUG",
"sitecore_log": "[BucketLinkManager] - Method started :5/9/2017 12:04:24 AM\n 924 09:46:09 DEBUG BucketItemResolver Start",
"timestamp": "09,08-05,05-2017,2017 00:04:24"
},
"fields": {
"@timestamp": [
1494506301131
]
},
"highlight": {
"type.raw": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
],
"type": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
]
},
"sort": [
1494506301131
]
}

Can you please suggest something to resolve this. I am using multiline filter to parse the multiline stacktrace.


(Magnus Bäck) #10

Why are you using multiline both in Filebeat and Logstash? What do you mean with "multiple timestamps"?


(Shaminder) #11

Hi Magnus,

My log event does not contains datestamp, datestamp is being generated in the filename.(MyCustomLogFileAppender_07.05.2017.log). My log event looks like this:

3512 00:04:24 DEBUG [BucketLinkManager] - Method started :5/9/2017 12:04:24 AM

So i want the datestamp to be taken from filename and time from event log to generate @timestamp.

Its working for a single log file, but when i am trying to parse multiple log files the timestamp being generated like:

"timestamp": "09,08-05,05-2017,2017 00:04:24"

I am using multiline in filebeat and logstash to parse the multiline stack trace. Please suggest if that could be done by using multiline in single logstash configuration.


(Magnus Bäck) #12

I am using multiline in filebeat and logstash to parse the multiline stack trace. Please suggest if that could be done by using multiline in single logstash configuration.

Do it on the Filebeat side. I don't understand why you'd need to have multiline processing in two places.


(Shaminder) #13

Hi Magnus,

Greetings for the Day!!
As suggested, i have removed the multiline config from logstash and placed it on filebeat.:

  paths:
    - /app/logs/sitecore/MyCustomLogFileAppender_*
  input_type: log
  document_type: sitecore_log
  multiline:
  pattern: '^[[:digit:]]{4}[[:space:]][[:digit:]]{2}\:[[:digit:]]{2}\:[[:digit:]]{2}'
  negate: true
  match: previous

Logstash config:

if [type] == "sitecore_log" {
grok {
patterns_dir => "/etc/logstash/conf.d/patterns"
match => [ "message", "(%{INT:pid}|%{WORD:heartbeat}) %{TIME:time} %{LOGLEVEL:loglevel} (%{GREEDY_DATA:sitecore_log}|%{GREEDY_DATA:rewrite_log})" ]
}
grok {
match => ["source", "(_%{MONTHDAY:day}.%{MONTHNUM:month}.%{YEAR:year})"]
add_field => ["timestamp", "%{day}-%{month}-%{year} %{time}"]
}
mutate {
remove_field => ["day", "month", "year"]
}
date {
match => ["timestamp", "dd-MM-yyy HH:mm:ss.SSSZ", "dd-MM-yyyy HH:mm:ss"]
target => "@timestamp"
}
}

Log Event:

3512 05:11:00 ERROR Application Error
Exception: SolrNet.Exceptions.SolrConnectionException
Message: <?xml version="1.0" encoding="UTF-8"?>

Source: SolrNet
at SolrNet.Impl.SolrConnection.Get(String relativeUrl, IEnumerable1 parameters) at SolrNet.Impl.SolrQueryExecuter1.Execute(ISolrQuery q, QueryOptions options)
at EuropeanTour.Repository.Implementation.SolrRepository1.GetArticleFromItemId(String articleItemId, Int32 start, Int32 numberOfArticle, Int32& numFound) at EuropeanTour.Repository.Implementation.Global.ArticlesRepository.GetMediaList(String datasource, String variation, Int32 start, Int32 recordsToFetch, Boolean isAjax, Nullable1 advertisementIndex, String currentItemID)
at EuropeanTour.Controllers.Europeantour.ArticlesController.GetMediaList()

Nested Exception

Exception: System.Net.WebException
Message: The remote server returned an error: (400) Bad Request.
Source: System
at System.Net.HttpWebRequest.GetResponse()
at HttpWebAdapters.Adapters.HttpWebRequestAdapter.GetResponse()
at SolrNet.Impl.SolrConnection.GetResponse(IHttpWebRequest request)
at SolrNet.Impl.SolrConnection.Get(String relativeUrl, IEnumerable`1 parameters)

Log Event:

{
"_index": "filebeat-2017.05.12",
"_type": "sitecore_log",
"_id": "AVv9sgd4qdh28cPIquG3",
"_score": null,
"_source": {
"message": " at EuropeanTour.Repository.Implementation.TournamentRepository.GetTournamentStatus(Int32 tournamentId)",
"@version": "1",
"@timestamp": "2017-05-12T17:25:31.376Z",
"source": "/app/logs/sitecore/MyCustomLogFileAppender_08.05.2017.log.2",
"offset": 61500,
"type": "sitecore_log",
"input_type": "log",
"beat": {
"name": "delvmplssmdo3.sapient.com",
"hostname": "delvmplssmdo3.sapient.com",
"version": "5.3.0"
},
"host": "delvmplssmdo3.sapient.com",
"tags": [
"beats_input_codec_plain_applied",
"_grokparsefailure",
"_dateparsefailure"
],
"timestamp": "08-05-2017 %{time}"
},
"fields": {
"@timestamp": [
1494609931376
]
},
"highlight": {
"type.raw": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
],
"type": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
]
},
"sort": [
1494609931376
]
}

I am trying to achieve two things here:

  1. Need to generate @timestamp by concatenating date from filename and time from log event.
  2. Parse multiline stacktrace in logstash.

(Magnus Bäck) #14

Your multiline configuration isn't working so let's start there. As documented, the match option should be efter "before" or "after". Not "previous".


(Shaminder) #15

Hi Magnus,

Thanks for your reply.
As suggested, i have modified the filebeat config and replaced 'previous' with 'before' and negate from 'true' to 'false'. Config now looks like:

-
  paths:
    - /app/logs/sitecore/MyCustomLogFileAppender_*
  input_type: log
  document_type: sitecore_log
  multiline:
  pattern: '^[[:digit:]]{4}[[:space:]][[:digit:]]{2}\:[[:digit:]]{2}\:[[:digit:]]{2}'
  negate: false
  match: before

It seems my multiline filter is not working.
I have tried with below config as well using different pattern:

  multiline:
  pattern: '^[[:space:]]+|^[[:space:]]+at+|^Exception:|^Source:|^Nested+|^Message:|^D:'
  negate: true
  match: before

Still it does'nt seems to be working.
I am using filebeat version: filebeat version 5.3.0

Can you suggest if there is any syntax config changes for this particular version.


(Magnus Bäck) #16

The first thing you attempted is closer to what you need. Surely negate should be true, i.e. if the line does not start with the timestamp it should be joined with the preceding line?


(Shaminder) #17

Hi Magnus,

Yes, in my case as you stated. negate should be true,
i.e. if the line does not start with the timestamp it should be joined with the preceding line (That's Correct)


(Shaminder) #18

Hi Magnus,

After making several attempts to make it work, I have this configuration that somehow worked.

-
  paths:
    - /app/logs/sitecore/MyCustomLogFileAppender*
  input_type: log
  document_type: sitecore_log
  multiline.pattern: '[:digit:]]{3,4}[[:space:]][[:digit:]]{2}\:[[:digit:]]{2}\:[[:digit:]]{2}[[:space:]][[:alpha:]]{4,5}'
  multiline.negate: true
  multiline.match: after

Now, the issue is: in message field its taking multiple log events under single event:

I am pasting a small portion of log event, because the actual log event is very large:

{
"_index": "filebeat-2017.05.06",
"_type": "sitecore_log",
"_id": "AVwMC_nrqdh28cPI1bVE",
"_score": null,
"_source": {
"message": "2044 02:15:52 DEBUG CustomItemResolver Start\n2892 02:15:52 DEBUG CustomItemResolver Start\n3076 02:15:52 DEBUG CustomItemResolver Start\n3200 02:15:52 DEBUG CustomItemResolver Start\n3076 02:15:52 DEBUG LanguageItemResolver Start\n2044 02:15:52 DEBUG LanguageItemResolver Start\n2892 02:15:52 DEBUG LanguageItemResolver Start\n3076 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n2044 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n2892 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n3200 02:15:52 DEBUG LanguageItemResolver Start\n3200
"@version": "1",
"@timestamp": "2017-05-06T20:45:52.000Z",
"source": "/app/logs/sitecore/MyCustomLogFileAppender_07.05.2017.log",
"offset": 31300055,
"type": "sitecore_log",
"input_type": "log",
"beat": {
"name": "delvmplssmdo3.sapient.com",
"hostname": "delvmplssmdo3.sapient.com",
"version": "5.3.0"
},
"host": "delvmplssmdo3.sapient.com",
"tags": [
"beats_input_codec_plain_applied"
],
"pid": "2044",
"time": "02:15:52",
"loglevel": "DEBUG",
"sitecore_log": "CustomItemResolver Start\n2892 02:15:52 DEBUG CustomItemResolver Start\n3076 02:15:52 DEBUG CustomItemResolver Start\n3200 02:15:52 DEBUG CustomItemResolver Start\n3076 02:15:52 DEBUG LanguageItemResolver Start\n2044 02:15:52 DEBUG LanguageItemResolver Start\n2892 02:15:52 DEBUG LanguageItemResolver Start\n3076 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n2044 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n2892 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n3200 02:15:52 DEBUG LanguageItemResolver Start\n3200 02:15:52 DEBUG LanguageItemResolver Context Item is Empty and Language Code is not Correct\n3076 02:15:52 DEBUG BucketItemResolver Start\n
"timestamp": "07-05-2017 02:15:52"
},
"fields": {
"@timestamp": [
1494103552000
]
},
"highlight": {
"type.raw": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
],
"type": [
"@kibana-highlighted-field@sitecore_log@/kibana-highlighted-field@"
]
},
"sort": [
1494103552000
]
}


(Magnus Bäck) #19

You need one more square bracket at the beginning of the expression ([[:digit:]], not [:digit:]]). You'll also want to begin your regexp with ^.


(Shaminder) #20

Hi Marcus,

Good Morning,

Yes, you are correct. I missed that square bracket in the beginning. I have made some tweaks in the config to make it working. Finally, the working config is:

  paths:
    - /app/logs/sitecore/MyCustomLogFileAppender*
  input_type: log
  document_type: sitecore_log
  multiline.pattern: '[[:digit:]]{3,4}[[:space:]][[:digit:]]{2}\:[[:digit:]]{2}\:[[:digit:]]{2}[[:space:]][[:alpha:]]{4,5}'
  multiline.negate: true
  multiline.match: after

Also, in logstash logs i am getting below error:

{:timestamp=>"2017-05-16T10:34:26.667000+0530", :message=>"A plugin had an unrecoverable error. Will restart this plugin.\n Plugin: <LogStash::Inputs::Beats port=>5671, ssl=>false, codec=><LogStash::Codecs::Plain charset=>"UTF-8">, host=>"0.0.0.0", ssl_verify_mode=>"none", include_codec_tag=>true, ssl_handshake_timeout=>10000, congestion_threshold=>5, target_field_for_codec=>"message", tls_min_version=>1, tls_max_version=>1.2, cipher_suites=>["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256"], client_inactivity_timeout=>60>\n Error: event executor terminated", :level=>:error}

Kindly suggest if i can ignore it or anything needs to be taken care.

Versions i am using:
Filebeat: 5.3.0
Logstash: 2.4.1