Get the right timestamp for old log files

Hi!

I've been trying to get Logstash to read old log files for a few days now. I didn't think it was working, but it turns out it was. The problem is that the timestamp Kibana 4 shows is the time the log file was indexed. I need a custom timestamp that takes the actual time from an event in the log and uses that instead. I've started writing a grok pattern as well as a date filter, but when I tried starting Kibana, Elasticsearch, and Logstash it wouldn't start, which usually means I've gotten the syntax wrong.

Here is what it looks like in my current logstash configuration file.

file {
     path => "/var/externallogs_maven/oldlogs/request.log.2015-06-22"
     type => "nexus-log"
     start_position => "beginning"
  }
}
 filter{
     grok{
        type => "nexus-log"
        patterns_dir => "./config-dir/patterns"
        match => [ "message", "\b\w+\b\s/nexus/content/repositories/(?<repo>[^/]+)", "(?<mytimestamp>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})" ]
     }
     date {
        match => ["mytimestamp", "dd/MMM/YYYY:HH:mm:ss +SSSS"]
        locale => "swe"

There's an input { line at the top of your file that was lost in the copy/paste operation, right? I can see the closing brace just before the opening of the filter block.

If Logstash doesn't start there should be an error message in its log. You can also run it with the --configtest option to test if one or more configuration files are syntactically valid.
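For example, something along these lines (the config file path here is just a placeholder; point it at wherever your configuration actually lives):

bin/logstash --configtest -f /path/to/your/logstash.conf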

Also, "swe" isn't a valid locale. If your logs really use Swedish month names you should use "sv".

Thank you. Yes, there is an input at the top. I'm not really sure about the "match" part under the grok filter either. I don't know if you can see it, but I'm actually using the repository pattern you showed me a few days ago, with a custom timestamp pattern right after it. Is that syntax acceptable?

Yeah, it looks good (although your use of type in the grok filter is deprecated; use a conditional instead). As I said, if Logstash doesn't start it'll indicate why. No configuration problem in Logstash will prevent ES and Kibana from starting, and your original message indicates that neither will start, which suggests the problem actually lies elsewhere.
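The conditional version would look something like this sketch, reusing one of your own patterns (the surrounding filter block and field names are taken from your snippet):

filter {
  if [type] == "nexus-log" {
    grok {
      patterns_dir => "./config-dir/patterns"
      match => [ "message", "\b\w+\b\s/nexus/content/repositories/(?<repo>[^/]+)" ]
    }
  }
}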

Okay, because when I remove one of my grok patterns and remove the date part, everything starts as it's supposed to.

So there's probably something bad with one of your filters then. Again, what does --configtest (or the Logstash log) say about your configuration when it's in a non-working state?

I'm just going to add "--configtest" before my next startup of Logstash and see what it says. Right now the only thing it says is that "there is something wrong with my configuration", haha. Give me a moment.

This is what it spat out. Apparently there is an invalid setting for my grok filter. es_1 is Elasticsearch, kibana_1 is Kibana 4, and stash_1 is Logstash.

es_1 | log4j:WARN No appenders could be found for logger (bootstrap).
es_1 | log4j:WARN Please initialize the log4j system properly.
es_1 | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

kibana_1 | {"@timestamp":"2015-07-06T14:09:37.551Z","level":"error","node_env":"production","error":"Request error, retrying -- connect ECONNREFUSED"}
kibana_1 | {"@timestamp":"2015-07-06T14:09:37.554Z","level":"warn","message":"Unable to revive connection: http://172.17.6.100:9200/","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-06T14:09:37.554Z","level":"warn","message":"No living connections","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-06T14:09:37.555Z","level":"info","message":"Unable to connect to elasticsearch at http://172.17.6.100:9200. Retrying in 2.5 seconds.","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-06T14:09:40.099Z","level":"info","message":"Found kibana index","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-06T14:09:40.181Z","level":"info","message":"Listening on 0.0.0.0:5601","node_env":"production"}

stash_1 | {:timestamp=>"2015-07-06T14:09:40.982000+0000", :message=>"Using version 0.1.x input plugin 'syslog'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-06T14:09:40.996000+0000", :message=>"Using version 0.1.x codec plugin 'json'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-06T14:09:41.007000+0000", :message=>"Using version 0.1.x input plugin 'file'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-06T14:09:41.014000+0000", :message=>"Using version 0.1.x codec plugin 'plain'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-06T14:09:41.021000+0000", :message=>"You are using a deprecated config setting "type" set in grok. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. You can achieve this same behavior with the new conditionals, like: if [type] == \"sometype\" { grok { ... } }. If you have any questions about this, please visit the #logstash channel on freenode irc.", :name=>"type", :plugin=><LogStash::Filters::Grok --->, :level=>:warn}
stash_1 | {:timestamp=>"2015-07-06T14:09:41.024000+0000", :message=>"Invalid setting for grok filter plugin:\n\n filter {\n grok {\n # This setting must be a hash\n # This field must contain an even number of items, got 3\n match => ["message", "\\b\\w+\\b\\s/nexus/content/repositories/(?[^/]+)", "(?%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})"]\n ...\n }\n }", :level=>:error}
stash_1 | Error: Something is wrong with your configuration.
elksuite_stash_1 exited with code 1

Ah, right. Your grok match pattern declaration must look like this:

match => [
  "message", "\b\w+\b\s/nexus/content/repositories/(?<repo>[^/]+)", 
  "message", "(?<mytimestamp>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})"
]

Never mind the indentation and linebreak changes, that's just for making it more readable. The key is that you need an even number of entries, i.e. your array must be structured like a hash. Or you can use a hash as exemplified by the documentation.
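The hash form would look something like this sketch (same patterns as above, just expressed as a hash with an array of patterns for the message field):

match => {
  "message" => [
    "\b\w+\b\s/nexus/content/repositories/(?<repo>[^/]+)",
    "(?<mytimestamp>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})"
  ]
}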

Oh wow really. That seems pretty strict. Thanks for your help my friend.

So I fixed the error message regarding the match part. Now I keep getting a new error I don't really understand. The error is as follows:

es_1 | log4j:WARN No appenders could be found for logger (bootstrap).
es_1 | log4j:WARN Please initialize the log4j system properly.
es_1 | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
kibana_1 | {"@timestamp":"2015-07-07T07:54:08.962Z","level":"error","node_env":" production","error":"Request error, retrying -- connect ECONNREFUSED"}
kibana_1 | {"@timestamp":"2015-07-07T07:54:08.965Z","level":"warn","message":"Un able to revive connection: http://172.17.6.166:9200/","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-07T07:54:08.965Z","level":"warn","message":"No living connections","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-07T07:54:08.966Z","level":"info","message":"Un able to connect to elasticsearch at http://172.17.6.166:9200. Retrying in 2.5 se conds.","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-07T07:54:11.532Z","level":"info","message":"Fo und kibana index","node_env":"production"}
kibana_1 | {"@timestamp":"2015-07-07T07:54:11.696Z","level":"info","message":"Li stening on 0.0.0.0:5601","node_env":"production"}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.689000+0000", :message=>"Using vers ion 0.1.x input plugin 'syslog'. This plugin isn't well supported by the communi ty and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.703000+0000", :message=>"Using vers ion 0.1.x codec plugin 'json'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.713000+0000", :message=>"Using vers ion 0.1.x input plugin 'file'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.721000+0000", :message=>"Using vers ion 0.1.x codec plugin 'plain'. This plugin isn't well supported by the communit y and likely has no maintainer.", :level=>:info}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.724000+0000", :message=>"You are us ing a deprecated config setting "type" set in grok. Deprecated settings will c ontinue to work, but are scheduled for removal from logstash in the future. You can achieve this same behavior with the new conditionals, like: if [type] == \" sometype\" { grok { ... } }. If you have any questions about this, please visit the #logstash channel on freenode irc.", :name=>"type", :plugin=><LogStash::Fil ters::Grok --->, :level=>:warn}
stash_1 | {:timestamp=>"2015-07-07T07:54:12.815000+0000", :message=>"Using vers ion 0.1.x output plugin 'elasticsearch'. This plugin isn't well supported by the community and likely has no maintainer.", :level=>:info}
stash_1 | Configuration OK

I've been searching through the Logstash config but I can't find any syntax errors. It seems to have something to do with the version of my elasticsearch output plugin. This is what my whole Logstash config looks like right now.

input {
  syslog {
    port => 5514
    codec => "json"
  }
  file {
    path => "/var/externallogs_maven/oldlogs/request.log.2015-06-22"
    type => "nexus-log"
    start_position => "beginning"
  }
}

filter {
  grok {
    type => "nexus-log"
    patterns_dir => "./config-dir/patterns"
    match => [
      "message", "\b\w+\b\s/nexus/content/repositories/(?<repositories>[^/]+)",
      "message", "(?<mytimestamp>%{MONTHDAY}/%{MONTH}/%{YEAR}:%{HOUR}:%{MINUTE}:%{SECOND})"
    ]
  }

  date {
    match => ["mytimestamp", "dd/MMM/YYYY:HH:mm:ss +SSSS"]
    locale => "sv"
  }
}

output {
  elasticsearch {
    host => es
    port => 9300
    cluster => "elks"
    protocol => "transport"
  }
}

I can't see any Logstash error message in the log snippet you posted. What makes you think things aren't working?

I really don't know if I need the locale => "sv" setting either. The reason I put it there is that I want it to take the appropriate time from the event in the old log itself, not the time I index the log, which would of course be the current time.

Again, unless you have Swedish month names in your logs you don't need it.

The reason I think something is wrong is that it says "gracefully stopping" when I try to start it up. I went into my startup configuration, removed "--verbose" and "--configtest", and got a new error message instead:

{:timestamp=>"2015-07-07T08:29:02.918000+0000", :message=>"Failed parsing date from field", :field=>"mytimestamp", :value=>"22/Jun/2015:04:26:23", :exception=>java.lang.IllegalArgumentException: Invalid format: "22/Jun/2015:04:26:23" is malformed at "Jun/2015:04:26:23", :level=>:warn}

I guess this means I've done something wrong when it comes to the format somewhere.

Unless you've already removed locale => "sv" the problem is that you're trying to parse English month names with a Swedish locale.
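With the locale removed, a date filter along these lines should match a value like "22/Jun/2015:04:26:23" (a sketch; note that the trailing " +SSSS" in your current pattern doesn't correspond to anything in that value, so I've dropped it here):

date {
  match => ["mytimestamp", "dd/MMM/yyyy:HH:mm:ss"]
}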

Hi, yes, that seemed to solve the problem. I don't know if I'm thinking in the right terms when it comes to the visualizing part. I managed to get the right date out in the right format, "22/Jun/2015:13:35:54", and it feels really great. Although it might not serve the purpose I had in mind all along. The thing is, I'm trying to use the field "mytimestamp" as a term on the X axis in a line chart, but when I try that it splits the string up, so there are separate points like "22", "Jun", "2015", "13", "35" and so on. I guess it takes delimiters like "/" and ":" into account when it does the split?

Just use the date filter to parse string timestamps (your mytimestamp field) into the standard @timestamp field.

What do you add under the date filter to be able to parse a regular string into the original timestamp? I've looked around in the date filter documentation but I can't seem to find anything appropriate.

I found it. It was well hidden. It's the "target" option, right? Although they don't show the syntax for it.

The target parameter is indeed used to select the destination field, i.e. the name of the field where the resulting timestamp is stored. You don't have to touch it since it defaults to @timestamp which is what you should be using.
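If you ever do need it, it's just another option inside the date filter, something like this sketch (the destination field name here is hypothetical):

date {
  match => ["mytimestamp", "dd/MMM/yyyy:HH:mm:ss"]
  target => "mytimestamp_parsed"  # omit this line to store the result in @timestamp
}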