Parsing a crappy date format


(Aviv Ratzon) #1

Hi everyone,
I have a real problem that I couldn't find a decent solution to.
I need to parse log files and extract the dates inside them. The date format is really crappy and looks like this:

10/09/15 13:55:57:2020
DD/MM/YY HH:MM:SS:ssss

What would be a good way to parse it into a timestamp field? The date filter doesn't seem to provide a good solution to this.

Thank you very much.


(Magnus Bäck) #2

Except for the absence of a timezone specifier there's nothing crappy about it, and the date filter is a perfect fit for this task.


(Aviv Ratzon) #3

So how come it fails to parse it? Am I doing something wrong?

  date {
    match => ["logdate", "dd/MM/YY"]
  }

I've tried dozens of different configurations with this filter and it simply won't output any time fields.


(Magnus Bäck) #4

If the logdate field indeed contains "10/09/15 13:55:57:2020" you need a date pattern that includes the time. I suppose "dd/MM/YY HH:mm:ss:SSSS" should work.
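
As a rough sketch of that suggestion, assuming the field really is named logdate and holds the whole string, the filter could look like the following. The timezone value is only an illustrative assumption; since the log line has no zone specifier, set it to wherever the logs were written, or leave it out to use the platform default.

```
filter {
  date {
    # The pattern must cover the entire string, date and time
    match => ["logdate", "dd/MM/YY HH:mm:ss:SSSS"]
    # Assumed zone for illustration; the source timestamps carry none
    timezone => "Europe/Stockholm"
  }
}
```

If the pattern doesn't match, the event is normally tagged _dateparsefailure, which makes a mismatch easy to spot in the output.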


(Aviv Ratzon) #5

I ran this filter:

  date {
    match => ["time", "dd/MM/YY HH:mm:ss:SSSS"]
  }

with this input:

15/12/13 05:20:22:3333

and... nothing.
Any idea what the problem is?

Thank you very much for your help :)


(Magnus Bäck) #6

Works fine for me:

$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  date {
    match => ["message", "dd/MM/YY HH:mm:ss:SSSS"]
  }
}
$ echo 15/12/13 05:20:22:3333 | /opt/logstash/bin/logstash -f test.config
Logstash startup completed
{
       "message" => "15/12/13 05:20:22:3333",
      "@version" => "1",
    "@timestamp" => "2013-12-15T04:20:22.333Z",
          "host" => "lnxolofon"
}
Logstash shutdown completed

(Aviv Ratzon) #7

I tried it, and apparently it won't parse the string unless it matches 100%. With your example it gives an error saying the string is malformed at "\r": it reads the newline character along with the date and reports a date parse failure.
Is there maybe something wrong with the configuration in some yml file? It seems odd that it won't just read the format it is looking for, and instead returns an error because there are more characters than it expected.

Thanks!


(Magnus Bäck) #8

Unlike the grok filter, the date filter requires the string to match exactly. The \r character is the CR part of a Windows-style CR/LF line break, though arguably the date filter should ignore it. You should be able to use the strip option of the mutate filter to remove the \r.
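
A minimal sketch of that approach, assuming the raw line is in the message field as in the test config above: strip the surrounding whitespace (which includes the trailing \r) before the date filter runs.

```
filter {
  mutate {
    # Removes leading and trailing whitespace, including a trailing CR
    strip => ["message"]
  }
  date {
    match => ["message", "dd/MM/YY HH:mm:ss:SSSS"]
  }
}
```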


(Aviv Ratzon) #9

Ahhh, I see... Okay, this is much clearer now.
So one last question, and sorry if it's dumb: I managed to parse the date and time in the log into a field. The problem is that they are stored as two entries (or two values, I don't know what it's called), e.g.:

"logdate":["10.09.15","13:56:28:6238"]

How can I make them into a single string so that I could use the date filter on it? I've looked quite a bit and can't seem to find a way to merge them.


(Magnus Bäck) #10

Untested but should at least give you ideas:

mutate {
  replace => ["logdate", "%{[logdate][0]} %{[logdate][1]}"]
}
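
Putting the pieces together, a sketch along these lines might work. It is equally untested, and it assumes logdate is always a two-element array. Note that the sample value above uses dots in the date part ("10.09.15"), so the pattern here uses dots as well; adjust it to whatever separator the field actually contains.

```
filter {
  mutate {
    # Join the date and time entries into a single string
    replace => ["logdate", "%{[logdate][0]} %{[logdate][1]}"]
  }
  date {
    # Dots instead of slashes, matching the "10.09.15" sample
    match => ["logdate", "dd.MM.YY HH:mm:ss:SSSS"]
  }
}
```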

(Aviv Ratzon) #11

Ah good, I didn't know I could access the entries this way.
Thank you very much for all the great help and quick replies!

