Logstash/ES date format issue

I have been playing with logstash filters, receiving log records with this format:

2015-11-05 17:37:01 This is a log message

Using this filter:

grok {
      match => { "message" => "%{DATE:date} %{TIME:time} %{GREEDYDATA:msg}" }
}

The dates end up in ES in yy-mm-dd format (the year loses its first two digits, e.g. 15-11-05). If I try to run a query against those dates using the yyyy-mm-dd format, I don't get any records.

But using this filter:

grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:date} %{GREEDYDATA:msg}" }
}

ES supports queries using the yyyy-mm-dd format (the year didn't lose any digits).

Is there any way to use DATE type for a logstash filter field without losing any digit in the year part?

If you inspect the definition of the DATE pattern you'll note that it matches either mm-dd-yy or dd-mm-yy dates, always with the year in the last position. Because you didn't anchor your expression to the beginning of the line by preceding it with ^, it simply skipped the century and matched "15-11-05", which (with every field being two digits) could be read as mm-dd-yy, dd-mm-yy, yy-mm-dd, or yy-dd-mm.
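To see why, here is a small Python sketch of the same behaviour (assumption: the sub-patterns below are simplified equivalents of the ones in the standard grok-patterns file, reconstructed here for illustration):

```python
import re

# Simplified equivalents of the relevant grok sub-patterns
# (assumption: reconstructed from the standard grok-patterns file).
MONTHNUM = r"(?:0?[1-9]|1[0-2])"
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
YEAR = r"(?:\d\d){1,2}"
DATE_US = rf"{MONTHNUM}[/-]{MONTHDAY}[/-]{YEAR}"
DATE_EU = rf"{MONTHDAY}[./-]{MONTHNUM}[./-]{YEAR}"
DATE = rf"(?:{DATE_US}|{DATE_EU})"

line = "2015-11-05 17:37:01 This is a log message"

# Unanchored, the regex engine is free to skip "20" and start matching
# at "15", which fits the day-month-year alternative of DATE.
print(re.search(DATE, line).group(0))  # -> 15-11-05
```

This is exactly why the stored dates come out as 15-11-05.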

match => { "message" => "%{TIMESTAMP_ISO8601:date %{GREEDYDATA:msg}" }

That expression is missing a brace after "date".

Is there any way to use DATE type for a logstash filter field without losing any digit in the year part?

No, the DATE pattern expects the year in the last position (month-day-year or day-month-year order), so it can never match a date that begins with a four-digit year. You'll have to use something else in your grok expression. What's wrong with TIMESTAMP_ISO8601?
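One possible workaround (a sketch only, untested against your pipeline) is to assemble the date from lower-level grok sub-patterns; YEAR accepts four digits, and anchoring with ^ forces the match to start at the century:

```
grok {
  match => { "message" => "^(?<date>%{YEAR}-%{MONTHNUM2}-%{MONTHDAY}) %{TIME:time} %{GREEDYDATA:msg}" }
}
```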

The missing brace was a typo. Sorry.

About TIMESTAMP_ISO8601: I don't want to use it because it makes it harder to define some specific queries that I need.
For example, I need to create three independent filters: one for a range of days, one for hours, and another for minutes. I need to resolve queries like this:
Get all the records available between 2015-11-01 and 2015-11-30, but only those that happened between hours 10 and 11, and only between minutes 0 and 15.

If I can split date and time, I have a chance to make those filters. But if everything is stored in just one field of type TIMESTAMP_ISO8601, how could I?
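One way to make those filters possible (a sketch, assuming the log lines always look like the sample above) is to capture the hour and minute as separate fields and convert them to integers, so each constraint becomes a plain numeric range filter:

```
grok {
  match => { "message" => "^(?<date>%{YEAR}-%{MONTHNUM2}-%{MONTHDAY}) (?<hour>%{HOUR}):(?<minute>%{MINUTE}):(?<second>%{SECOND}) %{GREEDYDATA:msg}" }
}
mutate {
  convert => {
    "hour"   => "integer"
    "minute" => "integer"
  }
}
```

The day constraint then becomes a range filter on the date field, and the hour/minute constraints become range filters on the two integer fields.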

I like the idea of using "^" to avoid losing the century part. I will give it a try.

Get all the records available between 2015-11-01 and 2015-11-30, but only those that happened between hours 10 and 11, and only between minutes 0 and 15.

With a scripted field that uses Groovy this should be possible. See Kibana 4 Beta 3: Now More Filtery | Elastic Blog.

I like the idea of using "^" to avoid losing the century part.

As long as you keep using the DATE pattern it won't make a difference.
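A quick Python sketch shows why (assumption: the pattern below is a simplified reconstruction of DATE from the standard grok-patterns file). Anchoring forces the match to begin at the four-digit year, which neither alternative of DATE can handle, so the whole grok expression simply fails instead of capturing a truncated date:

```python
import re

# Simplified reconstruction of the grok DATE pattern (assumption).
MONTHNUM = r"(?:0?[1-9]|1[0-2])"
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
YEAR = r"(?:\d\d){1,2}"
DATE = rf"(?:{MONTHNUM}[/-]{MONTHDAY}[/-]{YEAR}|{MONTHDAY}[./-]{MONTHNUM}[./-]{YEAR})"

# Anchored at the start of the line, DATE cannot match a leading
# four-digit year, so there is no match at all.
print(re.match(DATE, "2015-11-05 17:37:01"))  # -> None
```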

Groovy is a great resource, but I will have to handle millions of records. Quoting from Zachary Tong:

The problem with using a script in your search is that it scales linearly to the number of documents that need to be evaluated. So if you are matching 1bn docs...it has to run that script 1bn times. And scripts are a fair amount slower than native functionality, since it has to boot up a Groovy interpreter for each execution.

Groovy is not an option for me :pensive:

Hi @xtingray
Did you find a solution?

@supermario9, please start a new thread for your unrelated problem.