Logstash/ES date format issue

xtingray · November 5, 2015, 10:35pm

I have been playing with logstash filters, receiving log records with this format:

2015-11-05 17:37:01 This is a log message

Using this filter:

grok {
      match => { "message" => "%{DATE:date} %{TIME:time} %{GREEDYDATA:msg}" }
}

With this filter, dates end up in ES as yy-mm-dd: the year loses its first two digits (e.g. 15-11-05). If I run a query using the yyyy-mm-dd format for dates, I don't get any records.

But using this filter:

grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:date} %{GREEDYDATA:msg}" }
}

ES supports queries using the yyyy-mm-dd format (the year doesn't lose any digits).

Is there any way to use DATE type for a logstash filter field without losing any digit in the year part?

magnusbaeck · November 6, 2015, 6:50am

If you inspect the definition of the DATE pattern, you'll note that it matches either mm-dd-yy or dd-mm-yy dates. Because you didn't anchor your expression to the beginning of the line by preceding it with ^, it just skipped the century and matched against "15-11-05", which happens to match either mm-dd-yy, dd-mm-yy, yy-mm-dd, or yy-dd-mm.

> match => { "message" => "%{TIMESTAMP_ISO8601:date %{GREEDYDATA:msg}" }

That expression is missing a brace after "date".

> Is there any way to use DATE type for a logstash filter field without losing any digit in the year part?

No, the DATE pattern only matches two-digit years. You'll have to use something else in your grok expression. What's wrong with TIMESTAMP_ISO8601?
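
One way to keep the date and time in separate fields without losing the century might be to compose the date from smaller stock grok patterns instead of DATE. A minimal sketch, assuming every log line starts with a yyyy-mm-dd date as in the sample above (YEAR, MONTHNUM and MONTHDAY are standard grok patterns; the field names are carried over from the original filter):

```
grok {
  # Anchor at the start of the line and build the date from
  # YEAR-MONTHNUM-MONTHDAY so the full four-digit year lands in "date".
  match => { "message" => "^(?<date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}) %{TIME:time} %{GREEDYDATA:msg}" }
}
```

For the sample line, the date field would be expected to hold 2015-11-05 rather than 15-11-05, with the time still captured separately.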

xtingray · November 6, 2015, 3:16pm

The missing brace was a typo. Sorry.

About TIMESTAMP_ISO8601: I don't want to use it because it makes it harder to define some specific queries that I need.
For example, I need to create three independent filters: one for a range of days, one for the hours and another for the minutes. I need to resolve queries like this:
Get all the records between 2015-11-01 and 2015-11-30, but only those that happened between hours 10 and 11 and between minutes 0 and 15.

If I can split date and time I have a chance to make those filters. But if everything is stored in just one variable of type TIMESTAMP_ISO8601, how could I?

I like the idea of using "^" to avoid losing the century part. I will give it a try.

magnusbaeck · November 6, 2015, 3:28pm

> Get all the records available between 2015-11-01 and 2015-11-30, but only those which happened between this range of hours 10 - 11 and only between minutes 0 and 15.

With a scripted field that uses Groovy, this should be possible. See the Elastic blog post "Kibana 4 Beta 3: Now More Filtery".

> I like the idea of using "^" to avoid losing the century part.

As long as you keep using the DATE pattern it won't make a difference.

xtingray · November 6, 2015, 3:32pm

Groovy is a great resource, but I will have to handle millions of records. Quoting from Zachary Tong:

The problem with using a script in your search is that it scales linearly to the number of documents that need to be evaluated. So if you are matching 1bn docs...it has to run that script 1bn times. And scripts are a fair amount slower than native functionality, since it has to boot up a Groovy interpreter for each execution.

Groovy is not an option for me.
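
A possible index-time alternative to a search-time script is to extract the hour and minute as their own numeric fields in Logstash, so that the query can use ordinary range filters on them. A minimal sketch, assuming the same log format as in the first post; the field names hour, minute and second are hypothetical, and the :int suffix is grok's built-in type conversion:

```
grok {
  # Capture the date as one field and each time component separately.
  # ":int" asks grok to store hour and minute as integers, not strings.
  match => { "message" => "^(?<date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY}) %{HOUR:hour:int}:%{MINUTE:minute:int}:%{SECOND:second} %{GREEDYDATA:msg}" }
}
```

The example query could then be expressed as a date range on date combined with numeric ranges on hour (10 to 11) and minute (0 to 15), assuming the ES mapping treats date as a date and the other two fields as numbers.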

yahoo · April 16, 2016, 7:57pm

Hi @xtingray
Did you find a solution?

magnusbaeck · June 21, 2016, 8:24am

@supermario9, please start a new thread for your unrelated problem.
