Dates end up in ES as yy-mm-dd: the year part loses its first two digits (e.g. 15-11-05). If I run a query using the yyyy-mm-dd format for dates, I don't get any records.
If you inspect the definition of the DATE pattern you'll note that it matches either mm-dd-yy or dd-mm-yy dates. Because you didn't anchor your expression to the beginning of the line by preceding it with ^, it just skipped the century and matched against "15-11-05", which fits the dd-mm-yy form (and could just as well be read as yy-mm-dd or yy-dd-mm).
match => { "message" => "%{TIMESTAMP_ISO8601:date %{GREEDYDATA:msg}" }
That expression is missing a brace after "date".
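With the brace restored and the expression anchored, the filter might look like the sketch below. It assumes each log line starts with an ISO 8601 timestamp followed by a single space; the field names `date` and `msg` come from the expression quoted above.

```
filter {
  grok {
    # ^ anchors the match to the start of the line, so the
    # pattern cannot start matching in the middle of the year.
    match => { "message" => "^%{TIMESTAMP_ISO8601:date} %{GREEDYDATA:msg}" }
  }
}
```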
Is there any way to use DATE type for a logstash filter field without losing any digit in the year part?
No, the DATE pattern only matches two-digit years. You'll have to use something else in your grok expression. What's wrong with TIMESTAMP_ISO8601?
As for TIMESTAMP_ISO8601, I don't want to use it because it makes it harder to define some specific queries that I need.
For example, I need to create three independent filters: one for a range of days, one for hours, and another for minutes. I need to resolve queries like this:
Get all the records available between 2015-11-01 and 2015-11-30, but only those which happened between this range of hours 10 - 11 and only between minutes 0 and 15.
If I can split date and time I have a chance to make those filters. But if everything is stored in just one variable of type TIMESTAMP_ISO8601, how could I?
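One way to get separate fields, as a rough sketch rather than a tested configuration, is to capture the date and the time components separately in grok. It assumes lines shaped like `2015-11-05 10:07:33 some text`; the field names `hour` and `minute` are made up for illustration, and the `:int` suffix asks grok to store them as integers.

```
filter {
  grok {
    match => {
      # (?<date>...) is a named capture that keeps the full four-digit year.
      # hour and minute are hypothetical field names, converted to integers.
      "message" => "^(?<date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY})[T ]%{HOUR:hour:int}:%{MINUTE:minute:int}:%{SECOND} %{GREEDYDATA:msg}"
    }
  }
}
```

With fields like these, the hour and minute conditions become plain numeric range filters that can be combined with the date range, without any scripting.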
I like the idea of using "^" to avoid losing the century part. I will give it a try.
Groovy is a great resource, but I will have to handle millions of records. Quoting from Zachary Tong:
The problem with using a script in your search is that it scales linearly to the number of documents that need to be evaluated. So if you are matching 1bn docs...it has to run that script 1bn times. And scripts are a fair amount slower than native functionality, since it has to boot up a Groovy interpreter for each execution.