The modification consists of swapping the order of the status and origin fields.
The new filter has been thoroughly tested in the debugger, but it is not working "in the field".
After some debugging, I found that:
The original filter does not drop the log, even if the match fails
Any slight modification of the filter results in the log being dropped
I can get a matched log by using a pattern that captures only the fields up to the timestamp, none after, like ^\[%{DATA:dummy}\] %{LOGLEVEL:my.loglevel} \{%{DATA:my.source}\} - timestamp = %{DATA:timestamp} (see the filter sketch after this list)
I can avoid the drop by commenting out the timezone option in the date section; I then get logs both with and without a grok match
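For reference, here is that partial pattern wrapped in a minimal grok filter (just a sketch; my real pipeline has more stages, and the full pattern continues past the timestamp):

```
filter {
  grok {
    match => { "message" => "^\[%{DATA:dummy}\] %{LOGLEVEL:my.loglevel} \{%{DATA:my.source}\} - timestamp = %{DATA:timestamp}" }
  }
}
```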
Could anyone provide a hint on this weird behavior?
In particular:
Why is the log being dropped, and why is the error silenced?
Can you share your entire pipeline and a sample message?
Which filter did you change? The grok? The mutate? The date filter?
If commenting out the timezone option in the date filter fixes your issue, it could mean that your logs are not being dropped but are being ingested with the wrong date, and you may not be using a wide enough date range in the time filter to see them.
Can you add an extra output to write your logs to a file and see if they show up there?
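Something like this, for example (the path is just an example):

```
output {
  file {
    path  => "/tmp/logstash-debug.log"
    codec => json_lines
  }
}
```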
Thanks @leandrojmp, this was the problem.
Well, this and my misunderstanding of how the timezone option works.
So, examining the different timestamps, I now see that:
default @timestamp is shown in my local time (UTC+1), e.g. Nov 30, 2022 @ 21:22:57.480
the timestamp I'm parsing is already in UTC, so 2022-11-30T20:22:57.472
when adding the timezone option I'm creating a new field that does the opposite of what I thought: instead of showing the parsed message in my timezone, it shows the parsed timestamp in UTC assuming the original message was in my local time, so 2022-11-30T19:22:57.472Z (see the sketch below)
I was watching the latest logs without realizing they were appearing with a timestamp two hours earlier. This certainly breaks causality!
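To illustrate, with a config along these lines (Europe/Madrid stands in for my UTC+1 zone; the options are real date filter options, but the field names only mirror my pipeline and are not verbatim):

```
date {
  match    => ["timestamp", "ISO8601"]
  timezone => "Europe/Madrid"   # declares the zone of the SOURCE string, not the output
  target   => "my.timestamp"
}
```

the filter reads 2022-11-30T20:22:57.472 as 20:22 Madrid time (CET, UTC+1) and stores 2022-11-30T19:22:57.472Z, instead of converting an already-UTC value into my local time.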
Thanks for your help!
P.S. It is still a mystery to me how the original filter was producing correct timestamps, but at this point it is something I can live with.
If your timestamp is already in UTC, you should not use the timezone option in the date filter; that option is there to tell the date filter that the date string it will parse is in a timezone other than UTC.
If the date string already has a timezone offset, you also should not use the timezone option.
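So for a timestamp string that is already UTC, a plain match is enough (field name taken from your grok pattern):

```
date {
  match => ["timestamp", "ISO8601"]
  # no timezone option: it describes the source string, it does not convert the output
}
```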
Elasticsearch only stores dates in UTC, and Kibana always converts them back from UTC to your local timezone.
In Kibana you can check this by looking at the JSON of the document: in the table view the date is converted to your local timezone, but in the JSON view you will see the value without conversion, which will be in UTC.