order.log.20160403:1342:12:02:29 AM | INFO | Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. | Lorem ipsum dolor sit amet, | XYZ
order.log.20160403:2817:12:19:25 AM | INFO | Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. | Lorem ipsum dolor sit amet, | XYZ
I have re-verified the grok pattern on the Logstash side and it is fine, since other logs are parsed correctly. Some of the log entries matching the above pattern are simply never forwarded.
Log rotation policy:
The current logs are written to a file named order.log. The file is renamed to order.log.<yesterday's date> at midnight.
So some log entries are missing, not entire log files.
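For reference, a Filebeat 1.x prospector covering this rotation scheme could look roughly like the sketch below (the directory is a placeholder; adjust to the real layout). Filebeat identifies files by inode, so an already-open harvester should keep reading the renamed file; listing the rotated name as well lets a new harvester pick it up after midnight:

filebeat:
  prospectors:
    -
      input_type: log
      # live file plus the copies produced by the midnight rename
      paths:
        - /var/log/orders/order.log
        - /var/log/orders/order.log.*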
I see you set ignore_older to 24h. Is there a specific reason for that? In Filebeat 1.2.0, ignore_older is set to infinity by default, but close_older was introduced with a default of 1h to close file handles.
Do your log files always have an empty line between two events?
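For comparison, this is roughly where those two options sit in a 1.2.x prospector; the 24h value only mirrors the setup being discussed, and the path is a placeholder:

filebeat:
  prospectors:
    -
      input_type: log
      paths:
        - /var/log/orders/order.log
      # explicit 24h reproduces the setup in question; omitting it
      # keeps the 1.2.0 default of infinity
      ignore_older: 24h
      # 1.2.0 default: close the handle of files unchanged for 1h
      close_older: 1h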
I see the same problem. It is not one event here and there but quite many all the time.
I have an application with a debug log which I ship to Logstash using Filebeat (tailing the file).
The app produces 383.10 KB/s, 8941 trace lines/s.
A certain event I looked at shows up 230 times per minute in the original log.
In Kibana I only see this event 10 times per minute.
I tried with the Logstash beats input plugin 2.2.8 and bulk_max_size = 10000, but there is no difference.
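For reference, the bulk_max_size I refer to is the setting on Filebeat's logstash output; a minimal filebeat.yml sketch, with a placeholder host and the value from this test:

output:
  logstash:
    hosts: ["logstash.example.com:5044"]
    # events per bulk request sent to Logstash; raised for this test,
    # but it made no visible difference
    bulk_max_size: 10000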
Parsing my application log looking for a certain event:
NEW minute: 2016:04:25 10-00-00.163 227 hits/min
NEW minute: 2016:04:25 10-01-00.013 219 hits/min
NEW minute: 2016:04:25 10-02-00.003 225 hits/min
NEW minute: 2016:04:25 10-03-00.010 217 hits/min
NEW minute: 2016:04:25 10-04-00.008 229 hits/min
Here is what Kibana looks like using the same filter:
The events are just test strings indicating that an event has occurred in my application.
The string has a date and some information.
My log file can grow very big (gigabytes after a couple of hours), so I thought it was a good idea to use tail.
Is there a reason why I should not use tail?
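By "tail" I mean Filebeat's tail_files prospector option; a minimal sketch of that setup, assuming the stdout.log path shown further down:

filebeat:
  prospectors:
    -
      input_type: log
      paths:
        - /var/log/vueh/stdout.log
      # start reading at the end of the file instead of from the
      # beginning, so the multi-gigabyte history is skipped
      tail_files: true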
Your recap is correct:
The Filter was: type:vuehlog AND Ft_UEH_UEHANDLER* AND INITIAL_CONN_SETUP*
I do not think tail_files works, since I see that Filebeat is processing old data from before Filebeat was started.
One more thing I have noticed is that Filebeat is lagging. I compared the source file with the output file Filebeat writes.
Here we can see the difference:
First I parse the source file:
ubuntu@vueh-0:~$ cat /var/log/vueh/stdout.log | ~/tracethroughput.pl | grep hits
NEW minute: 2016-04-26 14:56:23.945 0 hits/min
NEW minute: 2016-04-26 14:58:14.713 0 hits/min
NEW minute: 2016-04-26 15:11:41.980 0 hits/min
NEW minute: 2016-04-26 15:12:00.007 18 hits/min
NEW minute: 2016-04-26 15:13:00.034 107 hits/min
NEW minute: 2016-04-26 15:14:00.139 225 hits/min
NEW minute: 2016-04-26 15:15:00.167 232 hits/min
NEW minute: 2016-04-26 15:16:00.057 226 hits/min
NEW minute: 2016-04-26 15:17:00.087 231 hits/min
NEW minute: 2016-04-26 15:18:00.167 230 hits/min
NEW minute: 2016-04-26 15:19:00.005 229 hits/min
NEW minute: 2016-04-26 15:20:00.001 231 hits/min
NEW minute: 2016-04-26 15:21:00.010 227 hits/min
NEW minute: 2016-04-26 15:22:00.000 224 hits/min
NEW minute: 2016-04-26 15:23:00.014 232 hits/min
NEW minute: 2016-04-26 15:24:00.175 227 hits/min
NEW minute: 2016-04-26 15:25:00.030 233 hits/min
NEW minute: 2016-04-26 15:26:00.005 230 hits/min
NEW minute: 2016-04-26 15:27:00.006 223 hits/min
NEW minute: 2016-04-26 15:28:00.001 229 hits/min
NEW minute: 2016-04-26 15:29:00.016 235 hits/min
NEW minute: 2016-04-26 15:30:00.016 224 hits/min
NEW minute: 2016-04-26 15:31:00.000 227 hits/min
NEW minute: 2016-04-26 15:32:00.000 231 hits/min
NEW minute: 2016-04-26 15:33:00.098 226 hits/min
NEW minute: 2016-04-26 15:34:00.004 231 hits/min
NEW minute: 2016-04-26 15:35:00.003 229 hits/min
NEW minute: 2016-04-26 15:36:00.012 225 hits/min
NEW minute: 2016-04-26 15:37:00.005 231 hits/min
NEW minute: 2016-04-26 15:38:00.003 229 hits/min
NEW minute: 2016-04-26 15:39:00.008 229 hits/min
NEW minute: 2016-04-26 15:40:00.057 226 hits/min
NEW minute: 2016-04-26 15:41:00.023 225 hits/min
NEW minute: 2016-04-26 15:42:00.023 234 hits/min
NEW minute: 2016-04-26 15:43:00.016 230 hits/min
And then I parse the filebeat local output:
ubuntu@vueh-0:~$ cat /tmp/filebeat/filebeat | ~/tracethroughput.pl | grep hits
NEW minute: 2016-04-26 15:11:41.980 0 hits/min
NEW minute: 2016-04-26 15:12:00.007 18 hits/min
NEW minute: 2016-04-26 15:13:00.034 107 hits/min
NEW minute: 2016-04-26 15:14:00.139 225 hits/min
NEW minute: 2016-04-26 15:15:00.167 232 hits/min
At least the first 4 minutes seem to be correct.
Checking the log file written by Logstash also looks correct for the first 4 minutes:
NEW minute: 2016:04:26 15-11-41.980 10 hits/min
NEW minute: 2016:04:26 15-12-00.007 18 hits/min
NEW minute: 2016:04:26 15-13-00.034 107 hits/min
NEW minute: 2016:04:26 15-14-00.139 225 hits/min
NEW minute: 2016:04:26 15-15-00.167 232 hits/min
I have moved Elasticsearch and Kibana to another server to reduce the load on the Logstash server.
Just noticed that the latest test was running with a grok and date filter in Logstash.
That might explain why Kibana only has one minute with events, and that minute has a count (82) which does not match any minute of the source.
I will repeat the test without the grok and date filter.
BRB
Mathias
PS: The time is now 15:52 and the Filebeat output has only reached 15:16:
NEW minute: 2016-04-26 15:11:41.980 0 hits/min
NEW minute: 2016-04-26 15:12:00.007 18 hits/min
NEW minute: 2016-04-26 15:13:00.034 107 hits/min
NEW minute: 2016-04-26 15:14:00.139 225 hits/min
NEW minute: 2016-04-26 15:15:00.167 232 hits/min
NEW minute: 2016-04-26 15:16:00.057 226 hits/min
I am really scratching my head now.
Why do I see results in Kibana for all minutes, but not in the Filebeat output or the Logstash output?
Unfortunately, the number of hits per minute in Kibana is not correct.
I reinstalled Logstash on a machine with 2 CPUs instead of 1 and with 4 GB of memory instead of 2 GB.
It looked better. Less lagging.
I realized I needed to use the grok and date filters to be able to match counts per minute.
When the grok filter is enabled, things become slow and the lag increases.
Anyway, Kibana is now showing the same counts as when I parse the source file. Finally!
I will try again when I have a much better machine to test on. Hopefully there will be no delays then.
Thanks for all the updates. So as you found out, it seems like LS cannot process all the log lines fast enough. Filebeat in this case slows down, but it will still send the events eventually once it has caught up. Glad you found a solution.
About tail_files: this only works if there is no registry file with a stored state yet. I assume in your case you started Filebeat without tail_files first, and it recorded a state. So when you enable the option later and Filebeat finds that state, it uses the state instead of tailing. To prevent this, you have to remove the registry file.
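Roughly, the interaction looks like this; the registry path shown is only an example, as the actual location depends on how Filebeat was installed (see the registry_file setting in your filebeat.yml):

filebeat:
  # Filebeat persists read offsets here; an existing entry for a file
  # takes precedence over tail_files on the next start
  registry_file: /var/lib/filebeat/registry
  prospectors:
    -
      input_type: log
      paths:
        - /var/log/vueh/stdout.log
      # only honoured for files that have no stored state yet
      tail_files: true

Deleting that registry file before restarting lets tail_files take effect again.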