The events are sent to Elasticsearch, and the latest event overrides the previous one with the same ID, which is exactly what I need.
My problem is this:
First log entry is sent to ES
File rotation happens -> my.log is renamed to my.log.1
Second log entry is written to new my.log
Filebeat correctly harvests both files, but in this case a few lines of my.log are sent to ES before my.log.1 has been harvested to the end.
my.log.1
{id: "1", status: "Running"}
my.log
{id: "1", status: "Finished"}
Here is the config (Filebeat 5.0):
filebeat.prospectors:
# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.
- input_type: log
  paths:
    - /var/log/smec/*.log*
  encoding: utf-8
  document_type: TaskLogEntry
  scan_frequency: 1s
What would be the correct way to read out the old (renamed) log file before start reading the new one?
It's too bad that this is not somehow possible in Filebeat. It can't be fixed in Kibana either, because the two log lines have the same ID, so the latter one overrides the former.
In my case, the logically later event is sometimes sent to ES before the logically earlier one (when a file rotation happens in between).
That means I have no way of solving this problem with the current Filebeat.
Correct. I have a workflow, and a step in the workflow creates different events. In my simple case it starts with RUNNING and ends with FINISHED. Only the current, latest state should be displayed in ES, so we use the same document ID to override any older documents in ES.
But when these two events (RUNNING and FINISHED) are written to the log file AND a file rotation happens exactly between the two, there is a chance (and sadly this happens quite often) that the new log file is harvested before the old, renamed one has been harvested to EOF.
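For context, the override-by-ID setup described above is typically done in the Logstash elasticsearch output rather than in Filebeat itself. A minimal sketch (the index name "tasks" and the %{id} field reference are assumptions based on the sample log lines, not taken from the actual setup):

output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "tasks"
    # Same document ID for every event of a workflow step,
    # so a later event overwrites the earlier document in ES.
    document_id => "%{id}"
  }
}

With plain indexing like this, whichever event arrives last wins, which is exactly why out-of-order delivery during rotation causes the problem described.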
If you have a timestamp associated with your event in the log (although that is not the case in the example you provided), you could send the data through Logstash, as I believe it supports scripted updates. That would allow you to check the timestamp and only update the document if the incoming event is newer or no document currently exists for that ID.
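A rough sketch of such a scripted update with the Logstash elasticsearch output and a Painless script. This is an assumption-heavy illustration, not a tested configuration: the field names (ts, status), the index name, and the way the event is exposed to the script (params.event) vary by plugin version and by the script_var_name setting, so check the logstash-output-elasticsearch documentation for your version:

output {
  elasticsearch {
    hosts           => ["localhost:9200"]
    index           => "tasks"
    document_id     => "%{id}"
    action          => "update"
    # Run the script even when the document does not exist yet.
    scripted_upsert => true
    script_lang     => "painless"
    script_type     => "inline"
    # Only apply the event if it is at least as new as the stored one;
    # otherwise make the update a no-op.
    script          => "
      if (ctx._source.ts == null || params.event.get('ts') >= ctx._source.ts) {
        ctx._source.status = params.event.get('status');
        ctx._source.ts     = params.event.get('ts');
      } else {
        ctx.op = 'none';
      }
    "
  }
}

With this approach a late-arriving RUNNING event (delivered after FINISHED because of the rotation race) would be dropped by the timestamp check instead of overwriting the newer state.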
Hi, in fact I have a timestamp available in each log line. @ruflin: How would sorting solve the problem? I only want to see the latest version of a "log line" in ES, so I use the same document ID to override older ones.
@Christian_Dahlqvist: sounds very nice. I will try this out and let you know if it works for me (I guess it will).