Log Rotation with two events depending on each other

Hi everybody,

I have a problem with file rotation and log messages that depend on each other.
Here is my case:

Usually events are written to the logfile in the correct order:

my.log

{id: "1", status: "Running"}
{id: "1", status: "Finished"}

The events are sent to Elasticsearch, and the latest event overrides the previous one with the same ID, which is exactly what I need.

So my problem is that:

  1. First log entry is sent to ES
  2. File rotation happens -> my.log is renamed to my.log.1
  3. Second log entry is written to new my.log
  4. Filebeat correctly harvests both files, but in that case some lines of my.log are sent to ES before my.log.1 has been harvested to the end.

my.log.1

{id: "1", status: "Running"}

my.log

{id: "1", status: "Finished"}

Here is the config (Filebeat 5.0):

filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- input_type: log
  paths:
    - /var/log/smec/*.log*
  encoding: utf-8
  document_type: TaskLogEntry
  scan_frequency: 1s

What would be the correct way to read the old (renamed) log file to the end before starting to read the new one?

Filebeat harvests log files concurrently. It has no notion of file order in the presence of log rotation.

Yes, I'm aware of that. But is there a way to configure Filebeat so that my problem is solved?

In Filebeat directly? No. In Kibana some ordering is implicit due to the '@timestamp' field.

That's too bad that this is not somehow possible in Filebeat. It is also not possible in Kibana, because the two log lines have the same ID, so the latter one overrides the former.
In my case, the logically later event is sometimes sent to ES before the logically earlier one (when a file rotation happens in between).
That means I have no chance of solving this problem with the current Filebeat.

Not sure I understand you correctly, but you basically want to display some current state by overwriting entries in Elasticsearch?

Correct. I have a workflow, and a step in the workflow creates different events. In my simple case it starts with RUNNING and ends with FINISHED. Only the latest state should be displayed in ES, so we use the same document ID to override any older documents in ES.

But when these two events (RUNNING and FINISHED) are written to the log file AND a file rotation happens exactly between them, there is a chance (and sadly this happens quite often) that the new log file is harvested before the old one has been harvested to EOF.
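The race above can be sketched with a tiny last-write-wins store in plain Python. This is only an illustration of the overwrite-by-document-ID behaviour described in the thread, not real Elasticsearch code:

```python
# Minimal sketch: a dict standing in for Elasticsearch, where each
# write simply overwrites the previous document with the same id.
store = {}

def index(doc):
    """Last write wins for a given document id."""
    store[doc["id"]] = doc

# Normal case: events arrive in log order, final state is correct.
index({"id": "1", "status": "Running"})
index({"id": "1", "status": "Finished"})
print(store["1"]["status"])  # Finished

# Rotation case: my.log ("Finished") is harvested before my.log.1
# ("Running") reaches EOF, so the same events arrive swapped.
index({"id": "1", "status": "Finished"})
index({"id": "1", "status": "Running"})
print(store["1"]["status"])  # Running -- the stale state wins
```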

As far as I understand your problem, you could solve this by adding your own timestamp and then sorting based on it?

If you have a timestamp associated with each event in the log (which is not the case in the example you provided), you could send the data through Logstash instead. I believe its Elasticsearch output supports scripted updates, which would allow you to compare timestamps and only update the document if the incoming event is newer or no document currently exists for that ID.
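An untested sketch of what such a timestamp-guarded update could look like in the Logstash elasticsearch output. The index name `tasks`, the fields `id` and `ts` (a numeric per-event timestamp), and the exact Painless expression are assumptions that would need to be adapted and verified against your own mapping:

```
output {
  elasticsearch {
    hosts       => ["localhost:9200"]
    index       => "tasks"
    # Route all events of one workflow step to the same document (assumes an "id" field).
    document_id => "%{id}"
    action      => "update"
    # Create the document if it does not exist yet.
    scripted_upsert => true
    script_lang => "painless"
    script_type => "inline"
    # Only accept the incoming event if it is newer than the stored one
    # ("ts" is a hypothetical numeric timestamp field in each log line).
    script => "if (ctx._source.ts == null || params.event.ts > ctx._source.ts) { ctx._source = params.event }"
  }
}
```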

Hi, in fact I do have a timestamp available in each log line.
@ruflin: How would sorting solve the problem? I only want to see the latest version of a "log line" in ES, so I use the same document ID to override older ones.

@Christian_Dahlqvist: sounds very nice. I will try it out and let you know if it works for me (I guess it will).

@guenther in case you get it working, I'd love to learn about your solution.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.