Handling log rotation

Hi, I'm using Filebeat 5.6.2 with this config:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr
  fields:
    source_type: "framework"
  fields_under_root: true
  tail_files: false
  harvester_buffer_size: 32768

I have these files:

stdout
stdout.1
stdout.2
Up to 10 files.

If a file rotates twice before the harvester can finish, does that mean Filebeat will not process stdout.2 and will only process stdout.1?

Filebeat follows files by inode. If a file is renamed due to rotation, that is detected by Filebeat. Plus, Filebeat tries to keep files open, so it can still read deleted files. But if you're constantly writing logs faster than Filebeat can process them in the time window the logs are available, you're prone to potentially losing logs, or to having Filebeat keep all log files open until you run out of disk space.
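
If Filebeat holding deleted files open is a concern, the close_* and clean_* prospector options can bound how long handles and registry state are kept. A minimal sketch using documented 5.x options; the 5m value is purely illustrative, and note that close_timeout can itself drop the tail of a file that is still being written:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout
  # Close the file handle this long after the harvester started, even if
  # the file hasn't been read to the end (bounds disk usage, risks losing data).
  close_timeout: 5m
  # Forget a file's registry state once the file has been removed from disk.
  clean_removed: true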

Understood the possibilities of data loss, but I'm not clear on...

stdout is the original; it rolls over to stdout.1. Filebeat is reading both, correct?
Once the logs roll over to stdout.3, will it read all three or drop one of them?

Also is it better for filebeat to read 10 files of 10MB each or 1 file of 100MB?

It depends. Your glob pattern uses stdout only. That is, by default, if stdout is rotated to stdout.1, Filebeat will continue reading from the already opened file handle and start a new harvester for stdout. And so on. If Filebeat is restarted in between, it won't be able to find stdout.1, because your config doesn't ask for stdout.1 to be processed. Better to have your pattern end with stdout*. That way Filebeat can also continue reading old log files after a restart. A file's identity is not the filename but the inode. Log rotation normally renames the file using a move, which doesn't change the inode. The filename is only used to find the files one wants to publish.
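
A sketch of the adjusted prospector; only the glob patterns change from the config above:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout*
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr*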

Also is it better for filebeat to read 10 files of 10MB each or 1 file of 100MB?

I don't have a general answer for this. More files means Filebeat can process files more concurrently (a drawback might be more seeks on hard drives). Too many files (10 files are not many) can slow down scanning for files and create some additional congestion on the shared event queue. But the event queue holding the events produced by each file reader is the same no matter how many files you have. Normally the bottleneck (unless you use some weird network storage) is the outputs, not the file reading.
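
If harvester concurrency ever does become a problem, the per-prospector harvester_limit option (available in 5.x, default 0 = unlimited) caps how many files are read in parallel. A sketch; the value 10 is arbitrary:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout*
  # At most 10 open harvesters for this prospector at any one time.
  harvester_limit: 10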

Given you are using Mesos, I guess you actually have 10 × (number of services) files. In that case it might be better to reduce the total number of files to be forwarded.

Yeah, the default for DC/OS Mesos is 10 files at 10MB. Also, their docs had stdout; I'm thinking stdout* is better.

Just to be sure: by putting *, it won't double-process any file?

Files are identified by inode. That is, content won't be resent. Just make sure you don't accidentally process compressed files (e.g. stdout.8.gz).
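
The exclude_files prospector option handles that; it takes a list of regular expressions matched against the file path. A sketch, assuming rotation gzips old files with a .gz suffix:

filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout*
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr*
  # Skip anything compressed by rotation, e.g. stdout.8.gz
  exclude_files: ['\.gz$']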

Cool, thanks.