Using Filebeat on files synced via Rsync

I am using Filebeat 6.3.0 to forward log files from a Rails application to Logstash. The files originate from a server that I cannot configure to use Filebeat, hence they are being transferred to the Filebeat/Logstash host via an rsync call.

The logfiles rotate on the server, so every once in a while, the existing log is rotated out to a .log.0 file.

Basically, what I am doing is this:

rsync -avh user@remote:/path/to/logs/ /path/to/logs/

I have configured Filebeat to read the log files with the following input directives:

- type: log
  enabled: true
  paths:
    - /home/user/logs/production.log*
  fields:
    type: rails_production
  exclude_lines: ['^#', 'CSRF token authenticity']

- type: log
  enabled: true
  paths:
    - /home/user/logs/lograge_production.log*
  fields:
    type: rails_lograge
  exclude_lines: ['^#']
  json.ignore_decoding_error: true
  json.keys_under_root: false
  json.add_error_key: false
  json.message_key: message

I have read this topic where it is suggested that if the files are being appended to, I should be fine. But as far as I can see, rsync only transmits the delta anyway (as --no-whole-file is the default for network transmissions).

Are my settings correct?

(The reason I am asking is that I am seeing input spikes in my timestamps, and other periods where there are no data being ingested for several hours, and this actually cannot be the case, as my data is coming in more or less constantly.)

@slhck I am not sure, but I would expect data duplication to occur when the rotation happens.

Did you look at either the Filebeat TCP input or the Logstash TCP input instead of using rsync? I think it would be easier in your case.

@slhck I've checked the rsync docs. Did you look at the --append and --append-verify options?

Rsync transmits the delta, but I think that when it merges the changes, it creates a new file on disk with a new inode.


I unfortunately cannot use a push-based TCP from the web server to the Logstash server; I have to pull the data via rsync.

I will check the append options, thanks for the pointer!

I also use rsync to transfer logs from an old server and had similar problems.

What works for me is to explicitly specify the logs that should be synced, so that only one type of log is synced per run. Also, pay attention to the rollover numbers: for me, they must be alphabetically continuous (no 10 between 1 and 2).

rsync -az --delete-after <path to the log directory>/*.log.[1-9] root@<server to sync to>:/root/copied_logs/<hostname>/<subdirs>/

P.S. I don't sync the real-time log, only the historical logs, but for me this creates only a gap of one hour.

Thanks, but that seems like a more convoluted approach.

I am trying the --append version now and it seems to do the job, but I have to wait for the first rotation to happen. Will keep you posted.

So, the --append option on its own does not work. As soon as the file gets rotated out, rsync apparently does not sync the new one anymore:

(It's 21h now and no new data is coming in.)

In fact, with --append, even some old logs stopped being updated on the target server hours ago. Only when I leave out --append is all the data transferred correctly.

It seems that the only option is to rotate frequently and only sync full files (i.e. the ones ending with a number).
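A minimal sketch of that approach, simulated locally (the directory names are invented; in practice the source would be user@remote:/path/to/logs/):

```shell
# Sketch: copy only completed (rotated) logs and skip the live file.
mkdir -p /tmp/logs_src /tmp/logs_dst
touch /tmp/logs_src/production.log       # live file, still being written to
touch /tmp/logs_src/production.log.0     # rotated, complete

# The glob matches only the numbered (finished) logs.
rsync -av /tmp/logs_src/production.log.[0-9] /tmp/logs_dst/

ls /tmp/logs_dst/   # only production.log.0 is transferred
```

Since rotated files are never modified again, Filebeat reads each one exactly once and the inode-change problem goes away, at the cost of ingestion lagging by one rotation interval.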

On my setup there was also the problem that newly created logs from rsync got the inode of a log that had been deleted, so Filebeat continued reading from the last known position in that log. See if that is also a problem for you.

I think I had a discussion with an Elastic team member on a GitHub issue about using rsync, and this should no longer happen, but I'm not sure it was actually solved.


The --append option for rsync solves the mentioned problem.

I think that my original solution works fine (i.e., not using --append), as long as Filebeat is able to process an entire log file before it gets rotated over.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.