Filebeat service loses track of files (restart required)

Hi!

I'm having a somewhat similar issue to other people here in the discussions. Here's what happens:

When run manually with sudo filebeat -c /etc/filebeat/filebeat.yml, everything works fine and filebeat continuously ships new events onward (redis + logstash in my case).

However, that is not the case when it runs as a service. In that case, the service starts fine and even works for 2-3 scan cycles (I'm running at log level debug), then stops detecting new lines in the logfile, so nothing gets shipped.

At that point a restart is required so that the new log entries are picked up and shipped. After a restart it works again for a couple of scan cycles and then loses track of the logfile again. (Just to reiterate: this does not happen when run manually.)

Here's what I did to reproduce this behaviour:

  1. Dropped previous filebeat logs and restarted the service
  2. Appended some lines to a logfile, they went through
  3. Appended some more, they got sent as well
  4. On the third scan cycle filebeat stopped detecting changes
  5. Restarted filebeat (new logfile created)
  6. Filebeat picks up undetected lines from before and sends them over

I will attach the filebeat logfiles for both cases: filebeat working and then failing, and filebeat being restarted and picking up the old lines.

My setup is pretty minimal:


filebeat:
  registry_file: /var/lib/filebeat/registry
  prospectors:
    -
      paths:
        - /var/log/nginx/*.log
      encoding: utf-8
      input_type: log

output:

  console:
    pretty: false

  redis:
    host: "elastic1.logging"
    port: 6379
    password: "same_behaviour_without_redis_auth"
    index: "logstash"


logging:

  level: debug
  to_syslog: false
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat.log
    rotateeverybytes: 10485760 # = 10MB
    keepfiles: 7

Logfiles:
(sorry for gist, but txt attachments are not allowed here)

I'm guessing something is wrong with this filebeat version (filebeat-god possibly?).

Tried filebeat 5 alpha and the issue seems to be gone. Still would like to know how to fix the 'stable' version, as I'm not really into alphas on our production servers...

@Dmitry_Belyakov Thanks for sharing all the details. Just for confirmation:

  • 1.2.3 manually works
  • 1.2.3 with god does not work
  • 5.0.0-alpha4 with god works

Is the above correct?

One issue is that you are using the redis output, which is not recommended in 1.x versions. It was completely rewritten for 5.0. Could you test whether the same issue persists if you write to a file instead of redis? The logs contain some redis errors here: https://gist.github.com/dmitrybelyakov/a87ef45fd31b6c42926962cacebdd52b#file-filebeat-working_and_failing-log-1-L146

I'm still somewhat surprised that it works without god but not with it :(

Hi!

Yes, the versions are correct. I tried 1.2.3 with file output and can confirm that the described behaviour does not happen with file output.
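
For anyone wanting to repeat the file output test, the output section can be swapped for something along these lines. This is only a sketch for filebeat 1.x: the path, filename, and rotation values are placeholders I picked for illustration, not the exact ones used in this thread.

```yaml
output:
  # Temporary replacement for the redis output, used only to check
  # whether filebeat still loses track of the logfile.
  file:
    path: "/tmp/filebeat"      # placeholder directory for the output files
    filename: filebeat         # placeholder base name of the output file
    rotate_every_kb: 10000     # rotate the output file at roughly 10 MB
    number_of_files: 7         # keep at most 7 rotated files
```

With this in place the prospectors section stays unchanged, so any difference in behaviour should come from the output alone.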

So my understanding is that the redis output is to blame here: it fails badly, which messes up the registry. Right? If so, what is the suggested way to resolve this:

  • Not use redis?
  • Wait for version 5? (when is it expected by the way?)

We honestly weren't aware the redis output was not recommended for usage...

The redis output in the 1.x releases is not recommended with filebeat because it does not handle errors correctly. For version 5.0 (you can test with alpha4), the redis output has been rewritten to support proper error handling and retries, load balancing, SSL, and SOCKS5 proxies.
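
As a rough sketch of what the rewritten redis output section might look like in 5.0: the option names below (hosts, key, loadbalance) are assumptions based on the 5.0 documentation and may differ in alpha4, so please check them against the docs for the exact build you test. The host and key values are simply carried over from the config posted above.

```yaml
output:
  redis:
    # Assumed 5.0-style options; verify against the docs for your build.
    hosts: ["elastic1.logging:6379"]   # host:port list instead of separate host/port
    password: "same_behaviour_without_redis_auth"
    key: "logstash"                    # redis list/channel name (assumed replacement for the 1.x index setting)
    loadbalance: true                  # only has an effect when several hosts are listed
```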

Hey, Steffen!

Yeah, I figured this out at this point. But does this mean the same will be true for all the beats? Or is it filebeat-specific?

It's all the beats, but different beats have different requirements. Only filebeat and winlogbeat require infinite retry on failure; the other beats allow for data loss. All outputs other than redis retry up to 3 times before dropping the event. The retry limit is configurable (setting it to -1 means infinite retry).
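
To illustrate the configurable retry limit, a non-redis output section could look roughly like the sketch below. The option name max_retries is what the libbeat output docs use as far as I know, so treat it as an assumption to verify for your version; the logstash host and port are hypothetical placeholders, not something from this thread.

```yaml
output:
  logstash:
    hosts: ["elastic1.logging:5044"]   # hypothetical host:port, adjust to your setup
    # Number of publish retries before an event is dropped.
    # The default is 3; a negative value means retry indefinitely.
    max_retries: -1
```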

Hey guys,

Thanks for your help. All sorted and working fine now.

Cheers!
