Filebeat still scanning files, but not finding new log entries

Hello! I'm using filebeat-1.2.3-1.x86_64 on a CentOS 6.5 host. After I start filebeat through the init script, it runs for a short period (what seems like the initial batch) and then, although it is still running (I can see this because the log level is set to debug), it stops finding new log entries.

I don't seem to have the same problem when running it in debug mode in the foreground:
filebeat -c /etc/filebeat/filebeat.yml -e -v -d '*'

filebeat:
  prospectors:
    -
      paths:
        - /var/log/messages
        - /var/log/yum.log
        - /var/log/secure
        - /var/log/puppet-agent-err.log
        - /var/log/puppet-agent.log
      type: log
      input_type: log
      document_type: syslog
      scan_frequency: 10s
      tail_files: true
      partial_line_waiting: 1s
    -
      paths:
        - /var/log/nova/.log
      type: nova
      document_type: nova
      tail_files: true
      scan_frequency: 10s
      ignore_older: 24h
      exclude_files: [".gz$"]
      multiline:
        pattern: "2[0-9]{3}-[0-9]{2}-[0-9]{2}[[:space:]][0-9]{2}:[0-9]{2}:[0-9.]+[[:space:]][0-9]+[:space:]([[:space:]][a-z._])([[:space:]]+$|[[:space:]]{5}.|[[:space:]]+[a-zA-Z:].)"
        negate: false
        match: after
      multiline:
        pattern: "2[0-9]{3}-[0-9]{2}-[0-9]{2}[[:space:]][0-9]{2}:[0-9]{2}:[0-9.]+[[:space:]][0-9]+[:space:]([[:space:]][a-z._])([[:space:]]{3,}?)"
        negate: false
        match: after
    -
      paths:
        - /var/log/libvirt/
          /.log
      type: libvirt
      tail_files: true
      scan_frequency: 120s
      ignore_older: 24h
      exclude_files: [".gz$"]
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: [
      "10.0.142.10:5044",
      "10.0.142.11:5044",
    ]
  file:
    path: "/tmp/filebeat"
    filename: filebeat
  console:
    pretty: true
shipper:
logging:
  to_syslog: false
  to_files: true
  files:
    path: /var/log/
    name: mybeat.log
    rotateeverybytes: 10485760 # = 10MB
    keepfiles: 7
  selectors: ["
    " ]
  level: debug

Any ideas how to troubleshoot this problem?

I should also mention that I didn't see this issue on CentOS 7, which is what I did my POC on. The problem is that I now also need to collect logs from CentOS 6 hosts.

What do you mean by "process is killed by the system for not having activity in the term"? Why exactly is this happening?

We have a timeout on the shell - if I leave it overnight, my login shell (and any processes that are running in it) will get killed.

Filebeat has been running for hours, though, on multiple nodes in the foreground:
filebeat -c /etc/filebeat/filebeat.yml -e -v -d '*'

I doubt that has anything to do with it. The filebeat process started via init stops processing files immediately after startup. It sends off a batch of logs to logstash, and then just stops seeing new log entries. It's still alive, it just isn't processing new logs. It says the file didn't change, even though it has. If I log out and log back in while filebeat is running in the foreground, it catches the change to /var/log/secure, but if it is running in the background (after that initial batch) it won't pick it up.
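One way to confirm that a file really is changing while filebeat reports it as unchanged is to sample its size twice. The following is only a minimal sketch: the function name file_grew is made up for illustration, and it assumes GNU stat (as shipped on CentOS).

```shell
#!/bin/sh
# file_grew is a hypothetical helper, not part of filebeat.
# It reports whether PATH grew during a short observation window.
# Usage: file_grew PATH [SECONDS]   (default window: 10s, matching scan_frequency)
file_grew() {
    path=$1
    wait_s=${2:-10}
    # stat -c %s prints the file size in bytes (GNU coreutils).
    before=$(stat -c %s "$path") || return 2
    sleep "$wait_s"
    after=$(stat -c %s "$path") || return 2
    echo "$path: $before -> $after bytes"
    # Exit status 0 only if the file is larger than it was at the start.
    [ "$after" -gt "$before" ]
}
```

For example, running `file_grew /var/log/secure 10` and logging in from a second terminal during the window should print a larger second size and return 0 if the file is being appended to.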

Why don't you run it as a service then?

I want to run it as a service. Maybe I'm missing the point of your comment.

I'm not sure why it stops reading files if I just start it through the init script.
/etc/init.d/filebeat start

If I run it in the background it will continue to run to infinity (and beyond):
nohup filebeat -c /etc/filebeat/filebeat.yml -e -v -d '*' &

But that isn't really ideal.

Is anyone else having this problem with filebeat on CentOS 6?

I have 40 nodes up now (running filebeat) in my OpenStack cluster - and I'm deploying filebeat using puppet, so all nodes have the same configuration. These are all compute (hypervisor) nodes, and have generally the same configuration, and they all have the same software packages installed, etc.

Only half of the nodes continue to find and forward logs to my logstash server. This is better than I originally thought, because at least some of them are forwarding logs without my having to run
nohup filebeat -c /etc/filebeat/filebeat.yml -e -v -d '*' &
on them. I originally thought that none of them would forward when started with /etc/init.d/filebeat start (started by puppet). So some do, some don't.

Really - no-one else is seeing this?

@Todd_Ruch Are the servers which get "stuck" not forwarding any logs or get stuck after some time? Perhaps you are hitting this issue here? https://github.com/elastic/beats/issues/1974 Could you try if it still happens with the most recent builds? https://beats-nightlies.s3.amazonaws.com/index.html?prefix=filebeat/

@ruflin: Thank you for the response.

They start sending when I do an /etc/init.d/filebeat restart, but immediately stop. I can see the initial 4 or 5 logs in /tmp/filebeat/filebeat (file output) after I restart, but then it goes radio silent (I'm also searching in Elasticsearch/Kibana, and don't see any further logs from said host).

I tried the most recent build as you suggested and unfortunately this issue is still occurring.

I'm still having this issue. I have verified that the env variables are the same whether I run it via the init script or directly via the binary. The only difference I can see here is the use of filebeat-god.
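For anyone wanting to repeat the environment comparison, here is a rough sketch of one way to do it on Linux by reading /proc/PID/environ. The function names proc_env and diff_proc_env are hypothetical, and reading another user's process environment generally requires root.

```shell
#!/bin/sh
# proc_env and diff_proc_env are hypothetical helpers for illustration.

# Print the environment a process was started with, one VAR=value per line, sorted.
# /proc/PID/environ stores the entries separated by NUL bytes.
proc_env() {
    tr '\0' '\n' < "/proc/$1/environ" | sort
}

# Show the variables that differ between two running processes.
# Returns 0 when the environments are identical (same convention as diff).
diff_proc_env() {
    a=$(mktemp)
    b=$(mktemp)
    proc_env "$1" > "$a"
    proc_env "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}
```

Usage would be something like `diff_proc_env PID_FROM_INIT PID_FROM_SHELL`, with the two PIDs taken from `pidof filebeat` on a stuck node and on a node started by hand.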

I have put the output of my strace runs in pastebin.

Filebeat not working - run from init script:
http://pastebin.com/11nJ9kh3

Filebeat working - run from binary:
http://pastebin.com/11nJ9kh3

I'm going to attempt to look at the filebeat-god code - but if anyone speaks Go, and can help with this, it would be very much appreciated.

@tudor Perhaps you have some more ideas here?

I made some changes to the init script, removing filebeat-god from the process, and just starting /usr/bin/filebeat, and this seems to have resolved the issue. Now I still have the convenience of the init script, and I can restart from Puppet after a config change, and everything works - woohoo!

I'm sure there must be a reason not to do this - I'll continue to read up on this.

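start() {
    echo -n $"Starting filebeat: "
    # "test" here is presumably the config-check function defined elsewhere in the init script.
    test
    if [ $? -ne 0 ]; then
        echo "Bad config dude!"
        exit 1
    fi
    # Refuse to start a second copy if a pidfile exists and the binary is running.
    if [ -f /var/run/filebeat.pid ] && pidof "$agent" > /dev/null; then
        echo "Process already running"
        exit 1
    fi
    # Start filebeat directly instead of going through filebeat-god.
    nohup $agent $args > /dev/null 2>&1 &
    # Capture the status before any other command overwrites $?.
    RETVAL=$?
    PID=$!
    echo $PID > $pidfile
    #daemon $daemonopts $agent $args > /dev/null 2>&1 &
    echo
    return $RETVAL
}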
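Since this start() writes the pidfile by hand, a stale pidfile is possible if filebeat dies later. As a hedged sketch only (pid_alive is a made-up name, not part of the packaged init script), a check like this could verify that the PID recorded in the pidfile still belongs to a live process:

```shell
#!/bin/sh
# pid_alive is a hypothetical helper, not part of the stock filebeat init script.
# Returns 0 if PIDFILE exists and the PID it contains is a running process.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

It could be called as `pid_alive /var/run/filebeat.pid` from a status() function or from a Puppet exec guard.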
