Filebeat is not detecting new log entries


(Kevin B) #1

OS: CentOS 6
kernel 2.6.32-573.12.1.el6.x86_64
Filebeat: filebeat-5.0.0-alpha2-x86_64

I am trying to run Filebeat with the Kafka output and have run into issues. When I start Filebeat, it harvests the files up to the first 2048 entries (per the default config). After that it does not detect any new log entries. When I restart Filebeat, it grabs the initial 2048 lines again, then fails to detect further changes.

Filebeat config is here: http://pastebin.com/raw/0GYMWxhJ

I have a Python script that generates log entries, and I've been actively adding data to the logfiles while Filebeat is running, but it just doesn't detect changes to the file. I've also tried using bash to append new lines (echo "testing" >> file) without success, and setting tail_files: true hasn't helped.
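For reference, the generator I'm using is roughly shaped like this (a minimal sketch; the path and line format here are placeholders, not my actual script):

```python
# Minimal sketch of a log generator for this kind of test.
# The path and line format are placeholders.
import datetime


def append_log_lines(path, count):
    """Append `count` timestamped lines to the file at `path`."""
    with open(path, "a") as f:
        for i in range(count):
            ts = datetime.datetime.now().isoformat()
            f.write("%s test entry %d\n" % (ts, i))
            f.flush()  # flush so the bytes are visible while Filebeat watches


if __name__ == "__main__":
    append_log_lines("/tmp/filebeat-test.log", 1000)
```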

Reading through previous threads and GitHub issues, it looks like this problem has come up before, but none of the solutions have worked for me.

Thank you for any help with this,
Kevin


(Michael) #2

Does /opt/registry get created? Does it contain a line for your log showing the offset?


(Kevin B) #3

Hi Michael, thank you for the quick reply

The file /opt/registry was created, and it contained entries for each logfile. I noticed the offset looked incorrect, though: each file was about 30 MB and the offset was about 29 million bytes, but nowhere near that amount of data was actually sent. I am away from that environment today, but I will send more specific details when I get home this evening.
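To compare the registry offsets against what is actually on disk, I used a quick script along these lines (a sketch, assuming the registry is JSON; the exact layout differs between Filebeat versions, a dict keyed by path in 1.x versus a list of state objects in 5.x, so this handles both shapes):

```python
# Hedged sketch: compare Filebeat registry offsets to actual file sizes.
# The registry is JSON, but the layout varies by version (dict keyed by
# path in 1.x, list of state objects in 5.x); this handles both shapes.
import json
import os


def registry_offsets(registry_path):
    """Yield (source, offset) pairs from a Filebeat registry file."""
    with open(registry_path) as f:
        data = json.load(f)
    states = data.values() if isinstance(data, dict) else data
    for state in states:
        yield state.get("source"), state.get("offset", 0)


def check_registry(registry_path):
    """Print each tracked file's registry offset next to its size on disk."""
    for source, offset in registry_offsets(registry_path):
        size = os.path.getsize(source) if os.path.exists(source) else "missing"
        print("%s offset=%s size=%s" % (source, offset, size))
```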

Thanks for looking into this,
Kevin


(Kevin B) #4

Hi Michael-

I ran through some more tests and got mixed results:

  1. delete /opt/registry

  2. gen 1000 lines of logs

  3. start filebeat
    1000 lines pushed to kafka
    /opt/registry does not exist yet

  4. stop filebeat ( ctrl-c )
    /opt/registry does not exist
    logging ends with "filebeat cleanup"

  5. start filebeat again
    same 1000 lines processed again, this is expected
    /opt/registry does not exist yet

  6. stop filebeat ( ctrl-c )
    /opt/registry.new is created, but is empty
    logging ends with "Write registry file: /opt/registry"

====== clean up =====
delete logfile
delete /opt/registry.new

  1. gen 5000 lines of logs
  2. start filebeat
    2048 flushed
    2047 flushed
    905 flushed
    no /opt/registry file created
  3. gen 5000 more at 2016-05-06T19:32:54-04:00
    nothing processed
  4. gen 5000 more at 2016-05-06T19:36:33-04:00
    nothing processed
  5. gen 5000 more to different logfile at 2016-05-06T19:38:18-04:00
    file detected, log shows harvesting, nothing processed

still no /opt/registry file

  6. stop filebeat ( ctrl-c )
    no /opt/registry file created
    STDOUT shows the following; this is different from previous iterations, where nothing showed on STDOUT:

[/opt] # ./filebeat -c ./filebeat.test.yml
^Cpanic: send on closed channel

goroutine 34 [running]:
panic(0x8cc420, 0xc82066a3d0)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
github.com/elastic/beats/filebeat/crawler.(*Prospector).Run.func1(0xc8201641c0)
/go/src/github.com/elastic/beats/filebeat/crawler/prospector.go:99 +0x139
created by github.com/elastic/beats/filebeat/crawler.(*Prospector).Run
/go/src/github.com/elastic/beats/filebeat/crawler/prospector.go:102 +0x11b
[/opt] #

I am seeing some Kafka errors every 60 seconds. Could a failure to send to Kafka cause Filebeat to skip processing that batch of 2048 lines? It reconnects every time; could this just be mismatched settings between Kafka and Filebeat regarding the lifetime of an open connection?

Example:

2016-05-06T19:40:49-04:00 WARN kafka message: client/metadata got error from broker while fetching metadata:%!(EXTRA *net.OpError=read tcp 172.31.254.130:33359->10.99.10.225:9092: i/o timeout)
2016-05-06T19:40:49-04:00 WARN Closed connection to broker 10.99.10.225:9092

2016-05-06T19:40:49-04:00 WARN kafka message: client/metadata no available broker to send metadata request to
2016-05-06T19:40:49-04:00 WARN client/brokers resurrecting 1 dead seed brokers
2016-05-06T19:40:49-04:00 WARN client/metadata retrying after 250ms... (2 attempts remaining)

2016-05-06T19:40:49-04:00 WARN client/metadata fetching metadata for all topics from broker 10.99.10.225:9092

2016-05-06T19:40:49-04:00 WARN Connected to broker at 10.99.10.225:9092 (unregistered)

2016-05-06T19:40:54-04:00 INFO Run prospector

I'm running through some additional tests with the file output to see whether this is related to the Kafka output, but we really need this working with Kafka.

Thanks again for your help,
Kevin


(Steffen Siering) #5

Which Kafka version are you running?

Any interesting logs on kafka side?

2016-05-06T19:40:49-04:00 WARN kafka message: client/metadata got error from broker while fetching metadata:%!(EXTRA *net.OpError=read tcp 172.31.254.130:33359->10.99.10.225:9092: i/o timeout)

This message is complaining about the Kafka broker not responding in time. The timeout you've configured is 60 seconds (a request going 60s without a response is huge). Is Kafka still operating correctly when this happens? Any network problems (are you running on localhost, or behind a firewall/NAT)?
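For reference, that timeout is set in the kafka output section of filebeat.yml; a sketch of the relevant settings looks something like this (option names per the 5.x reference config; the topic here is a placeholder, so check the reference config shipped with your version):

```yaml
# Sketch of the relevant kafka output settings; topic is a placeholder.
output.kafka:
  hosts: ["10.99.10.225:9092"]
  topic: "logs"      # placeholder topic name
  timeout: 30        # seconds to wait for broker responses (default 30)
```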

The max_retries option is ignored by filebeat, as filebeat requires send-at-least-once semantics: filebeat requires an ACK from Kafka in order to continue sending new log entries. There is even a chance of Kafka having received a batch of log lines while filebeat never receives the ACK, forcing filebeat to re-send those log lines.
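A toy sketch of that send-at-least-once behaviour (not filebeat's actual code; `send` is a stand-in for the Kafka client call): the batch is retried until an ACK comes back, so a lost ACK produces duplicates rather than data loss.

```python
# Toy illustration of send-at-least-once delivery: a batch is retried
# until acknowledged, so a lost ACK can mean the same batch is sent
# twice, but a batch is never dropped. Not filebeat's real code;
# `send` is a stand-in for the Kafka client call.

def publish_at_least_once(batch, send):
    """Retry send(batch) until it returns a truthy ACK.

    Returns the number of attempts made. This loops until an ACK
    arrives, which mirrors filebeat ignoring max_retries here.
    """
    attempts = 0
    while True:
        attempts += 1
        if send(batch):  # ACK received: safe to move on to the next batch
            return attempts
```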


(system) #6