Filebeat does not exit on EOF (input_type=stdin)


(Arnau Sanchez) #1

I want to send logs to logstash from the command line, so I thought using filebeat with input_type=stdin should work. Using ArchLinux, filebeat-5.1.1-1. My filebeat.yml:

filebeat.prospectors:
  - input_type: stdin

output:
  logstash:
    hosts: ["localhost:5044"]

I can send a message using that prospector, and the message is indeed received by logstash, but filebeat does not terminate when the pipe is closed, so I cannot use it in an automated script. Example:

$ echo "message" | filebeat -c filebeat.yml -e -v
2016/12/20 09:42:49.649495 beat.go:267: INFO Home path: [/tmp] Config path: [/tmp] Data path: [/tmp/data] Logs path: [/tmp/logs]
2016/12/20 09:42:49.649539 beat.go:177: INFO Setup Beat: filebeat; Version: 5.1.1
2016/12/20 09:42:49.649600 logstash.go:90: INFO Max Retries set to: 3
2016/12/20 09:42:49.649657 outputs.go:106: INFO Activated logstash as output plugin.
2016/12/20 09:42:49.649732 publish.go:291: INFO Publisher name: latitude
2016/12/20 09:42:49.649876 logp.go:219: INFO Metrics logging every 30s
2016/12/20 09:42:49.649926 async.go:63: INFO Flush Interval set to: 1s
2016/12/20 09:42:49.650014 async.go:64: INFO Max Bulk Size set to: 2048
2016/12/20 09:42:49.650196 beat.go:207: INFO filebeat start running.
2016/12/20 09:42:49.650225 registrar.go:68: INFO No registry file found under: /tmp/data/registry. Creating a new registry file.
2016/12/20 09:42:49.650350 registrar.go:106: INFO Loading registrar data from /tmp/data/registry
2016/12/20 09:42:49.650376 registrar.go:131: INFO States Loaded from registrar: 0
2016/12/20 09:42:49.650396 crawler.go:34: INFO Loading Prospectors: 1
2016/12/20 09:42:49.650543 crawler.go:46: INFO Loading Prospectors completed. Number of prospectors: 1
2016/12/20 09:42:49.650558 crawler.go:61: INFO All prospectors are initialised and running with 0 states to persist
2016/12/20 09:42:49.650595 registrar.go:230: INFO Starting Registrar
2016/12/20 09:42:49.650619 sync.go:41: INFO Start sending events to output
2016/12/20 09:42:49.650650 spooler.go:63: INFO Starting spooler: spool_size: 2048; idle_timeout: 5s
2016/12/20 09:42:49.650664 prospecto`Preformatted text`r.go:111: INFO Starting prospector of type: stdin
2016/12/20 09:42:49.650791 log.go:84: INFO Harvester started for file: -
2016/12/20 09:42:49.650865 log.go:107: INFO End of file reached: -. Closing because close_eof is enabled.
2016/12/20 09:42:54.660838 state.go:112: ERR State for  should have been dropped, but couldn't as state is not finished.
[... process continues to run... ]

Am I doing it wrong? Thanks.


(ruflin) #2

Filebeat is not designed to stop when the pipe closes. I'm thinking if we could do something here combined with the -once command line option. I actually never tried that.


(Arnau Sanchez) #3

-once
Run filebeat only once until all harvesters reach EOF

I missed that, thanks ruflin, it works!

If there is a simpler way to send messages to logstash from the command line, I'd be glad to know it. For now, this covers my needs.


(ruflin) #4

Thanks for testing. Good to know it also works for stdin.

What do you mean by simpler? :slight_smile:


(Arnau Sanchez) #5

Update: When using -once, it sometimes sends the data, sometimes it exits immediately, and the data is not sent. If I run the command without a pipe (so the data is to be read from the terminal), it does not wait for EOF, it exits immediately.


(Steffen Siering) #6

is shutdown_timeout set in your config file? You want to disable shutdown_timeout to wait for the publisher pipeline to finish pushing events on shutdown (after file has been read).


(Arnau Sanchez) #7

is shutdown_timeout set in your config file?

It wasn't, but the behaviour is the same once I've added it.

filebeat:
  shutdown_timeout: 2s
  prospectors:
    - input_type: stdin

output:
  logstash:
    hosts: ["localhost:5044"]

Output of the commands, one when it works, the other when it doesn't.

http://download.zaudera.com/public/filebeat-out-notsent.log
http://download.zaudera.com/public/filebeat-out-sent.log


(Steffen Siering) #8

well, shutdown_timeout should be set to 0.

Checking filebeat-out-notsent.log and filebeat-out-send.log I can see the shutdown timer being started:

filebeat.go:155: INFO Shutdown output timer started. Waiting for max 2s.
signalwait.go:76: INFO Continue shutdown: All enqueued events being published.

The shutdown timer will kill filebeat even if events have not been published yet. The timer is started after the complete file has been read in memory. That is, after reading last 4k lines from input file, filebeat/logstash has 2 seconds to index the events.

Setting shutdown_timeout: 0 should disable the timer.

Alternatively to filebeat you can e.g. use netcat (nc command) to push your file if you configure the TCP input in logstash.


(Arnau Sanchez) #9

shutdown_timeout should be set to 0.

Ah, sorry. Tried with 0, same behaviour. It works half of the times, give or take.

use netcat (nc command) to push your file if you configure the TCP input in logstash.

Indeed. I can make it work with nc/plugin-tcp and curl/plugin-http, but there is a particular logstash server I cannot modify, and it has only beats as input.


(Steffen Siering) #10

Can you share a a debug log with shutdown_timeout: 0 ? This definitely sounds like a bug :frowning:


(Arnau Sanchez) #11

Sure:

$ echo "hello" | filebeat -c filebeat.yml -e -v -once

http://download.zaudera.com/public/filebeat-shutdown-timeout-zero.log


(Steffen Siering) #12

The log states the message has been send:

2016/12/27 18:56:14.786047 signalwait.go:76: INFO Continue shutdown: All enqueued events being published.

Now it would be interesting to have a look at the logstash debug output to check the message actually being send to logstash. Logstash does ACK events once received and put into it's internal (in-memory) queue. But if logstash is down or a filter drops the message, the event might silently be dropped.

I guess you don't have access to the logstash machine?

It would be interesting to see a pcap:

   $ tcpdump -w trace.pcap -i <network interface> tcp port 5044

If we have a trace with single incident of an event not being published, but filebeat being shutdown we can check on network level if event has been correctly published to logstash or not.


(Arnau Sanchez) #13

The log states the message has been send

When some data is actually sent, the process takes ~ 5 seconds, here is the log:

http://download.zaudera.com/public/filebeat-shutdown-timeout-zero-sent.log

I guess you don't have access to the logstash machine?

No, but I am testing against a local logstash. But I don't think the logstash server is relevant, when data is not sent, it does not even connect. Actually, I don't have no special setup, maybe you can check if the problem is reproducible.

#filebeat.yml
filebeat:
  shutdown_timeout: 0s
  prospectors:
    - input_type: stdin

output:
  logstash:
    hosts: ["127.0.0.1:5044"]

Command:

$ echo "hello" | filebeat -c filebeat.yml -e -v -once

(Steffen Siering) #14

Thanks.

I found a race-condition due to incorrect shutdown signaling in crawler/prospector/harvester shutdown. That is, shutdown timeout is working correctly once the events enter the pipeline, but shutdown is not correctly waiting for the harvester (stdin reader) being finished and simple continues with shutdown. This results in a race condition between the shutdown routine and the still active reader trying to publish an event. One time I even found the harvester being started after crawler/prospector shutdown. Unfortunately a fix for this is not that easy.


(ruflin) #15

@steffens As far as I understand, this only concerns stdin, not log prospectors / harvesters? I kind of worried about something like this as when we implemented -once we focused on the log prospectors to shutdown properly and did not spend time with stdin.


(Steffen Siering) #16

@ruflin testing with actual log-file -once flag seems to wait for file being published correctly.


(ruflin) #17

@steffens @tokland We should probably open a github issue to track this?


(Steffen Siering) #18

I created this github issue


(system) #19

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.