How to use the stdin prospector

Hey guys,

I've been setting up a filebeat 6.2 docker image with a custom filebeat.yml that looks like the one below:

#=========================== Filebeat prospectors =============================

filebeat.prospectors:

- type: stdin

  # Change to true to enable this prospector configuration.
  enabled: true

#----------------------------- Logstash output --------------------------------
output.logstash:
  hosts: ["${LOGSTASH_HOST}"]

logging.level: info
logging.to_files: true
logging.metrics.enabled: false

and I'm trying to verify that this works as expected. How can I send a message to filebeat so that I can see that the message ends up in Logstash?

I noticed the filebeat container starts correctly:

docker logs -f tracing_filebeat_1       
2018-02-08T09:51:50.164Z        INFO    instance/beat.go:468    Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs]
2018-02-08T09:51:50.165Z        INFO    instance/beat.go:475    Beat UUID: 3c9d2e1d-b36b-4402-93f0-97251d7834b0
2018-02-08T09:51:50.165Z        INFO    instance/beat.go:213    Setup Beat: filebeat; Version: 6.2.0
2018-02-08T09:51:50.165Z        INFO    pipeline/module.go:76   Beat name: 616ab57f5932
2018-02-08T09:51:50.166Z        INFO    instance/beat.go:301    filebeat start running.
2018-02-08T09:51:50.166Z        INFO    registrar/registrar.go:71       No registry file found under: /usr/share/filebeat/data/registry. Creating a new registry file.
2018-02-08T09:51:50.172Z        INFO    registrar/registrar.go:108      Loading registrar data from /usr/share/filebeat/data/registry
2018-02-08T09:51:50.172Z        INFO    registrar/registrar.go:119      States Loaded from registrar: 0
2018-02-08T09:51:50.172Z        WARN    beater/filebeat.go:261  Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
2018-02-08T09:51:50.172Z        INFO    crawler/crawler.go:48   Loading Prospectors: 1
2018-02-08T09:51:50.172Z        INFO    crawler/crawler.go:82   Loading and starting Prospectors completed. Enabled prospectors: 1
2018-02-08T09:51:50.173Z        INFO    log/harvester.go:216    Harvester started for file: -
2018-02-08T09:51:50.173Z        INFO    log/harvester.go:239    End of file reached: -. Closing because close_eof is enabled.

I've tried to identify the filebeat stdin so that I could send a basic echo test to it, but there is no stdin for filebeat:

bash-4.2$ ps ax | grep filebeat
    1 ?        Ssl    0:00 filebeat -e
   31 ?        S+     0:00 grep filebeat

bash-4.2$ echo test > /proc/1/fd/0
bash: /proc/1/fd/0: No such file or directory

The question, then, is: which stdin is the filebeat process using?

Thanks,
Cristi

Since you run filebeat via docker, you have to tell docker to keep stdin open, normally via docker run -i -t ....
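
For instance, something along these lines should let you pipe a test message into the container's stdin (the image name and the LOGSTASH_HOST value are only placeholders for your setup; the image is assumed to already run filebeat -e):

# pipe one line into the stdin prospector
echo 'hello from stdin' | docker run -i --rm \
  -e LOGSTASH_HOST=logstash:5044 \
  your-custom-filebeat-image

In docker-compose the rough equivalent is stdin_open: true (plus tty: true) on the filebeat service, but you still need to attach something to that stdin, e.g. with docker attach or docker-compose run.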

Why do you want to test with stdin?

For testing you can also have a test log file with one line of content available and check whether filebeat publishes it.
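
For example (paths are only illustrative), a minimal log prospector plus a one-line append would do:

filebeat.prospectors:
- type: log
  enabled: true
  paths:
    - /tmp/test.log

Then, from inside the container (or on a mounted host path):

echo "hello filebeat" >> /tmp/test.log

If that event shows up in Logstash, the output side of your config is fine.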

Anything in particular why you want to have this kind of testing?

Thanks for the info. I want to use stdin for testing because I've fired up the ELK + filebeat containers using docker compose and wanted to see something in Kibana that is not static in a file, but I couldn't manage it. Do you have a suggestion on how to use stdin in this case?

The purpose would be for filebeat to forward log entries created by another docker container, and at the moment that container writes its logs to stdout. So I was wondering how to connect the stdout of the other docker container to the stdin used by the filebeat container.

In the meantime, I've found out about the docker prospector and that by mapping the docker host's logs folder into the filebeat container at the path /usr/share/filebeat/prospectors.d you can achieve the goal above, but from the documentation I also found out that the docker prospector is still experimental.

I don't think you can pipe stdout/stderr from one container into another container's stdin.
With docker, one normally uses stdout/stderr for logging. The logging driver in docker is responsible for handling the log lines. There are many drivers and strategies for logging, each with its own pros, cons, and gotchas. For docker logging + filebeat we encourage the use of the json-file log driver (the default, but please configure log rotation) and having filebeat collect all docker container logs (either run filebeat as a container or on the host).
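
For example, rotation for the json-file driver can be set per container (the values here are arbitrary), or globally in the docker daemon configuration:

docker run --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=5 \
  your-app-image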

The docker prospector pre-parses the json logs. As it's a very new feature, it is marked as experimental. But your use-case is exactly why it was introduced.
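
If you want to give it a try, a minimal sketch would look like this ('*' collects logs from every container; the mount path below is just the json-file driver's default location):

filebeat.prospectors:
- type: docker
  containers.ids:
    - '*'

and run the filebeat container with the host's container logs mounted read-only, e.g.:

docker run -v /var/lib/docker/containers:/var/lib/docker/containers:ro ... your-filebeat-image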

The use case we have is two-fold:

  1. We want to stream the logs of several containers running on a host to Logstash using Filebeat. In this case we could use the approach you suggested: run the Filebeat container and configure it to watch all the docker logs on the host.
  2. In each container we are going to write some tracing files, and we want to use Filebeat to stream those tracing files to Logstash. For this part it is harder to use the same approach as above because we are running in a Kubernetes environment and we want to keep those tracing files (so keeping them on a hostPath volume is not an option: the host machine might crash and then we would lose all those files). So we thought we could use the Filebeat container as a sidecar (deploy it together with our main containers in the same pod) and make it use the volumes of the main containers, roughly as sketched below.
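
Something like this is what we had in mind for the sidecar (the names, image, and paths here are only placeholders, and the filebeat container would additionally need its filebeat.yml mounted, e.g. from a ConfigMap, with a log prospector pointing at /var/log/tracing):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-filebeat-sidecar
spec:
  containers:
  - name: app
    image: our-app-image                # writes tracing files under /var/log/tracing
    volumeMounts:
    - name: tracing
      mountPath: /var/log/tracing
  - name: filebeat
    image: docker.elastic.co/beats/filebeat:6.2.0
    volumeMounts:
    - name: tracing
      mountPath: /var/log/tracing
      readOnly: true
  volumes:
  - name: tracing
    persistentVolumeClaim:
      claimName: tracing-pvc            # placeholder; any shared volume works here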

But in this case, how do we get access to the logs of the main container? I think it is too much overhead to have Filebeat running both as a sidecar container and as a DaemonSet (independent container).

Ah, the pains of logging in container environments.

There are a few approaches to your problem, each with its own advantages and disadvantages. Given you have multiple log files, you might consider not mixing solutions, but picking one true way:

  • Write all logs to a volume (no logs on stdout/stderr). Advantage: logs do not pass through the docker daemon (passing through the daemon carries the risk of running into an OOM in the daemon). Disadvantage: one must provide some log rotation so as not to run out of disk space.

    • a) pod local volume + sidecar shipper like filebeat (disadvantage: one filebeat running per container)
    • b) global volume or host mount with one global filebeat (disadvantage: no good control over disk usage per pod)
  • Always write all logs to stdout/stderr. Disadvantage: all logs pass through the docker daemon, and this doesn't work well with software that writes multiple logs. Advantage: one global DaemonSet for shipping logs. With this approach, applications with multiple logs can be integrated by using a shared volume and one or two sidecar containers: the first sidecar rotates the logs (if the application's log writer does not support rotation) and the second sidecar prints the log to stdout. The 'streaming sidecar' can be as simple as tail -f -n+1 <path/to/log>; a sketch follows below.
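
A rough sketch of such a streaming sidecar in a pod spec (the names and paths are made up; the point is the shared volume plus the tail container, whose output then flows through the normal docker/k8s log path):

apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-streamer
spec:
  containers:
  - name: app
    image: your-app-image               # writes /var/log/app/app.log
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-streamer
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -f /var/log/app/app.log']
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}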

See k8s Logging Architecture docs for sample solutions.
