We recently started experimenting with Filebeat and find it very useful. But we got a bit stuck with scaling, config management, and some by-design limitations on outputs (https://github.com/elastic/beats/issues/1112).
Our pipelines look like this:
server -> kafka -> logstash -> elasticsearch
or
server -> kafka -> samza -> elasticsearch
For delivery and deployment we use Puppet, which pushes and installs Filebeat. For config management we have an in-house framework (it covers all our apps), which would require substantial changes to accommodate a Filebeat deployment. We built a POC and it works, but it has some limitations on the metadata needed to support multiple topics for multiple Filebeat processes on a server.
I was wondering how others scale Beats in their environments, assuming each server has multiple files that Filebeat needs to read and deliver to several different topics (let's say to only one broker for now)?
I don't fully understand the actual configuration problems you're facing or, more importantly, what exactly you want Filebeat to do.
When sending to Kafka, do you want to push to multiple topics? In that case, use output.kafka.use_type together with filebeat.prospectors.X.document_type to configure a different topic per prospector type. Support for choosing topics might be enhanced in future versions.
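A minimal sketch of what that could look like, assuming use_type makes the Kafka output use each event's type as the topic name. The file paths, type names, and broker address are made up for illustration, and the exact option layout may differ between Filebeat versions:

```yaml
# Sketch only: paths, document_type values, and the broker address are hypothetical.
filebeat.prospectors:
  - paths:
      - /var/log/myapp/access.log
    document_type: myapp_access   # event type, assumed to be reused as the Kafka topic
  - paths:
      - /var/log/myapp/error.log
    document_type: myapp_error
  - paths:
      - /var/log/myapp/audit.log
    document_type: myapp_audit

output.kafka:
  hosts: ["kafka-broker:9092"]    # single broker, as in the question
  use_type: true                  # take the topic from the event type instead of a fixed topic
```

With a layout like this, one Filebeat process would cover all three files, so there should be no need to run a separate process per topic.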
Filebeat supports environment variables for changing settings, plus a configurable config directory. The directory support lets you put multiple prospector configurations into one directory; e.g., use Puppet to drop one config file per service into the config directory, depending on machine type. After Filebeat restarts, the prospector configs are merged with the main Filebeat config.
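As a rough illustration of the directory approach, the two YAML documents below stand for two separate files. The directory path, file names, and service name are hypothetical, and the exact structure a file inside the config directory must have may vary by Filebeat version:

```yaml
# File 1: main config (e.g. filebeat.yml), managed once per machine.
filebeat.config_dir: /etc/filebeat/conf.d   # hypothetical directory

output.kafka:
  hosts: ["kafka-broker:9092"]              # hypothetical broker address
  use_type: true
---
# File 2: /etc/filebeat/conf.d/myapp.yml, dropped in by Puppet per service (hypothetical).
filebeat.prospectors:
  - paths:
      - /var/log/myapp/*.log
    document_type: myapp
```

Puppet would then only need to add or remove per-service files in that directory and restart Filebeat, leaving the main config untouched.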
Recent nightly builds also support:
loading multiple config files by passing the -c <file> option multiple times
overriding any config setting from the command line using -E <setting>=<value>
I was not sure how to push different events to different topics per prospector. For example, an app generates three different files and I would like to push each file to a different topic. From what you are saying, we need to do this:
filebeat.config_dir - would hold all our prospector configurations.