It doesn't seem that Filebeat understands, however, that logstash.service.consul points to multiple Logstash hosts. Running filebeat -httpprof 127.0.0.1:6060 and querying http://127.0.0.1:6060/debug/vars, I get a measly ~150 events/second (using the expvar_rates.py script). Making the following change to the configuration file, I am able to process ~500 events/second (which still isn't great):
Which Filebeat and Logstash versions do you have installed? With the most recent 5.0 alpha versions of Filebeat and Logstash, you can try enabling network pipelining mode to reduce the impact of network latency.
When Beats connect to Logstash, they query for all known IPs and choose one of them at random.
For better load-balancing support you can increase output.logstash.worker or configure all known hosts (or some hosts multiple times). Every host configured gets a total of output.logstash.worker workers assigned. That is, your first config will get you 1 output worker and the second with all hosts configured gets you 3 output workers.
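To make the worker arithmetic concrete, here is a small Python sketch. It is not Filebeat code; the function name, the IP addresses, and port 5044 are made up for illustration. It only restates the rule above: every configured host entry gets output.logstash.worker workers.

```python
def total_output_workers(hosts, worker=1):
    """Return the number of output workers for a list of configured hosts.

    Illustrative only: each configured host entry gets `worker` workers,
    so the total is len(hosts) * worker.
    """
    return len(hosts) * worker


# Hypothetical host lists; the addresses and port are assumptions.
single_name = ["logstash.service.consul:5044"]
all_hosts = ["10.0.0.1:5044", "10.0.0.2:5044", "10.0.0.3:5044"]

print(total_output_workers(single_name))            # one DNS name, default worker
print(total_output_workers(all_hosts))              # all hosts listed, default worker
print(total_output_workers(single_name, worker=3))  # one DNS name, worker: 3
```

Listing a host several times has the same effect on the count as raising worker, since each entry is counted separately.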
If filebeat gets back-pressure from output, it will slow down generating events (reading files).
Without knowing any details about the hardware, Logstash and its filters, the number of Beats connecting to one Logstash instance, or the supported Elasticsearch ingestion rate (baseline performance in general), I have a hard time commenting on the number of events/second. There is currently some effort to re-implement the beats input plugin in Java, which should help a lot with performance (provided the Logstash filters or outputs are not the bottleneck). I think 3.1.0-beta1 is built on the Java code base, but I wouldn't use it in production yet.
I'm not sure that I understand your answer. Essentially, I'm wondering how/when Filebeat resolves DNS and how/if Filebeat handles DNS records that resolve to multiple IP addresses. Specifically, if I point Filebeat to a Logstash host with a DNS name of logstash.service.consul, which resolves to three IP addresses, can I expect Filebeat to load balance traffic across these three IP addresses?
if I point Filebeat to a Logstash host with a DNS name of logstash.service.consul, which resolves to three IP addresses, can I expect Filebeat to load balance traffic across these three IP addresses?
No.
I'm wondering how/when Filebeat resolves DNS
Filebeat queries for a list of all known IPs for some given host name.
how/if Filebeat handles DNS records that resolve to multiple IP addresses
Filebeat selects one of the IP addresses at random and attempts to create exactly one network connection.
If you want load balancing, either add all known IPs to the configuration or set output.logstash.worker: 3 in Filebeat.
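As a rough illustration of that behaviour (look up every IP for the name, then pick one at random for a single connection), here is a Python sketch. This is not how Filebeat is implemented; resolve_all and pick_one are hypothetical helpers, and the addresses in the usage part are invented.

```python
import random
import socket


def resolve_all(hostname, port):
    """Return every distinct IP address the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def pick_one(hostname, port, resolver=resolve_all, rng=random):
    """Resolve hostname to all known IPs and choose exactly one at random."""
    ips = resolver(hostname, port)
    if not ips:
        raise OSError("no addresses found for " + hostname)
    return rng.choice(ips)


# Usage with a fake resolver so the sketch runs without a Consul DNS server.
def fake_resolver(hostname, port):
    return ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # assumed addresses


chosen = pick_one("logstash.service.consul", 5044, resolver=fake_resolver)
print("single connection goes to", chosen)
```

With only one connection, all events go to whichever address was chosen, which is why a name with three records does not by itself spread traffic.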
So if I set output.logstash.worker: 3 then Filebeat will randomly resolve logstash.service.consul to an IP address three times? Do I also need to enable the loadbalance setting?