I tried both a 1-server ES setup (the default) and a 2-server setup. Each server has 2x 16-core CPUs.
When I start filebeat, it appends log lines into ES at about 4000/second.
The filebeat process consumes about 15% of a CPU core; java (Elasticsearch) consumes about 120-150% of a CPU. The disks are not overloaded.
So it looks like system resources are mostly idle, but the loading speed is low.
Is there anything I can tune to improve publishing speed until I hit some resource constraint on the server (CPU, disks)?
Please properly format logs and config files using the </> button.
Which filebeat version are you using? In 5.x the settings must be named filebeat.spool_size and filebeat.publish_async.
With the configuration you have right now load-balancing is not active.
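For your two-server setup that means listing both ES nodes in the output; a minimal sketch (hostnames are placeholders):

output.elasticsearch:
  # batches are distributed across all listed hosts
  hosts: ["es-node1:9200", "es-node2:9200"]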
I'd recommend making filebeat.spool_size a multiple of worker times bulk_max_size, so batches are split into sub-batches that can be processed concurrently. This works even if publish_async is disabled.
e.g.:
filebeat.prospectors:
- paths: [ "/data/test.log" ]
  fields_under_root: true
  fields:
    pipeline: "test-pipeline"

# spool_size = 2 * 8 * 4096 => split batch into 16 sub-batches to be sent concurrently
filebeat.spool_size: 65536

# experimental feature, let's first test without it
# filebeat.publish_async: true

output.elasticsearch:
  worker: 8
  bulk_max_size: 4096
  hosts: ["localhost:9200"]
  # if fields.pipeline is not available in the event, no pipeline will be used
  pipeline: "%{[pipeline]}"
Be careful not to overload the Elasticsearch internal pipelines (too-big batches or too many workers). In that case some events might not be indexed yet and must be retried by filebeat, which potentially slows down indexing. Check your filebeat log files.
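If you want to check for push-back on the Elasticsearch side as well, one way (a sketch against the _cat API; adjust the host) is to watch the bulk thread pool for queueing and rejections:

curl -s 'localhost:9200/_cat/thread_pool/bulk?v&h=name,active,queue,rejected'

A growing rejected count means ES is rejecting bulk requests and filebeat has to retry them.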
Do you have publish_async enabled? As throughput for beats depends very much on indexing performance in ES, you might want to play with 'worker', 'spool_size', and 'bulk_max_size' a little and see if you can push throughput somewhat higher.
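For example, one variation to try (hypothetical numbers, keeping spool_size at a multiple of worker times bulk_max_size):

# 2 * 4 * 8192 => each spooler flush is split into 8 larger bulk requests
filebeat.spool_size: 65536
output.elasticsearch:
  worker: 4
  bulk_max_size: 8192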
PS: shouldn't filebeat report an error when I use the (nonexistent) "spool_size" rather than "filebeat.spool_size", to avoid such confusion?
We're constantly improving configuration loading; see go-ucfg. Unfortunately we cannot detect typos or options placed at the wrong level yet. Related tickets: #10, #11, #6.
With publish_async enabled, filebeat prepares some batches in memory ahead of sending; that is, there is some slight latency overhead when the setting is disabled. I'd keep it disabled if possible. You can also try to increase spool_size and workers and see if performance still improves somewhat, but maybe you're close to hitting a limit.
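If you do want to experiment with it, it is a single top-level switch (experimental in 5.x):

# experimental: keep prepared batches in memory ahead of sending
filebeat.publish_async: true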