I am new to the ELK Stack. I wanted to know if we could test the performance of Filebeat. I am developing a system that will generate a huge amount of logs within a short amount of time, so it is imperative for me to know how many logs Filebeat can send or handle. I know it also depends on the hardware, but how should I test this? Also, if there is a way to optimize this, how should I go about it? Just for information, I am using Filebeat version 1.2.3.
Performance will pretty much depend on your complete processing chain. Filebeat itself can tail files pretty fast, especially when the content is still in the file cache. But once you add logstash, redis, kafka or elasticsearch, performance will largely depend on the network and on the ingest rate of your destination, as filebeat will slow down on back-pressure.
For testing you first need a source. There are two options:
a prepared log file of a few hundred megabytes
a custom script writing random log lines at a configurable rate, simulating a real process (the rate can be dynamic to simulate peak times); a sketch is given below
Having a prepared log file gives you an easy start, e.g. the NASA HTTP log. You can multiply a log file by concatenating its content multiple times into the destination file, like $ cat in.log in.log in.log > test.log.
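For the second option, here is a minimal generator sketch. It appends Apache-style access-log lines to a file at a base rate and periodically switches to a peak rate. The file name, line format, rates and peak schedule are all assumptions you would adjust for your own system:

```python
#!/usr/bin/env python3
# Sketch of a log generator: writes random access-log-style lines at a
# configurable rate so filebeat has a "live" file to tail.
# LOG_PATH, line format, and the rate/peak values are assumptions.
import random
import time
from datetime import datetime

LOG_PATH = "test.log"   # file your filebeat prospector is watching (assumption)
BASE_RATE = 500         # lines per second during normal load
PEAK_RATE = 5000        # lines per second during a simulated peak
PEAK_EVERY = 60         # start a peak every N seconds
PEAK_LENGTH = 10        # peak duration in seconds

def make_line():
    ts = datetime.now().strftime("%d/%b/%Y:%H:%M:%S +0000")
    return ('192.0.2.%d - - [%s] "GET /item/%d HTTP/1.1" 200 %d\n'
            % (random.randint(1, 254), ts,
               random.randint(1, 10000), random.randint(200, 5000)))

def current_rate(elapsed):
    # use the peak rate for PEAK_LENGTH seconds out of every PEAK_EVERY seconds
    return PEAK_RATE if (elapsed % PEAK_EVERY) < PEAK_LENGTH else BASE_RATE

start = time.time()
with open(LOG_PATH, "a", buffering=1) as f:
    while True:
        rate = current_rate(time.time() - start)
        for _ in range(rate):
            f.write(make_line())
        time.sleep(1)   # coarse pacing: write one second's worth, then sleep
```

The pacing is deliberately coarse (write a burst, then sleep one second); for throughput testing of filebeat that is usually close enough, since filebeat reads in batches anyway.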
filebeat can export some stats via the -httpprof :6060 flag. Use the expvar_rates.py script to collect the stats.
See this post for some more tips and for how to use the expvar_rates.py script.
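If you just want a quick look at the rates without that script, a small poller in the same spirit can read the expvar JSON that -httpprof exposes at /debug/vars and print the per-second change of every numeric counter. The URL, interval and "print everything numeric" approach are assumptions, not part of the original tooling:

```python
#!/usr/bin/env python3
# Minimal poller in the spirit of expvar_rates.py: samples filebeat's expvar
# endpoint (exposed via -httpprof :6060 at /debug/vars) and prints the
# per-second delta of every numeric counter.
import json
import time
from urllib.request import urlopen

URL = "http://localhost:6060/debug/vars"   # assumption: filebeat started with -httpprof :6060
INTERVAL = 5                               # seconds between samples

def sample():
    with urlopen(URL) as resp:
        return json.load(resp)

prev = sample()
while True:
    time.sleep(INTERVAL)
    cur = sample()
    for name in sorted(cur):
        now, before = cur[name], prev.get(name)
        if isinstance(now, (int, float)) and isinstance(before, (int, float)):
            rate = (now - before) / INTERVAL
            if rate:
                print("%-45s %10.1f /s" % (name, rate))
    print("-" * 60)
    prev = cur
```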
There is also collectbeat, another beat collecting the same information as expvar_rates.py, but it forwards the data to elasticsearch. I use it with the master branch, so I have no idea if it works with the 1.x release.