Hi all, when harvesting logs from my machines into Logstash using Filebeat, I noticed the logs are not delivered fast enough, especially when TRACE logging is enabled and as I add more machines (more Filebeat instances).
My stack is roughly ~40 machines, each writing ~20 log lines per second (so on the order of 800 events/second in total), and a single Logstash with 2 CPUs and 4 GB RAM (AWS t2.medium). The machine metrics look fine and Logstash does not appear to be struggling while processing the logs, yet I still see a delay of about an hour between logs being written and logs being harvested and sent to Logstash.
What is a reasonable ratio of logs per second to Logstash instances? How many Logstash instances are needed, and what size should they be? How can I make Filebeat harvest faster?
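For reference, these are the Filebeat settings that usually govern shipping throughput to Logstash. A minimal filebeat.yml sketch with illustrative values only; the numbers are assumptions to show the knobs, not recommendations:

```yaml
# filebeat.yml - illustrative values, tune against your own workload
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log   # hypothetical path
    scan_frequency: 10s      # how often new files are discovered
    close_inactive: 5m       # release file handles on idle files

# internal queue: how many events Filebeat buffers before publishing
queue.mem:
  events: 4096

output.logstash:
  hosts: ["logstash:5044"]
  worker: 2                  # parallel connections per Logstash host
  bulk_max_size: 2048        # events per batch sent to Logstash
  compression_level: 3       # trade local CPU for less data on the wire
```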
I would be surprised if the Filebeat harvester is the bottleneck.
I do not think t2.medium is a very suitable instance type, as it has quite limited, burstable CPU allocated, so it could very well be that your Logstash instance is the bottleneck. I would recommend upgrading to an m4/m5.large instance instead and seeing if that improves throughput. You also need to make sure that Elasticsearch (or any other output used) is able to process data fast enough, as this will also limit Logstash throughput.
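If you do resize the instance, it is also worth checking that the pipeline settings match the available cores. A minimal logstash.yml sketch; the values are assumptions for a small 2-core box rather than recommendations:

```yaml
# logstash.yml - illustrative values, assuming a 2-core instance
pipeline.workers: 2        # usually set to the number of CPU cores
pipeline.batch.size: 250   # events each worker pulls from the queue per batch
pipeline.batch.delay: 50   # ms to wait for a full batch before flushing
```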
Hi Christian, thanks for the reply. I'll try moving the machine to an m5.large and see if there is any change, but as I said, the machine stats are very low, so I don't think the instance size is the limiting factor here.
I'm using two outputs: S3 and a 3-node m5.large Elasticsearch cluster. Aren't the logs delivered to the outputs asynchronously? How can the outputs be the bottleneck, and how can I measure that?
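One way to measure it, for what it's worth: as far as I understand, the worker threads in a single Logstash pipeline push each batch through the filters and all outputs together, so a slow output backpressures the whole pipeline rather than being fully asynchronous. Logstash exposes a node stats API (by default on port 9600) whose pipeline stats report per-plugin event counts and durations, so a slow output shows up as a high duration_in_millis on that plugin. A minimal logstash.yml sketch to make the API reachable from outside the box (the bind address here is an assumption, only open it up on a trusted network):

```yaml
# logstash.yml - expose the node stats / monitoring API
http.host: "0.0.0.0"   # default is 127.0.0.1
http.port: 9600

# then, for example:
#   curl -s 'http://<logstash-host>:9600/_node/stats/pipelines?pretty'
# and compare events in/out and duration_in_millis per output plugin
```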