I have two nodes, each running Logstash and Elasticsearch.
At the moment I use only one node in production for Logstash. It runs 50 pipelines and processes approximately 600,000 records per hour (the pipelines are scheduled with the http_poller input). Each node has a 16-thread AMD Threadripper 1900X CPU (https://www.newegg.com/Product/Product.aspx?Item=N82E16819113457).
I have two questions and would appreciate recommendations:
If I use the same Logstash configuration on both nodes, and both load data into the same Elasticsearch cluster, will the two Logstash instances load the same data twice? Since Logstash instances cannot be clustered, it seems that the only way to balance the pipeline load with Logstash 6.4.0 is to split the configuration and run 50% of the pipelines on the first node and the other 50% on the second node.
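To make the split concrete, here is a rough sketch of what I have in mind, assuming each node gets its own pipelines.yml. The pipeline ids and file paths are made up for illustration; only the setting names (pipeline.id, path.config) come from Logstash itself.

```yaml
# Node 1: pipelines.yml (hypothetical ids and paths)
- pipeline.id: poller-01
  path.config: "/etc/logstash/conf.d/poller-01.conf"
- pipeline.id: poller-02
  path.config: "/etc/logstash/conf.d/poller-02.conf"
# ... and so on for the rest of the first half of the pipelines
```

```yaml
# Node 2: pipelines.yml (hypothetical ids and paths)
- pipeline.id: poller-26
  path.config: "/etc/logstash/conf.d/poller-26.conf"
- pipeline.id: poller-27
  path.config: "/etc/logstash/conf.d/poller-27.conf"
# ... and so on for the rest of the second half of the pipelines
```

With this layout no pipeline is defined on both nodes, so each http_poller schedule would run in exactly one place.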
My current setting is pipeline.batch.size: 125. The documentation warns: "Values in excess of the optimum range cause performance degradation due to frequent garbage collection or JVM crashes related to out-of-memory exceptions."
According to Kibana, memory usage has never gone above 3 GB of RAM.
Does it make sense to increase pipeline.batch.size?
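For context, my understanding from the documentation is that the number of in-flight events per pipeline is the product of pipeline.workers and pipeline.batch.size. Below is a sketch of how a larger batch size could be tried on a single pipeline first, assuming pipeline.workers is left at its default (the number of CPU cores the host reports, 16 here). The id, the path, and the value 250 are arbitrary examples, not recommendations.

```yaml
# pipelines.yml entry for one test pipeline (hypothetical id and path)
- pipeline.id: poller-01
  path.config: "/etc/logstash/conf.d/poller-01.conf"
  pipeline.workers: 16      # assumed: same as the default on a 16-thread CPU
  pipeline.batch.size: 250  # arbitrary test value, up from the default of 125
  # In-flight events for this pipeline: 16 workers * 250 = 4000
  # With the current setting:           16 workers * 125 = 2000
```

Setting the value per pipeline keeps the other pipelines at 125, so heap usage in Kibana can be compared before changing anything globally.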