Hello
Logstash version: 6.2.4
Filebeat version: 6.2.4
We've set up an ELK stack in our company, but recently we've been experiencing issues with our Filebeat instances during the morning ramp-up. The problem is as follows:
Filebeat can't keep up with the logs, at least that's what the monitoring shows: initially it sends over 500 events per second, but when that number should rise because the log volume increases, it instead declines to somewhere between 0 and 200 events per second. I tried looking at the configuration options, but the only relevant ones I could find apply when multiple files are harvested; in this case there's only one file (haproxy.log), so increasing scan_frequency or similar options won't help.
This is the current Filebeat configuration:
```yaml
# HAProxy logs
- type: log
  paths:
    - /var/log/haproxy.log
  fields:
    env: prod
    host: lb01.careconnect.be
    app: haproxy
    document_type: haproxy
  fields_under_root: true
```
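The only part of filebeat.yml I haven't experimented with yet is the Logstash output section. For reference, these are the throughput-related options there; the values below are illustrative (the hostname is a placeholder), not our current settings:

```yaml
output.logstash:
  hosts: ["logstash.example.com:5044"]  # placeholder host
  worker: 1            # parallel connections per configured host
  pipelining: 2        # async batches in flight per connection
  bulk_max_size: 2048  # maximum events per batch
  loadbalance: false   # only relevant when multiple hosts are listed
```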
Is there any other Filebeat configuration option I can use to fix this issue?
Thank you for your help
Filebeat can only send data as fast as the downstream systems can accept it. What does your ingest architecture look like? Are you sending data to Elasticsearch? If so, what is the specification of your Elasticsearch cluster?
No, everything is sent to a Logstash server, which then parses the logs.
It's a single Logstash server with 10 CPU cores and 8 GB of memory. I've assigned 48 pipeline workers to the machine (too many, but still no issues) and a JVM heap of 4 GB.
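For reference, a sketch of the relevant logstash.yml lines matching what I described; everything else is left at its defaults, and the batch size shown is the 6.x default rather than something I tuned:

```yaml
# logstash.yml
pipeline.workers: 48      # deliberately oversized for 10 cores
pipeline.batch.size: 125  # events per worker per batch (default)

# jvm.options (heap settings)
# -Xms4g
# -Xmx4g
```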
Initially I thought this was a Logstash issue, but JVM heap usage never goes above 75% and CPU usage never goes above 50%.
Is there any more information I can provide that would help?
Where does Logstash send the data? Do you have persistent queues enabled in Logstash?
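If you want to try persistent queues, they are enabled in logstash.yml; a minimal sketch, with an illustrative size cap:

```yaml
# logstash.yml
queue.type: persisted  # default is "memory"
queue.max_bytes: 4gb   # illustrative cap; events buffer to disk under backpressure
```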
To 2 Elasticsearch nodes, which also don't show any memory or CPU issues.
What is the specification of those nodes? What indexing rates are they seeing?
Both servers have 4 CPU cores and 16 GB of memory.
They have a combined JVM heap of 20 GB. The indexing rate is about 1,000 events/second at night and goes up to 3,000-4,000 events/second during the day, with a latency of 0.1 to 0.15 ms.
The maximum JVM heap usage I've seen on the nodes is about 15 GB, so 75%.
I do see a drop in indexing rate while the issue is occurring, but I assumed that was because Filebeat stops sending data.
Is there anything in the Elasticsearch logs indicating any problems? Are both nodes master-eligible? If so, have you set minimum_master_nodes to 2 to prevent split brain scenarios?
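On a two-node 6.x cluster that setting goes in elasticsearch.yml on both nodes; a minimal sketch:

```yaml
# elasticsearch.yml (both nodes)
discovery.zen.minimum_master_nodes: 2  # majority of 2 master-eligible nodes
```

Note that with only two master-eligible nodes, this makes the cluster unavailable for master election if either node goes down, which is why adding a third master-eligible node is the usual recommendation.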
The Elasticsearch logs don't show any issues of that kind.
The split-brain configuration, on the other hand, I still have to fix; nice catch.
Do you have any more suggestions for fixing this issue?
It's happening every morning now, and it doesn't seem to be an Elasticsearch issue but rather a Filebeat one.