Netflow gaps between filebeat and elasticsearch

Hi everyone,

I have been struggling to parse NetFlow from my Cisco FTD (cluster mode), at around 12k events per second, through Filebeat -> Elasticsearch.
(Using version 7.6.0 for both Filebeat and Elasticsearch.)

I have two servers, one for Filebeat and the other for Elasticsearch. They have the following hardware config:
Filebeat: 8 CPUs, 16 GB memory, 250 GB storage (SSD).
Elasticsearch: 16 CPUs, 32 GB memory, 7 TB storage (SSD).
Networking is 10 Gbit/s.
(probably a bit oversized but we can change that in the future)

On my Filebeat node I can see a constant flow of NetFlow packets coming from my firewall (I checked this with tcpdump -nni any port 2055). But when I run tcpdump on the output side, from Filebeat to Elasticsearch (tcpdump -nni any port 9200), I can see that Filebeat sometimes stops sending data to my Elasticsearch node altogether and then resumes after some time. You can see this in this picture:

When I watch htop during these outages, I can see barely any CPU or memory usage. So it looks like Filebeat is dropping traffic to my Elasticsearch node, but I can't figure out why. I have been toying around with the Filebeat output settings:

    bulk_max_size: 4096
    worker: 2

I tried smaller bulk sizes and both more and fewer workers. I have also been testing the queue.mem settings in Filebeat with several different values:

    events: 4096
    flush.min_events: 0
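
To make the relationship between these settings easier to see, here is a rough sketch of how the two snippets above would sit together in filebeat.yml. The host name is a placeholder, and the comments are back-of-the-envelope arithmetic from the numbers in this post, not measurements:

```yaml
# filebeat.yml (sketch; "elastic-node" is a placeholder host, not a real name)
output.elasticsearch:
  hosts: ["http://elastic-node:9200"]
  worker: 2              # two concurrent bulk senders
  bulk_max_size: 4096    # up to 4096 events per bulk request

queue.mem:
  events: 4096           # about 0.34 s of traffic at ~12k events/s (4096 / 12000)
  flush.min_events: 0    # do not wait for a minimum batch before publishing
```

With these values the queue holds at most one full bulk request (4096 events), while two workers could in principle carry 2 x 4096 = 8192 events, so the queue size may be worth checking as a limiting factor.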

On the Elasticsearch side I temporarily disabled replication and changed the index refresh interval to 30 seconds.
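
For reference, the same two index settings could also be expressed from the Filebeat side through the index template. This is only a sketch and assumes Filebeat manages the template. I made the change on the Elasticsearch side instead, and a template change only affects indices created after it is loaded:

```yaml
# filebeat.yml (sketch; assumes the index template is loaded by Filebeat)
setup.template.overwrite: true     # replace a template that is already loaded
setup.template.settings:
  index.number_of_replicas: 0      # replication disabled (temporary)
  index.refresh_interval: 30s      # refresh every 30 seconds instead of the 1s default
```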

Could anyone point me in the right direction or give me some insight?

Thanks in advance
