Loss of performance with Cisco module

Hello,

We have been trying to import the logs coming from a Cisco ASA into Elasticsearch using Filebeat.

The Cisco ASA is sending an average of 8,000 events/second. At first we were importing them as syslog, listening on UDP 514, but the rate was too slow, so as a temporary solution we are logging the events to disk with rsyslogd, and Filebeat reads the events off that file.
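
For reference, the rsyslog side is just a UDP listener writing to a flat file, roughly like this (the ruleset name and file path are illustrative, not our exact config):

module(load="imudp")                                    # accept syslog over UDP
input(type="imudp" port="514" ruleset="asa")            # bind the ASA traffic to its own ruleset
ruleset(name="asa") {
  action(type="omfile" file="/var/log/cisco-asa.log")   # write every event to a flat file for Filebeat
}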

When the Cisco module is disabled, monitoring in Kibana shows an average event rate of 8,000/s, but when we enable the Cisco module, the rate drops to around 500/s.

I have tried tweaking settings in filebeat.yml to increase this rate, such as bulk_max_size and worker. Changing worker makes a small difference, but bulk_max_size has no visible impact.

This is running on a VM with 64 GB of RAM and 24 cores.
Here's the Filebeat configuration file:

xpack.monitoring.enabled: true

filebeat.config.modules:
  enabled: true
  path: ${path.config}/modules.d/*.yml

output.elasticsearch:
  hosts: ["http://localhost:9200"]
  worker: 8
  bulk_max_size: 1000
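
The Cisco module itself is enabled in modules.d/cisco.yml; a minimal sketch of what that looks like, assuming the module's file input and an illustrative log path, would be:

- module: cisco
  asa:
    enabled: true
    var.input: file              # read the file written by rsyslogd instead of listening on syslog
    var.paths:
      - /var/log/cisco-asa.log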

Any suggestions on how to fix this issue?

Thank you

Hi,

OK so... 8k events/s is HUGE. Can I have precise info on the hardware used?

You might have to scale out the Elasticsearch cluster, as well as use SSDs.

Have you tried shipping the logs to Logstash instead of sending them directly to Elasticsearch? This might improve performance.
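
A minimal sketch of what that would look like (hosts and ports are just examples): in filebeat.yml you swap the Elasticsearch output for a Logstash one,

output.logstash:
  hosts: ["localhost:5044"]

and on the Logstash side a simple beats pipeline forwards to Elasticsearch:

input {
  beats {
    port => 5044                 # receive events from Filebeat
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
  }
}

Note that Filebeat only allows one output at a time, so output.elasticsearch would have to be removed or commented out.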

Hi,

Is this really so much for a modern computer? The volume of log data coming from the ASA firewall is only about 1 MByte/sec. It is being stored on a non-SSD hard drive, but that happens after the Cisco parsing step, which appears to be the bottleneck, since everything works fine when the Cisco module is disabled.

I thought of using Logstash, but I haven't tried it yet.

Yes, you are way above average with 8k events per second. I would suggest you take a look at Elasticsearch clustering and workers.

When the Cisco module is disabled, I assume each event contains very few fields, which makes it easier to index than more complex events. The fact that one works and the other does not does not necessarily rule out Elasticsearch as a potential bottleneck, especially as you seem to be using spinning disks. What do disk I/O and iowait look like when you are indexing enriched events?
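
Something as simple as iostat from the sysstat package, run while the Cisco module is indexing, would show it (the 5-second interval is just an example):

iostat -x 5      # %iowait is in the CPU summary; r/s, w/s and %util are reported per device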

Yes it is.

Disk I/O runs at around 20 MB/s with the module enabled, and iowait is very low (around 0.04%). We performed a test by copying a very large file, and the resulting I/O was around 200 MB/s.

I can still try the Logstash solution @grumo35 suggested, but that doesn't really explain why changing bulk_max_size stops affecting the indexing rate once the Cisco module is enabled.
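
For reference, the other Filebeat-side knobs that sit in front of the output and interact with bulk_max_size are the internal queue settings, along these lines (a sketch with illustrative values, not our actual config):

queue.mem:
  events: 8192               # total events the in-memory queue can hold
  flush.min_events: 1024     # minimum batch size handed to the output workers
  flush.timeout: 1s          # flush even a partial batch after this long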

Copying a very large file is a poor way to judge how well a given type of storage will work with Elasticsearch, as it involves mostly sequential reads and writes, while Elasticsearch generally does random reads and writes.
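
If you want a more representative number, a random-I/O benchmark such as fio gets much closer to an indexing workload (the parameters below are just an illustration, not a tuned profile):

fio --name=es-like-randrw --rw=randrw --bs=4k --size=1g --numjobs=4 \
    --ioengine=libaio --direct=1 --runtime=60 --time_based --group_reporting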
