Elasticsearch low indexing rate

(joseph elia) #1

We have a server that collects IPFIX (NetFlow v10) packets from an exporter, and we are trying to write the data into a single-node Elasticsearch instance using the bulk API. The problem is that the indexing rate is very low: our exporter sends ~33,000 flows/second, but Elasticsearch indexes only ~6,000 documents/second. We have tried all the settings Elasticsearch recommends for increasing indexing throughput, but nothing changed. To rule out a problem with the importer, we tried a different one to send the data into Elasticsearch, but the issue persists.

Please advise if any additional action can be taken.
Our server specs:
24 Core CPU

(Thiago Souza) #2

What is the spec of the disk for this server?

(Robert Cowart) #3

More important than RAM or CPU is disk. What is the storage configuration of the server? SSD or HDD? How many? SATA/SAS/NVMe? RAID (level?) or JBOD?

Also, what about the network interface? In testing I can easily saturate a 1 Gb NIC before I hit the limits of Elasticsearch's ability to index data. You need a 10 Gb NIC.

What does the index template look like? How well is it optimized?

Regarding the IPFIX data... if the device is sending 33K/sec, are you sure that the collector that is sending them to Elasticsearch is actually processing them all in order to forward them? What are you using as a collector? Run netstat -su a couple times on the collector and look to see if you are dropping UDP packets because of buffer errors. If your collector can't keep up (i.e. can't pull PDUs from the buffer as fast as they are arriving) it will manifest itself as buffer receive errors.
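As a rough way to automate the check described above, here is a minimal Python sketch that reads the same UDP counters `netstat -su` reports, taken from `/proc/net/snmp` on Linux. The function names `parse_udp_counters`, `read_udp_counters`, and `counter_deltas` are made up for this illustration, and the set of counters present can vary by kernel version.

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp content."""
    # "UdpLite:" lines do not match because of the colon in the prefix.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp section found")
    header, values = udp_lines[0], udp_lines[1]
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def read_udp_counters(path="/proc/net/snmp"):
    """Read the current UDP counters from the kernel (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


def counter_deltas(before, after):
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Taking two snapshots with `read_udp_counters()` a few seconds apart and passing them to `counter_deltas` shows whether `RcvbufErrors` is growing while flows are arriving. If it is, the collector is not draining the socket buffer fast enough.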

I have done a lot of flow data projects at various customers. Netflow, IPFIX, sFlow... all of it. I can tell you now that indexing 33K/s with a single Elasticsearch node is very, very, very unlikely. The challenges you must master are...

  1. How to receive 33K/s without dropping UDP packets (this starts with Linux kernel tuning, understanding things like NIC receive queues to CPU core binding, etc).
  2. How to decode those flows efficiently into JSON documents.
  3. How to structure your index for the most efficient storage of data, without losing reporting functionality.
  4. Ensuring sufficient storage write IOPS to store the data.
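To make item 2 a bit more concrete, here is a minimal sketch of turning decoded flows into the newline-delimited JSON body that the Elasticsearch bulk API expects: one action line followed by one document line per flow. The index name `flows` and the field names in the usage note are placeholders for this illustration, not anything from the original setup.

```python
import json


def build_bulk_body(flows, index_name="flows"):
    """Build an NDJSON bulk request body from an iterable of flow dicts.

    index_name is a hypothetical default; use your own index or alias.
    """
    lines = []
    for flow in flows:
        # Action line: tells Elasticsearch to index the next line as a document.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: compact separators keep the payload small.
        lines.append(json.dumps(flow, separators=(",", ":")))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

For example, `build_bulk_body([{"src_ip": "10.0.0.1", "bytes": 1200}])` yields two lines, which would be POSTed to the `_bulk` endpoint with the `application/x-ndjson` content type. Batching a few thousand flows per request is a common starting point, but the right batch size depends on document size and hardware.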

My experience is that 6K/s with a single server is about right. You might be able to squeeze out a bit more than that, but don't forget that you have to read that data at some point. If your storage IOPS is maxed out on writes, there is no headroom left to actually run queries. You will experience all kinds of problems with reporting.

If I was sizing a cluster for flow data, I would say that you need at least 5 or 6 nodes, depending on how well designed your indices are. That is of course only relevant once the raw collection is worked out.
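As a back-of-the-envelope check on that node count, the figures quoted in this thread (33K flows/s arriving, roughly 6K/s indexed per node) can be plugged into a small calculation. This is only a sketch: it assumes throughput scales linearly with node count and ignores replicas and query load, which a real sizing exercise would have to account for. The function name `nodes_needed` is made up for this illustration.

```python
import math


def nodes_needed(target_rate, per_node_rate):
    """Smallest node count whose combined rate covers target_rate.

    Assumes linear scaling across nodes, which is optimistic.
    """
    return math.ceil(target_rate / per_node_rate)


print(nodes_needed(33000, 6000))  # prints 6
```

The result lines up with the "5 or 6 nodes" estimate. Adding replica shards would push the number higher, since every replica has to index each document again.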
