Assistance with Logstash Sizing and Kafka Integration

I am new to Elastic and would like to understand the sizing requirements for Logstash based on the following:
• Data to process: 7TB over 10 hours
• Average event size: 70% of events are 1KB, and 30% are 500 bytes

How can I determine the number of Logstash servers needed, considering a persistent disk-based queue? Each Logstash instance will have 16 vCPUs and 32GB of memory. Additionally, how many events per second (EPS) can each Logstash instance handle?
Furthermore, how can integrating Kafka optimize the Logstash deployment? Specifically, can Kafka reduce the number of Logstash instances required? If so, how many instances can be reduced before and after Kafka integration?
Lastly, how does Kafka compare to F5 LTM in terms of handling high throughput?

Thank you for your assistance.

That really depends on what the pipeline is doing. Enrichment calls using DNS / elasticsearch / geoip / JDBC / memcache can add tens of milliseconds to the time it takes to process a single event.
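
For example, just one or two of those enrichment filters can dominate per-event processing time. A minimal sketch (field names are placeholders, not from the original question):

```
filter {
  # GeoIP enrichment: local database lookup, cheap but not free
  geoip {
    source => "client_ip"          # placeholder field name
  }
  # DNS enrichment: a network round trip per uncached lookup,
  # easily tens of milliseconds per event
  dns {
    resolve => ["hostname"]        # placeholder field name
    action  => "replace"
  }
}
```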

That said, if you want to move 7 TB in 10 hours, that's 700 GB per hour. If the average event is 1 KB, that is 700 million events per hour, or about 200,000 events per second. I doubt anyone on the planet is moving that kind of volume through logstash. There are almost certainly better architectures for ultra-high-throughput ETL.

Hello and welcome,

I think you are mixing up some things.

F5 LTM is a load balancer: it distributes requests between multiple destinations. Kafka is an event store and stream-processing tool that can also be used as a message broker or buffer. They have nothing to do with each other.

You would use a load balancer when you need to distribute events across multiple servers, either for high availability or because one server alone cannot keep up with all the events. You would use Kafka when you need a buffer of events, to distribute the processing of those events, and to deal with event spikes.

It is pretty common to use both Load Balancers and Kafka in combination with Logstash.

Also, adding Kafka does not reduce the number of Logstash instances; on the contrary, you normally add Kafka when you need more Logstash instances to process your data.

As an example, I have something closer to 50k events/s and I use load balancers, multiple Logstash instances, and Kafka. Some Logstash instances act as producers for Kafka: they receive the data and send it to Kafka, with no parsing done. Other Logstash instances act as consumers for Kafka: they read from the Kafka topics, parse the data, and send it to Elasticsearch.
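
As a rough sketch of that producer/consumer split (ports, broker addresses, topic names, and field names below are made up for illustration):

```
# Producer instances: receive events and hand them to Kafka, no parsing
input {
  beats { port => 5044 }
}
output {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092,kafka3:9092"   # made-up brokers
    topic_id          => "raw-logs"                              # made-up topic
    codec             => json
  }
}
```

```
# Consumer instances (separate hosts): read from Kafka, parse, send to Elasticsearch
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092,kafka3:9092"
    topics            => ["raw-logs"]
    group_id          => "logstash-consumers"
    codec             => json
  }
}
filter {
  # parsing/transformation happens only here, on the consumer side
  date { match => ["timestamp", "ISO8601"] }   # placeholder field
}
output {
  elasticsearch {
    hosts => ["https://es1:9200"]   # made-up host
  }
}
```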

To move something closer to 200k events/s, as Badger has calculated, your main bottleneck will probably be your output. You can do that with Logstash, but it would not be simple; multiple instances, load balancers, and maybe Kafka would be required.
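
On the output side, one thing that helps is listing several Elasticsearch nodes in the output so each Logstash instance spreads its bulk requests across the cluster. A small sketch (hostnames are placeholders):

```
output {
  elasticsearch {
    # Logstash distributes bulk requests across the listed hosts
    hosts => ["https://es1:9200", "https://es2:9200", "https://es3:9200"]
  }
}
```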

I think the best way to find out what kind of infrastructure you will need is by testing it. Also, Logstash is more CPU bound; it does not make much sense to use more than 8 GB of heap for it, so 16 GB machines would be fine.

Thank you all for the valuable feedback.

I would like to get more clarity on the following points:

  • Data to process: 7TB over 10 hours
  • Average event size: 70% of events are 1KB from systems like Windows and Linux with Elastic agents installed, and 30% are 500 bytes from network devices, load balancers, IPS, and firewalls. Firewalls generate the most events, ranging from 12K to 15K EPS.

Can I conservatively assume that a well-optimized Logstash instance with 16 vCPUs and 32GB of memory can handle around 10,000 EPS with an event size of 1KB? If so, here is my calculation:

Number of Logstash servers: 199,680 EPS / 10,000 EPS ≈ 20 instances

Architecture:

Source (Systems) ---\
                     +---> Kafka x 3 ---> Logstash x 20
Source (Network) ---/


Lastly, if a physical server is running ESXi with hyperthreading enabled, can I consider each thread as 1 vCPU as well?

Thank you.

Best regards,
William

No, you need to test. The event rate of a Logstash instance also depends on your output: if you have a total of 200k e/s but your output can only deal with 100k e/s, your Logstash will adjust to that, as the output will tell Logstash to back off, and this may or may not lead to delays.

It will also depend on your pipeline filters and on what parsing and transformation you are planning to do.

The only way to find the best infrastructure for a use case is by testing.

Also, 32 GB machines for Logstash are in most cases not needed; you should start smaller and increase the machine size if needed. In most scenarios you should not use more than 8 GB of heap memory, so 16 GB machines would be OK.
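
If it helps, the heap size is set in Logstash's config/jvm.options; something like the following caps it at the 8 GB mentioned above (treat the value as a starting point, not a rule):

```
# config/jvm.options
-Xms8g
-Xmx8g
```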

Another thing is that not every source can send data directly to Kafka, so you may need something between your sources and Kafka; this can also be smaller Logstash instances.

When using Kafka you should also match the number of partitions in your topics to the number of Logstash instances; for example, if you have 2 Logstash instances, your topics should have 2 partitions. This helps balance the events evenly.
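
As a sketch with 2 Logstash instances consuming a 2-partition topic (topic and broker names are made up), both instances share the same group_id so Kafka assigns one partition to each:

```
# Topic created with 2 partitions, e.g.:
#   kafka-topics.sh --create --topic raw-logs --partitions 2 \
#     --replication-factor 1 --bootstrap-server kafka1:9092

# Identical kafka input on both Logstash instances
input {
  kafka {
    bootstrap_servers => "kafka1:9092"          # made-up broker
    topics            => ["raw-logs"]           # made-up topic
    group_id          => "logstash-consumers"   # shared consumer group
    consumer_threads  => 1                      # one thread per assigned partition
  }
}
```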

To size your Logstash deployment:

  1. EPS Calculation: For 7TB of data over 10 hours, assuming 1KB events, the EPS is about 194,444 (rough estimate). Each Logstash instance typically handles 5,000 to 15,000 EPS, so you'll need around 13-39 instances (assuming 16 vCPUs, 32GB RAM).
  2. Kafka Integration: Kafka acts as a buffer to smooth out data spikes and allows Logstash to process data at a steady rate, which can reduce the number of Logstash instances required by offloading event handling.
  3. Instance Reduction: With Kafka, you could potentially reduce the Logstash instances by 30-50% due to better data flow management.
  4. Kafka vs. F5 LTM: Kafka is optimized for high-throughput data streaming, making it more suitable for large-scale log handling, while F5 LTM is better for load balancing rather than data queuing.