Sizing for UDP/TCP (syslog type)

Hi.

Does anyone have any sizing knowledge or experience using the tcp and udp input plugins for syslog messages?

Sizing knowledge? I think you need to be a bit more specific.

ok.

How many GB per day can we handle on given hardware? What can impact that? The number of CPUs? The number of regexes? (For example, I am going to have dozens of regexes that identify, for each message, the vendor and maybe the product, e.g. Cisco ASA.)
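
To make that concrete, here is the kind of detection I have in mind (the pattern and field names below are just illustrative):

```
filter {
  # Hypothetical example: tag Cisco ASA events by their "%ASA-" prefix.
  # Each additional vendor/product would get another conditional like this.
  if [message] =~ /%ASA-/ {
    mutate {
      add_field => {
        "vendor"  => "cisco"
        "product" => "asa"
      }
    }
  }
}
```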

How do I know if I am starting to lose data because Logstash cannot handle a given throughput?

When collecting data from UDP and TCP it often makes sense to have the Logstash instance(s) that do the collection do as little processing as possible in order to optimise throughput, and just have them write to a message queue. The introduction of a message queue allows you to buffer data and handle peaks better, while at the same time decoupling collection from processing. You can therefore have a number of Logstash instances read from the message queue in parallel and do the CPU-intensive processing without affecting collection. This processing layer can be scaled out without affecting the collection layer.
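
A minimal sketch of such a collection tier, assuming Redis as the broker (the host, port, and key below are placeholders, not a recommendation):

```
input {
  tcp { port => 5140 type => "syslog" }
  udp { port => 5140 type => "syslog" }
}
# No filters here: the collection tier only receives and forwards.
output {
  redis {
    host      => "broker.example.com"   # hypothetical broker host
    data_type => "list"
    key       => "syslog"
  }
}
```

The processing-tier instances would then read with the redis input using the same key and do all the grok/regex work there.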

Thanks, Chris, for your reply.

Does anyone know, for given hardware, what the throughput limit is for TCP/UDP input with minimal processing? How can I monitor for potential data loss?

What is the given hardware?

Let's start with a basic config:
4 CPUs
16 GB memory
1 GbE network card.
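
For scale, some back-of-envelope arithmetic on that setup (assuming an average syslog message of roughly 300 bytes, which is a guess, not a measurement): 1 GbE is about 125 MB/s, i.e. roughly 10 TB/day of raw wire capacity, so the NIC itself is unlikely to be the bottleneck. Ingesting 100 GB/day would mean about 333 million events, or roughly 3,900 events/sec sustained, and whether 4 cores keep up with that depends mostly on the regex work per event.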

Are you talking purely about parsing the data, or storing it too?

I am more worried about the input side (not losing data because of high throughput) and also about the parsing part, since I have dozens of regexes to run against each event.

Monitoring data loss is hard, because how do you know it's gone?
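
One partial answer for UDP: the kernel counts datagrams it drops when a socket receive buffer overflows, and they show up under "packet receive errors" in `netstat -su`, so a rising counter there while Logstash is the main UDP listener is a strong hint. Inside Logstash itself, the metrics filter can expose the event rate so you can compare it against what the senders claim to emit. A minimal sketch, following the standard metrics filter example:

```
filter {
  metrics {
    meter   => "events"
    add_tag => "metric"
  }
}
output {
  # Print the 1-minute moving average event rate.
  if "metric" in [tags] {
    stdout {
      codec => line { format => "1m rate: %{[events][rate_1m]}" }
    }
  }
}
```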

Given you have a bunch of regexes, you really will need to test this yourself.
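
A common way to test the filter stage in isolation is the generator input plus the dots codec, which emits one character per processed event:

```
input {
  generator {
    count   => 1000000
    # Made-up sample line; use representative events from your own sources.
    message => "<134>Oct 11 22:14:15 fw01 %ASA-6-302013: Built outbound TCP connection"
  }
}
filter {
  # ... the grok/regex filters under test ...
}
output {
  stdout { codec => dots }   # one "." per processed event
}
```

Piping stdout through `pv -abt > /dev/null` then shows the sustained event rate, since each dot is one byte per event.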

Hi.

I have seen this slide in the "latest in Logstash" presentation, but it does not mention the time period. We are probably not talking about EPS.

Does anyone have other benchmarks for the 2.x version?