I am using the Elastic Agent with the CEF integration, which receives and processes UDP data on a port. In the metrics for the CEF integration I can see the value filebeat_input.system_packet_drops.
The integration drops UDP packets every minute. A large amount of UDP traffic arrives there, yet the CPU, network, and RAM are all underutilized.
Does the Elastic Agent have an internal limitation/bottleneck that only allows a certain amount of data to be processed, even if the system on which the agent is running is not actually fully utilized?
Is it necessary to spread the traffic across multiple integrations/ports as much as possible?
Yes, there is the Agent's internal queue, which is a memory queue. The size of the queue depends on the output configuration; what is your output, Elasticsearch? You can change the size of the queue by switching to one of the pre-defined presets or by using a custom configuration; more information can be found here.
Once the queue is full, the input stops accepting new events until the current ones have been processed. This does not depend only on the specs of the host running the Agent; those can all be underutilized if the destination cannot keep up with the ingestion rate.
So, if your output is Elasticsearch and it cannot index data as fast as you receive events, it will apply backpressure to the Agent, which will then tell the input to slow down. In the case of the UDP input, this means events will be dropped.
To troubleshoot and fix this you need to look at the entire ingestion flow, from the source to the destination.
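If you go the custom route, a minimal sketch of what the queue and output tuning could look like in the output's advanced YAML configuration is below; the values are only illustrative and assume an Elasticsearch output, not a recommendation for your workload:

```yaml
# Illustrative values only - tune to your measured event rate and cluster capacity.
bulk_max_size: 1600              # events per bulk request sent to Elasticsearch
worker: 4                        # concurrent bulk requests per Elasticsearch host
queue.mem.events: 12800          # in-memory queue size; keep it >= worker * bulk_max_size
queue.mem.flush.min_events: 1600 # flush once this many events are queued
queue.mem.flush.timeout: 5s      # or after this long, whichever comes first
```

A larger queue only absorbs bursts; more workers and bigger batches are what raise the sustained rate the output can push to Elasticsearch.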
That's interesting, but I'm still concerned that increasing the queue size will only help with peak loads. Currently, according to the metrics, thousands of UDP packets are being dropped every minute, all day long. Yes, the events are sent to Elasticsearch nodes. The agent is managed by Fleet and can communicate with multiple data nodes. But even there, the data nodes are far from being fully utilized in terms of CPU, heap, RAM, network, and IO.
I can't see any clues in either the agent logs or the node logs. There are approximately 0.5 million documents and 0.7 million UDP drops per minute through this CEF integration, depending on the time of day.
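Taken together that is roughly 1.2 million events per minute hitting the port, i.e. about 20,000 events per second, of which only around 8,300 per second actually make it into Elasticsearch.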
What is the total expected volume of documents and packets per minute?
What is the size of the container or host the agent is running on?
What version of the stack and integration are you using?
What is your entire configuration, including the output settings?
What is the throughput you're getting?
What is the throughput you're expecting?
In the document that @leandrojmp shared, the settings are not just about queue length... there are also workers and batch size, which can have a significant effect on throughput.
Workers are basically threads, so you may need to tune them up.
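As a rough, purely illustrative back-of-the-envelope: sustained output rate ≈ worker × bulk_max_size × bulk requests completed per second per worker. If a bulk request takes around 500 ms, 4 workers with bulk_max_size: 1600 move on the order of 12,800 events/sec, which would not keep up with the ~20,000 events/sec arriving in this thread; more workers or larger batches (with a queue big enough to feed them) would be the first thing to try.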