How to detect and avoid UDP input loss

I'm using Envoy, which is roughly similar to Nginx, as the gateway of my microservices backend.
Since it's a microservices setup, there are five Envoys. All of them are deployed with Docker, and their logs are sent to my Logstash via the Docker syslog log driver. For now, I'm using UDP to send the Envoy logs.
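A setup like this looks roughly like the following (the addresses, ports, and image tag here are placeholders, not my exact values):

```sh
# Run an Envoy container with the syslog log driver,
# shipping container logs to Logstash over UDP
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=udp://logstash.example.com:5140 \
  --log-opt tag="envoy-{{.Name}}" \
  envoyproxy/envoy
```

with a matching Logstash pipeline that listens on UDP and writes to a local file:

```conf
input {
  udp {
    port => 5140
  }
}
output {
  file {
    path => "/var/log/envoy/envoy-%{+YYYY.MM.dd}.log"
  }
}
```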

Everything seems to work fine, but I have some questions about this setup.

  1. What if UDP loses some logs? So far nothing seems to have been lost, but what if some logs are dropped? If I'm worried about this, does it mean I have to choose TCP instead of UDP?
  2. What if there are too many logs to receive? My Logstash needs to receive logs and write them to a local log file. If there are too many (say I scale up to ten Envoys, so more logs are sent to Logstash), is it possible that Logstash can't keep up? In that case, what will Logstash do? Will it emit an error or warning to tell me it's losing logs? Is there a way to solve this? Does Logstash provide some kind of cluster deployment so it can receive more logs?

It depends on what would cause the logs to be lost. For example, if there is an issue during packet transmission to the Logstash port, a UDP packet will simply be lost, whereas TCP will retry the transmission.
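If you decide to go with TCP, the change is small on both ends. A rough sketch (the port and hostname are just examples):

```sh
# Docker side: point the syslog driver at a tcp:// address
docker run -d \
  --log-driver syslog \
  --log-opt syslog-address=tcp://logstash.example.com:5000 \
  envoyproxy/envoy
```

```conf
# Logstash side: use the tcp input instead of (or alongside) udp
input {
  tcp { port => 5000 }
}
```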

But depending on the issue, it may not matter whether you use TCP or UDP; for example, if the Logstash server/container is down, you will lose logs no matter the protocol.

Again, it depends on your specific use case. What would "too many" be in your case? Logstash performance depends on many things: the number of events per second, the filters used in the pipeline, the outputs, the resources of the servers, and probably a couple more.
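One way to get concrete numbers for your own load is Logstash's monitoring API, which exposes per-pipeline event counters (this assumes the default API port, 9600):

```sh
# The "events" section reports in / filtered / out counts;
# sample it twice and divide by the interval to get events/sec
curl -s http://localhost:9600/_node/stats/events
```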

In some cases Logstash may not be able to process all the logs it is receiving, and this can lead to data loss. There are ways to deal with that, like using persistent queues or, even better, using a message broker like Kafka to store the messages before processing.
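Persistent queues are enabled in `logstash.yml`; a minimal sketch (the size and path here are illustrative):

```yaml
# logstash.yml -- buffer events on disk instead of in memory,
# so a burst or a slow output doesn't immediately drop events
queue.type: persisted
queue.max_bytes: 4gb
path.queue: /var/lib/logstash/queue
```

Note that a persistent queue only protects events Logstash has already read off the socket; it does not help with UDP packets dropped before they reach the input.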

Logstash won't warn you that you are losing logs, but depending on the output it may log some information about performance. For example, if the Elasticsearch output cannot index documents at the rate Logstash is sending them, Elasticsearch will respond with a 429 (too many requests) error telling Logstash to back off, and this may show up in the Logstash logs.

Logstash does not work as a cluster; each Logstash instance is independent of the others. You can have multiple Logstash instances running the same pipeline, but you would need a third-party tool like HAProxy to load balance between them.
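For example, a minimal HAProxy setup to spread TCP log traffic across two independent Logstash instances could look like this (hostnames and ports are placeholders):

```conf
# haproxy.cfg -- round-robin TCP traffic to two Logstash instances
frontend logs_in
    bind *:5000
    mode tcp
    default_backend logstash_pool

backend logstash_pool
    mode tcp
    balance roundrobin
    server logstash1 logstash1.example.com:5000 check
    server logstash2 logstash2.example.com:5000 check
```

Keep in mind this balances TCP; if you stay on UDP you would need a balancer that supports UDP.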

I always suggest using Kafka with Logstash; in my use cases I have the following configuration.

Source of Logs > Some Producer > Kafka > Logstash > Elasticsearch
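The Logstash end of that pipeline consumes from Kafka and indexes into Elasticsearch; a sketch (broker addresses, topic, and group id are assumptions):

```conf
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topics => ["envoy-logs"]
    group_id => "logstash-envoy"
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
  }
}
```

The advantage is that Kafka retains the messages, so if Logstash is down or slow, consumers can catch up later instead of losing data.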

The producer that sends the logs to Kafka can be a lot of things; in my case I have the following tools as producers:

  • Filebeat reading logs and sending them to Kafka topics
  • A smaller Logstash instance receiving logs over TCP/UDP and sending them to Kafka topics
  • Vector (from Datadog) reading files or listening on TCP/UDP and sending logs to Kafka topics
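As an example of the first option, a Filebeat producer shipping files into a Kafka topic would look roughly like this (paths, hosts, and the topic name are placeholders):

```yaml
# filebeat.yml -- tail log files and publish them to Kafka
filebeat.inputs:
  - type: filestream
    paths:
      - /var/log/envoy/*.log

output.kafka:
  hosts: ["kafka1:9092", "kafka2:9092"]
  topic: "envoy-logs"
```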
