We've recently been asked to ship a portion of our logs to a client at their datacenter. We are currently using the syslog output module to send logs from logstash to another internal device at the end of our pipeline.
I'm thinking the easiest way to get the logs to this 3rd party may be to use the syslog output module to send them the requested logs, but I have a couple of questions about this.
Is it possible to encrypt the transfer of these logs using the syslog output module?
If we lose network connectivity to the destination syslog host, can we trust the syslog output module to queue these logs, and if so, for how long?
As for options, it depends on what the client is able to receive. If it's an HTTP endpoint, there's the http output (which also supports SSL). If it's plain data, you can use a simple TCP socket through the tcp output (which also supports SSL).
Thanks @jsvd. Do you know how well it handles queuing? Other than just testing it to see what happens, I'm wondering how it will handle the 3rd party endpoint going down for a few hours to a day.
@dfinn it will retry indefinitely. If you're using the in-memory queue, that means the internal queue will fill up and Logstash will apply backpressure to its inputs.
If you enable persistent queues in Logstash, it will be able to continue receiving events until the capacity configured for the PQ is reached; after that it will apply backpressure to the inputs.
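To make the backpressure idea concrete, here is a minimal Python sketch of a bounded buffer in front of a stalled output. It is not Logstash code: the function name `offer_events` and the capacity of 3 events are made up for illustration. In real Logstash the inputs block rather than reject events; the "held back" count below just stands for the events an input would have to sit on until the output recovers.

```python
import queue


def offer_events(buffer, events):
    """Enqueue events without blocking; return (accepted, held_back).

    Stops at the first event that does not fit, which models an input
    being told to wait because the queue in front of the output is full.
    """
    accepted = 0
    for event in events:
        try:
            buffer.put_nowait(event)
            accepted += 1
        except queue.Full:
            break
    return accepted, len(events) - accepted


# Hypothetical capacity of 3 events; the output is down, so nothing drains.
buffer = queue.Queue(maxsize=3)
accepted, held_back = offer_events(buffer, ["e1", "e2", "e3", "e4", "e5"])
print(accepted, held_back)

# Once the output recovers and drains one event, one more event fits.
buffer.get_nowait()
accepted_later, _ = offer_events(buffer, ["e4", "e5"])
print(accepted_later)
```

A larger buffer (which is roughly what a persistent queue gives you) only delays the point at which the inputs have to wait; it does not remove it.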
What @jsvd says is right, it will retry indefinitely.
However, TCP and UDP syslog streams are both vulnerable to data loss. When a connectivity problem occurs, even with TCP, some of the last-sent data can be lost, and Logstash can do nothing to prevent this (it's a limitation of TCP itself, not of the plugin).
That said, when the plugin detects a failure, it will retry sending the last transmission until it succeeds.
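As a rough illustration of "retry until successful", here is a small Python sketch. It is not the plugin's implementation: the function `send_until_successful`, the exponential backoff, and the delay values are all assumptions made for the example, and the flaky sender is a stand-in for a real socket.

```python
import time


def send_until_successful(send, payload, base_delay=1.0, max_delay=60.0,
                          sleep=time.sleep):
    """Call send(payload) until it stops raising OSError.

    Returns the number of attempts. The doubling backoff is an assumption
    for this sketch, not something the syslog output is known to do.
    """
    attempts = 0
    delay = base_delay
    while True:
        attempts += 1
        try:
            send(payload)
            return attempts
        except OSError:  # covers ConnectionResetError, ConnectionRefusedError, ...
            sleep(delay)
            delay = min(delay * 2, max_delay)


# Fake destination that fails twice before accepting the message.
failures = [ConnectionResetError("peer down"), ConnectionRefusedError("still down")]
delivered = []


def flaky_send(payload):
    if failures:
        raise failures.pop(0)
    delivered.append(payload)


waits = []  # record the delays instead of actually sleeping
attempts = send_until_successful(flaky_send, "<134>app: hello", sleep=waits.append)
print(attempts, waits, delivered)
```

Note that this only retries the transmission that was seen to fail. Anything written to the socket just before the failure was detected is not retried, which is the loss window described above.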
Excellent, thanks for the feedback. Sounds like this may work for us. We'll have to do some testing on what happens if the endpoint goes down for an extended period of time. We don't expect this to happen but this is the first time we have a requirement to ship logs to a 3rd party so it's new to us.
@jsvd or @jordansissel, just getting back to this now and had one other question regarding queuing. Without enabling PQs, what is the size of the in-memory queue? I'm trying to determine whether that will be good enough for us or if we need to look into enabling PQ.