TCP input without delimiter


I am trying to set up a Logstash instance that collects data from a TCP input. My pipeline is set up and I am able to send data to it with telnet.

For now my pipeline looks like this:

input {
  tcp {
    port => 1234
  }
}
output {
  file {
    path => "/tmp/tcp_test.json"
    codec => "json_lines"
  }
}

The output file is properly created when I telnet data to my Logstash instance, and everything seems to work well.

Things get complicated when I try to send real data. My firewall is configured to send data but nothing appears. A tcpdump shows that data is reaching the Logstash server on the right port, but my file stays empty. However, when I reload the pipeline, the file is finally created with only one line containing a single humongous JSON document in which all the logs are concatenated into one very long message. I captured the traffic and noticed that every log is sent in its own TCP frame without any delimiter (\n, \r\n, \0, ...) at the end of the log.

If I understand correctly how the TCP input plugin works, this explains why nothing appears in my file until my pipeline is stopped: the input waits for a delimiter to come but since there is none it considers the full data coming in as a single log, whatever its size is.
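To illustrate the behaviour described above, here is a minimal Python sketch of a delimiter-based decoder. It is a toy model only: the class `LineBuffer` and its methods are hypothetical names chosen for this example, and it is not the actual code of the Logstash tcp input or line codec. It only shows why nothing is emitted until a delimiter arrives, and why everything comes out as one event when the buffer is flushed at shutdown.

```python
class LineBuffer:
    """Toy model of a delimiter-based decoder (hypothetical, not Logstash code)."""

    def __init__(self, delimiter=b"\n"):
        self.delimiter = delimiter
        self.pending = b""  # bytes received but not yet terminated by a delimiter

    def feed(self, chunk):
        """Add received bytes and return the complete events found so far."""
        self.pending += chunk
        # Everything before the last delimiter is complete; the tail stays buffered.
        *events, self.pending = self.pending.split(self.delimiter)
        return events

    def flush(self):
        """Assumed shutdown behaviour: emit whatever is left as a single event."""
        leftover, self.pending = self.pending, b""
        return [leftover] if leftover else []


buf = LineBuffer()
print(buf.feed(b"first log"))   # no delimiter yet, so no event is emitted
print(buf.feed(b"second log"))  # still no delimiter, still no event
print(buf.flush())              # both logs come out concatenated as one event
```

With a sender that never transmits a delimiter, `feed` keeps returning an empty list and the buffer only grows, which matches the empty file and the single concatenated message seen after the reload.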

Is there a way to tell the input that logs should be delimited by the TCP frame only? Or maybe I am not using the right codec (I tried the default "line" codec and the "plain" codec, but the latter is ignored and replaced by the "line" one, see Logstash - wrong codec in tcp input plugin?)

Many thanks in advance!

TCP is a stream oriented protocol. You need to have a message framing protocol on top of it. That's typically done using a delimiter.


Hi @Badger and thank you for your answer. I have investigated a little more since the opening of this discussion and I will add some observations about my previous statement.

The devices sending us logs follow RFC5424 to send syslog messages. This RFC seems to allow formatting the messages as "<length> <message>" without adding delimiters between messages. As an example, sending the following:

some message
some other message

would give the following network capture:

12 some message18 some other message

An rsyslog server with a standard configuration receiving those logs through a TCP socket would write each message on its own line in a file.
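The length-prefixed framing seen in the capture above can be decoded without any delimiter: read the decimal length, skip one space, then take exactly that many bytes. The following Python sketch shows the idea under the assumption that the length counts bytes and is followed by a single space, as in the example. `OctetCountingDecoder` is a hypothetical name for this illustration, not an existing Logstash or rsyslog component.

```python
class OctetCountingDecoder:
    """Incremental decoder for '<length> <message>' framing (illustrative sketch)."""

    def __init__(self):
        self.pending = b""  # bytes received but not yet turned into messages

    def feed(self, chunk):
        """Add received bytes and return every complete message found."""
        self.pending += chunk
        messages = []
        while True:
            head, sep, rest = self.pending.partition(b" ")
            if not sep:
                break  # the length prefix itself is not complete yet
            length = int(head)  # raises ValueError on a malformed prefix
            if len(rest) < length:
                break  # the message body has not fully arrived yet
            messages.append(rest[:length])
            self.pending = rest[length:]
        return messages


decoder = OctetCountingDecoder()
print(decoder.feed(b"12 some message18 some other message"))
# [b'some message', b'some other message']
```

Because the decoder keeps incomplete data in `pending`, it gives the same result whether the two messages arrive in one read, in one read each, or split at arbitrary points in the stream.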

A solution could be to use a syslog input with the grok_pattern option correctly set (and even this could be tricky).

But (because there is always a but) I will need to encrypt the data to ensure its confidentiality during transport and, unfortunately, the syslog input does not seem to support encryption options.

Do you know of a way to make it work without putting an rsyslog in front of Logstash to do the translation?

Thank you

So it looks like the framing in your case is a message length preceding each message. I am not aware of any input that can handle that.
