Grok parse failure seen with tcp input plugin in Logstash pipeline

Hi,
I have the following Logstash pipeline configuration:

input {
    tcp {
        port => 5102
        codec => plain
    }
}
filter {
    grok {
        match => {"message" => "%{SYSLOGTIMESTAMP:time} %{DATA:trace_name} %{DATA:node_name} %{DATA:count} %{DATA:thread_id} %{GREEDYDATA:data}"}
    }
}
output {
    file {
        path => "%{trace_name}.txt"
        codec => line
    }
}

I have written a client program that connects to the Logstash server instance on port 5102 and sends it log data containing multiple lines. I want to store all log data with the same trace name in its own file. In some cases, however, the log data is written to a file literally named "%{trace_name}.txt", and on debugging further I found that this is caused by a grok parse failure. It looks like the data received from the TCP socket in the input plugin is not processed line by line. Whenever the received log data is terminated with a "\n", the grok filter parses the log message successfully, but it fails if the message is truncated.
Can someone suggest what configuration is needed so that log data received from the TCP socket is processed by the grok filter one line at a time?

Thanks,
Arinjay
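
On the framing question itself: the plain codec does not split a TCP stream into lines, so a single event can hold a partial line or several lines, depending on how the data arrives on the socket. A minimal sketch of the same input with the line codec, which emits one event per newline-terminated line (as far as I know, this is also what the tcp input uses when no codec is set):

```
input {
    tcp {
        port => 5102
        # line codec: one event per "\n"-terminated line,
        # instead of one event per chunk read from the socket
        codec => line
    }
}
```

This only helps if the client terminates every log line with "\n". It does not fix messages whose format differs from the grok pattern.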

The file name "%{trace_name}.txt" means the trace_name field is empty: grok did not match, so the field was never set. Handle the errors:

  1. Check whether your grok pattern always matches; use the ruby debugger. You might need 2 grok patterns, or "trace_name" might be an optional value.

  2. Handle the empty value, something like this:

if ![trace_name] {
  mutate {
    add_field => { "[trace_name]" => "/path/filename.txt" }
  }
}
  3. Handle grok errors in the output:

if "_grokparsefailure" in [tags] {
  file {
    path => "/path/grok_error.txt"
    codec => line
  }
}
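
Combined with the original file output, the error handling could look roughly like the sketch below. It relies on the default _grokparsefailure tag that grok adds when no pattern matches, and /path/grok_error.txt is only a placeholder path.

```
output {
    if "_grokparsefailure" in [tags] {
        # events that grok could not parse go to one error file
        file {
            path => "/path/grok_error.txt"
            codec => line
        }
    } else {
        # parsed events are split into one file per trace name
        file {
            path => "%{trace_name}.txt"
            codec => line
        }
    }
}
```

With the else branch, unparsed events no longer end up in a file literally named "%{trace_name}.txt".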

Thanks @Rios. I checked the grok pattern in the debugger and found that the messages come in 2 different formats. One format has all the fields defined in the grok pattern above, but the other is missing the "count" field. Grok parsing works fine when the message contains all the fields. I think I need to make the "count" field optional in the pattern. Can you tell me how to make this field optional? I tried something like the pattern below, but it didn't work.

match => {"message" => "%{SYSLOGTIMESTAMP:time} %{DATA:trace_name} %{DATA:node_name} (%{DATA:count})? %{DATA:thread_id} %{GREEDYDATA:data}"}

Thanks,
Arinjay

A literal space " " is mandatory: exactly one space must be present at that position. You can use \s* or %{SPACE} instead, both of which match zero or more whitespace characters. Try:

%{DATA:time} %{DATA:trace_name} %{DATA:node_name}%{SPACE}(%{DATA:count})?\s*%{DATA:thread_id} %{GREEDYDATA:data}
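
An alternative, following the "2 grok patterns" idea mentioned earlier in the thread, is to list both formats and let grok try them in order. This is only a sketch: it assumes that count is always an integer, that thread_id never consists of digits alone, and that none of the fields before data contain spaces. None of that is confirmed by the thread, so adjust the patterns to the real log lines.

```
filter {
    grok {
        # patterns are tried in order; the first one that matches wins
        match => {
            "message" => [
                # format 1: count is present (assumed to be an integer)
                "%{SYSLOGTIMESTAMP:time} %{NOTSPACE:trace_name} %{NOTSPACE:node_name} %{INT:count} %{NOTSPACE:thread_id} %{GREEDYDATA:data}",
                # format 2: count is missing
                "%{SYSLOGTIMESTAMP:time} %{NOTSPACE:trace_name} %{NOTSPACE:node_name} %{NOTSPACE:thread_id} %{GREEDYDATA:data}"
            ]
        }
    }
}
```

If thread_id can be purely numeric, the two formats cannot be told apart this way and the first pattern may take thread_id as count.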

Thanks Rios. I was able to fix the grok pattern and see the log messages getting stored with the correct file names.

Thanks,
Arinjay

You're welcome.
Just stay consistent: use either the regex syntax (\s+) or the Logstash syntax (%{SPACE}) in the grok pattern. I intentionally mixed both to show that it is possible.
Summary:
" " - a single literal space; the fields must be separated by exactly one space character.
\s* - zero or more whitespace characters. \s matches a space, a tab, a carriage return, a line feed, or a form feed.
\s+ - one or more whitespace characters.
%{SPACE} - same as \s*, as defined in the grok patterns.
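
Applied to the pattern from this thread, sticking to one style could look like the sketch below. It uses \s+ everywhere and moves the separator inside the optional group, so a missing count does not leave a stray separator behind. NOTSPACE and INT are assumptions about the field contents (no spaces inside the fields, count is an integer, thread_id is not purely numeric); they are not taken from the real log lines.

```
filter {
    grok {
        match => {
            # (?:...)? makes count and its trailing whitespace optional together
            "message" => "%{SYSLOGTIMESTAMP:time}\s+%{NOTSPACE:trace_name}\s+%{NOTSPACE:node_name}\s+(?:%{INT:count}\s+)?%{NOTSPACE:thread_id}\s+%{GREEDYDATA:data}"
        }
    }
}
```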
