I have written a client program that connects to a Logstash server instance on port 5102 and sends it log data containing multiple lines. I want to store all log data with the same trace name in its own file. In some cases, however, the log data is written to a file literally named "%{trace_name}.txt", and on debugging further I found that this is due to a grok parse failure. It looks like the data received from the TCP socket in the input plugin is not being processed line by line: whenever the received log data is terminated with a "\n", the grok filter parses the message successfully, but it fails if the message is truncated.
Can someone suggest what configuration is needed so that log data received from the TCP socket is processed by the grok filter one line at a time?
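For reference, a minimal sketch of the setup described above (the port and the trace_name field are taken from the question; the file output path is an assumption for illustration):

```
input {
  tcp {
    port  => 5102
    codec => line   # split the incoming stream on "\n" so each event is one log line
  }
}
output {
  file {
    path => "/var/log/traces/%{trace_name}.txt"
  }
}
```

Note that the tcp input already uses the line codec by default, so a partial line without a trailing "\n" is buffered until the newline arrives; making the client terminate every message with "\n" is usually the actual fix.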
Thanks @Rios. I checked the grok pattern in the debugger and found that the messages come in two different formats: one has all the fields defined in the grok pattern above, but the other is missing the "count" field. Grok parsing works fine when the message contains all the fields, so I think I need to make the "count" field optional in the pattern. Can you tell me how to make this field optional? I tried something like the below, but it didn't work.
match => {"message" => "%{SYSLOGTIMESTAMP:time} %{DATA:trace_name} %{DATA:node_name} (%{DATA:count})? %{DATA:thread_id} %{GREEDYDATA:data}"}
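One approach that often works is a non-capturing optional group that also swallows its own trailing space, so the rest of the pattern stays aligned whether or not the field is present. This is a sketch, not the thread's confirmed answer; it also assumes "count" is numeric, since %{NUMBER} is far less ambiguous than %{DATA} inside an optional group:

```
match => { "message" => "%{SYSLOGTIMESTAMP:time} %{DATA:trace_name} %{DATA:node_name} (?:%{NUMBER:count} )?%{DATA:thread_id} %{GREEDYDATA:data}" }
```

The original attempt `(%{DATA:count})?` tends to fail because %{DATA} is a lazy `.*?`, which can match the empty string, leaving the surrounding spaces unaccounted for.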
You're welcome.
Just be consistent: use either the regex syntax \s+ or the Logstash syntax %{SPACE} in the grok match. I intentionally mixed both to show that it is possible.
Summary:
" " - a single literal space; fields must be separated by exactly one space character.
\s* - zero or more whitespace characters. \s matches a space, a tab, a carriage return, a line feed, or a form feed.
\s+ - one or more whitespace characters.
%{SPACE} - the same as \s*, as defined in the grok patterns.
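The differences above can be checked quickly outside Logstash, since grok compiles down to regular expressions (a sketch in Python; Logstash itself uses Oniguruma, but \s behaves the same way for these cases):

```python
import re

# "\s" matches whitespace: space, tab, carriage return, line feed, form feed
assert re.fullmatch(r"a\s+b", "a \t b")      # \s+ : one or more whitespace chars
assert re.fullmatch(r"a\s*b", "ab")          # \s* : zero whitespace also matches
assert not re.fullmatch(r"a\s+b", "ab")      # \s+ : needs at least one
assert re.fullmatch(r"a b", "a b")           # literal space: exactly one
assert not re.fullmatch(r"a b", "a  b")      # two spaces do not match a single literal space
```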