Two different logs match the filter, but one of them is not parsed

Hello, I have a standalone setup, and recently one of my machines has been reaching the server with several different logs, mostly differing in the "syslog_pri". One of them is not being parsed and gets _grokparsefailure; however, when I try it in a grok tester, that specific log matches the same filter in the same way the others do.
My input is a tcp/udp input from other machines, the filter is a custom build keyed on the different tags I set in the input, and then there is the output to Elasticsearch (which I view in Kibana).
Is there any known case where different logs that match the same grok filter have different outcomes?

If you are getting _grokparsefailure then please show us the grok filter configuration and an example of a log entry that it fails to parse.

Grok used (after the first one, which it is not supposed to match, hence the if):
if ("_grokparsefailure" in [tags]) {
    #try another match
    #test for omniswitch - two potential matches
    grok {
        match => {
            "message" => [
                "<%{NONNEGINT:syslog_pri}>[\s](?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp})[\s]%{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}",
                "<%{NONNEGINT:syslog_pri}>[\s](?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp})[\s]%{SYSLOGHOST:syslog_hostname} %{SYSLOGPROG}%{DATA}[\s\n]%{GREEDYDATA:syslog_message}",
                "(?m)<%{NONNEGINT:syslog_pri}>(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp}) %{SYSLOGPROG}%{DATA}[\s\n]%{GREEDYDATA:syslog_message}"
            ]
        }
        add_field => { "source_type" => "omniswitch" }
        remove_tag => [ "_grokparsefailure" ]
    }
}

Log examples:
This one matches that grok filter but still gets _grokparsefailure:
<46>2019-07-04T09:35:40.100139Z 0.0.0.0 dpiEventLog-1/4 - - - 87f95d94 0a0fa86c 57324 443 6 from-sub permit sap 1/3/8:0 session-filter RDNET_INCOMING_GENERAL 101

This one matches the grok filter but does not show _grokparsefailure at the end:
<150>2019-07-04T09:40:51+00:00 localhost haproxy[17408]: 0.0.0.0:56774 [04/Jul/2019:09:39:08.961] localhost~ api_hosts/172.16.1.6 44/0/1/28/102139 101 222 - - sD-- 85/85/80/22/0 0/0 "GET /api/v1/namespaces/default/pods/1-60-ag-master-1450-build-19a-l3c5g-sc3p4/exec?command=/bin/sh&container=build-195&stdin=true&stdout=true&stderr=true HTTP/1.1"

As you say, that line matches that pattern, and grok parses it OK. That means you must have at least one more grok filter running after that one, and that is what is adding the _grokparsefailure.
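To illustrate the mechanics, here is a minimal sketch (with made-up patterns): every grok filter that fails to match tags the event, so a later grok can add _grokparsefailure to an event that an earlier grok parsed fine.

    filter {
        #this one matches and extracts syslog_pri
        grok {
            match => { "message" => "<%{NONNEGINT:syslog_pri}>%{GREEDYDATA:rest}" }
        }
        #this one never matches, so the event still ends up
        #with "_grokparsefailure" in [tags]
        grok {
            match => { "message" => "^THIS_WILL_NOT_MATCH" }
        }
    }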

If you point path.config at a directory, then logstash will concatenate every file in the directory to build the configuration. That would include files such as logstash.conf.bak or logstash.config-

In order to test whether this is happening, you could try running with '--log.level debug --config.test_and_exit' so that it shows you the configuration it is using.
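For example (a sketch; substitute your own config path and install location):

    bin/logstash -f /etc/logstash/conf.d --log.level debug --config.test_and_exit

The debug output includes the merged configuration, so any stray backup files that got concatenated in should show up there.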

Correct, I have that setup, with different .config files for each step (input, filter, output).
At the input phase, these logs are tagged as "syslog_udp", and they go to the respective filter, ignoring the others.
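For context, the input side looks roughly like this (a sketch; the ports are just examples, the real inputs set the type that the filter checks):

    input {
        udp {
            port => 514
            type => "syslog_udp"
        }
        tcp {
            port => 514
            type => "syslog_udp"
        }
    }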
That filter consists of the following:
filter {
if [type] == "syslog_udp" {
    #parse 'message' field
    #first match is default multiline match
    grok {
        match => {
            "message" => [
                "(?m)<%{NONNEGINT:syslog_pri}>(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp}) %{SYSLOGPROG}%{DATA:test}[\s\n][ID %{DATA:message_id}] %{GREEDYDATA:syslog_message}",
                "(?m)<%{NONNEGINT:syslog_pri}>(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp}) \t%{GREEDYDATA:syslog_message}"
            ]
        }
        add_field => { "source_type" => "solaris" }
    }

    if ("_grokparsefailure" in [tags]) {
        #try another match
        #test for omniswitch - two potential matches
        grok {
            match => {
                "message" => [
                    "<%{NONNEGINT:syslog_pri}>[\s]*(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp})[\s]*%{SYSLOGHOST:syslog_hostname} %{GREEDYDATA:syslog_message}",
                    "<%{NONNEGINT:syslog_pri}>[\s]*(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp})[\s]*%{SYSLOGHOST:syslog_hostname} %{SYSLOGPROG}%{DATA}[\s\n]%{GREEDYDATA:syslog_message}",
                    "(?m)<%{NONNEGINT:syslog_pri}>(?:%{SYSLOGTIMESTAMP:syslog_timestamp}|%{TIMESTAMP_ISO8601:syslog_timestamp}) %{SYSLOGPROG}%{DATA}[\s\n]%{GREEDYDATA:syslog_message}"
                ]
            }
            add_field => { "source_type" => "omniswitch" }
            remove_tag => [ "_grokparsefailure" ]
        }
    }

    if ("_grokparsefailure" in [tags]) {
        #test for omniswitch old style
        #custom case insensitive (?i) match for messages of type "<134>THU JUL 02 09:32:32 IPv6(38) Data: Port 1/7 (gport 6) down"
        grok {
            match => {
                "message" => [
                    "(?i)<%{NONNEGINT:syslog_pri}>%{DAY} (?:%{SYSLOGTIMESTAMP:syslog_timestamp})[\s]+%{SYSLOGPROG}%{DATA}[\s\n]%{GREEDYDATA:syslog_message}"
                ]
            }
            add_field => { "source_type" => "omniswitch_old" }
            remove_tag => [ "_grokparsefailure" ]
        }
    }

    #check for 7750
    if ([program] == "7750:") {
        #7750 - replace existing field by 7750
        mutate {
            replace => { "source_type" => "7750" }
        }
    }

    #check for 6450
    if ([program] =~ /6450/) {
        #6450 - replace existing field by omniswitch
        mutate {
            replace => { "source_type" => "omniswitch" }
        }
    }
    #save original @timestamp as 'received_at'
    #as further we will overwrite @timestamp with 'syslog_timestamp'
    #need to use below workaround,
    #see http://stackoverflow.com/questions/25189872/logstash-how-to-make-a-copy-of-the-timestamp-field-while-maintaining-the-same
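    #note: event['field'] is the pre-5.x event API; Logstash 5.x and later use event.get/event.set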
    ruby {
        code => "event['received_at'] = event['@timestamp']"
    }

    #parse 'host' field into ip/hostname and optional 'port' field
    #not saving 'port' field
    #need to make distinction between IPv4 and IPv6 for now,
    #see https://github.com/elastic/elasticsearch/issues/3714
    grok {
        match => { "host" => "(?:%{IPV4:received_from_ipv4}|%{IPV6:received_from_ipv6}|%{HOSTNAME:received_from_hostname})(:%{POSINT})?" }
    }

    #extract facility code and severity from syslog_pri
    syslog_pri { }

    #replace @timestamp with time of 'syslog_timestamp'
    date {
           match => [ "syslog_timestamp", "ISO8601", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }

    #remove fields that are no longer necessary
    mutate {
            #remove_field => ["syslog_timestamp", "host"]
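            #the gsub below strips what looks like the RFC 5424 version number ("1") after the pri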
            gsub => ["message", ">1 ", ">"]
            #gsub => ["message", "Z ", "+00:00"]
    }
}

}

It is a bit of a mess; however, one log goes through it fine and the other does not.
This is an old version of Logstash (2.3.4), and I wanted to see if I could still put it to work.
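One way to narrow down which grok is adding the tag is to give each grok block its own failure tag via the filter's standard tag_on_failure option. A sketch for the first block (the match pattern here is just a stand-in; keep your existing patterns):

    grok {
        #existing solaris patterns go here, unchanged
        match => { "message" => "(?m)<%{NONNEGINT:syslog_pri}>%{GREEDYDATA:syslog_message}" }
        add_field => { "source_type" => "solaris" }
        tag_on_failure => [ "_grokparsefailure_solaris" ]
    }

With a distinct tag per block, the event's tags show exactly which grok failed to match, including any grok hiding in a concatenated backup file.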
