Another mysterious work logstash with errors _grokparsefailure

again I encounter a problem in the work of logstash, and specifically with grock. Everything is fine in the debugger, the messages are parsed, but as soon as I apply this configuration to the production, then these messages are received with errors, although the second part is successfully parsed. This is not the first time such a problem has occurred. Usually the problem in such cases was solved by restarting logstash, but now it does not work. Is it not stable for you there?

grok debuggers are not grok and there are patterns that work differently in each of them. Show us your grok filter and an example of an event that gets a parse failure.

Thanks for your reply.
Example of a message to be parsed:

10.1.1.2.1688371819767.414890.G_B2C_BETA,07/03/2023 11:10:26.059,sf_sap_error.log,custom,err info,Type: Java Exception / Code:com.audium.server.AudiumException / Message: Incorrect syntax near 'янівка'.

Here is an example of my grock:

%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}

Custom Patterns:

IDCUST \d+[^.].\d+[^.]
TIMECUST \d{2}/\d{2}/\d{4} %{TIME}
IP (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))(?![0-9])
TEXCUST [a-zA-Z0-9-_ ]+
TEXCUST2 [a-zA-Z0-9-_ .]+|

My grok debugger and parsing messages:

Have you tried without a number at begging? I.e. %{TEXCUST2:integration}

What does your Logstash configuration looks like?

Just tested your grok with the message shared and got no issue:

{
                "app" => "G_B2C_BETA",
         "@timestamp" => 2023-07-03T12:37:50.643684857Z,
               "time" => "07/03/2023 11:10:26.059",
    "3rd_integration" => "sf_sap_error.log",
                 "ip" => "10.1.1.2",
           "@version" => "1",
            "message" => "10.1.1.2.1688371819767.414890.G_B2C_BETA,07/03/2023 11:10:26.059,sf_sap_error.log,custom,err info,Type: Java Exception / Code:com.audium.server.AudiumException / Message: Incorrect syntax near 'янівка'.",
               "host" => "lab",
                 "id" => "1688371819767.414890",
                "msg" => "Type: Java Exception / Code:com.audium.server.AudiumException / Message: Incorrect syntax near 'янівка'.",
                 "fl" => "err info",
            "element" => "custom"
}

You need to share your Logstash configuration and some sample of messages that are not being parsed.

I have tested here and only error is in the 3rd_integration field.

Might be related to LS version. @San9 which version LS do you have?

Config logstash

filter {
  if "cc-activity" in [tags] {
        if ([message] =~ "apm_g_999_release,data,floor_2,rep_BR=Передача показів: Покази прийнято") {
    grok {
           patterns_dir => ["/etc/logstash/conf.d/patterns"]
           match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg1}"}
             }
                kv {
                source => "msg1"
                field_split => ";"
                value_split => "="
                }
        }
        else if ([message] =~ "call_oem_put,data,RESP Fer") or ([message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 005") {
                    grok {
                        patterns_dir => ["/etc/logstash/conf.d/patterns"]
                        match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}"}
             }
        }
        else if [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 005" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 002" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 001" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 2 - 533" or [message] =~ "sf_sap_error.log,custom,err info,Type:" {
                    grok {
                        patterns_dir => ["/etc/logstash/conf.d/patterns"]
                        match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}"}
             }
        mutate {
                        add_tag => [ "count" ]
                }
        }
	}
 }

error message

10.1.1.2.1688388330123.466601.G_B2C_BETA,07/03/2023 15:45:31.348,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: XML parsing: line 0 character 0 unrecognized input signature

norm message

10.1.1.2.1688389540084.743428.G_B2C_BETA,07/03/2023 16:08:39.502,apm_g_999_release,data,floor_2,rep_BR=OK;v_App_Type=2C;v_OSR=OM;v_RAID=N;mok_Alive=Y;

and

10.1.1.2.1688389564428.743516.G_B2C_BETA,07/03/2023 16:08:20.141,call_oem_put,data,RESP Fer,Message OK.

Hi, My version - 8.3.3-1.
My Grok Debugger parses messages without problems, but in the config itself there are problems with some of the messages that I wrote above.

Can't replicate, the error message works fine for me using that pipeline:

{
                 "fl" => "err info",
                "app" => "G_B2C_BETA",
         "@timestamp" => 2023-07-03T13:45:32.117177908Z,
               "time" => "07/03/2023 15:45:31.348",
           "@version" => "1",
               "host" => "lab",
            "message" => "10.1.1.2.1688388330123.466601.G_B2C_BETA,07/03/2023 15:45:31.348,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: XML parsing: line 0 character 0 unrecognized input signature",
                 "id" => "1688388330123.466601",
               "tags" => [
        [0] "count"
    ],
    "3rd_integration" => "sf_sap_error.log",
                "msg" => "Type: Java Exception / Code: com.audium.server.AudiumException / Message: XML parsing: line 0 character 0 unrecognized input signature",
                 "ip" => "10.1.1.2",
            "element" => "custom"
}

I also can’t find a pattern when in the same config some conditions work, and these are some gaps that cannot be reproduced in the test environment

Anyone else have any ideas why this isn't working as it should?

Without more information is pretty hard to suggest anything, the message you share was parsed without any issue on my test.

Can you shared how it looks like in Kibana, in the Json tab on discover for a document that should've been parsed, but wasn't?

hi, this is what it looks like in kibana:

{
  "_index": "cvp-gcc-activity-2023.07",
  "_id": "1PTpRIkB8lGSjmELmGUp",
  "_version": 1,
  "_score": 0,
  "_source": {
    "event": {},
    "@timestamp": "2023-07-11T12:25:05.807Z",
    "message": "10.1.1.2.1689078291751.433472.G_B2C,07/11/2023 15:25:04.083,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'.",
    "input": {},
    "host": {
      "os": {}
    },
    "ecs": {},
    "log": {
      "file": {
        "path": "C:\\Cisco\\CVP\\VXMLServer\\applications\\G_B2C\\logs\\ActivityLog\\activity_log2023-07-11-14-32-52.txt"
      }
    },
    "@version": "1",
    "agent": {},
    "tags": [
      "gcc-activity",
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "count",
      "lst02"
    ]
  },
  "fields": {
    "@timestamp": [
      "2023-07-11T12:25:05.807Z"
    ],
    "message.keyword": [
      "10.1.1.2.1689078291751.433472.G_B2C,07/11/2023 15:25:04.083,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'."
    ],
    "log.file.path": [
      "C:\\Cisco\\CVP\\VXMLServer\\applications\\G_B2C\\logs\\ActivityLog\\activity_log2023-07-11-14-32-52.txt"
    ],
    "@version": [
      "1"
    ],
    "tags.keyword": [
      "gcc-activity",
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "count",
      "lst02"
    ],
    "@version.keyword": [
      "1"
    ],
    "message": [
      "10.1.1.2.1689078291751.433472.G_B2C,07/11/2023 15:25:04.083,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'."
    ],
    "log.file.path.keyword": [
      "C:\\Cisco\\CVP\\VXMLServer\\applications\\G_B2C\\logs\\ActivityLog\\activity_log2023-07-11-14-32-52.txt"
    ],
    "tags": [
      "gcc-activity",
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "count",
      "lst02"
    ]
  }
}

The grok patterns that you share will successfully parse the two messages that you say get _grokparsefailure tags. That suggests that you are not actually running the configuration that you think you are running.

I suggest triple-checking whether the grok filters and custom patterns are correct in the production environment.

Again I could not replicate your error.

I used the same filter you shared:

filter {
    if "cc-activity" in [tags] {
        if ([message] =~ "apm_g_999_release,data,floor_2,rep_BR=Передача показів: Покази прийнято") {
            grok {
                patterns_dir => ["/opt/logstash/patterns"]
                match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg1}"}
                add_tag => ["first-grok"]
            }
            kv {
                source => "msg1"
                field_split => ";"
                value_split => "="
            }
        } else if ([message] =~ "call_oem_put,data,RESP Fer") or ([message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 005") {
            grok {
                patterns_dir => ["/opt/logstash/patterns"]
                match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}"}
                add_tag => ["second-grok"]
            }
        } else if [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 005" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 002" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 001" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 2 - 533" or [message] =~ "sf_sap_error.log,custom,err info,Type:" {
            grok {
                patterns_dir => ["/opt/logstash/patterns"]
                match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}"}
                add_tag => ["third-grok"]
            }
            mutate {
                add_tag => [ "count" ]
            }
        }
    }
}

The only chage I made was to add a tag for each grok to see which one would parse it.

Then I created this sample file:

{ "tags": "gcc-activity", "message": "10.1.1.2.1689078291751.433472.G_B2C,07/11/2023 15:25:04.083,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'."}

And got this result:

{
                "app" => "G_B2C",
         "@timestamp" => 2023-07-11T15:36:25.538603884Z,
               "time" => "07/11/2023 15:25:04.083",
                 "fl" => "err info",
               "host" => "lab",
            "message" => "10.1.1.2.1689078291751.433472.G_B2C,07/11/2023 15:25:04.083,sf_sap_error.log,custom,err info,Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'.",
                 "id" => "1689078291751.433472",
               "tags" => [
        [0] "gcc-activity",
        [1] "third-grok",
        [2] "count"
    ],
           "@version" => "1",
    "3rd_integration" => "sf_sap_error.log",
                "msg" => "Type: Java Exception / Code: com.audium.server.AudiumException / Message: Incorrect syntax near '�'.",
                 "ip" => "10.1.1.2",
            "element" => "custom"
}

I would suggest the same as @Badger, check if you are realy running the configuration you shared.

Also, add a different tag_on_failure for each grok to help troubleshoot which one is not working.

1 Like

I sent the configuration from the working pipeline, which I currently have running. if I noticed in my two conditions, if else they use the same grock, if I choose another grock, then the situation repeats itself, it feels so good that in the messages I didn’t take into account the special symbol or something else. Since other messages are successfully parsed. I add tag outcc.

I also noticed - when I added the tag "count-grok" you suggested to the configuration, I didn’t see it only that old one --"count"
example:

        else if [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 005" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 002" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 1 - 001" or [message] =~ "sf_sap_error.call_db_parse_error,data,sap_Error,COUNT: 2 - 533" or [message] =~ "sf_sap_error.log,custom,err info,Type:" {
                    grok {
                        patterns_dir => ["/etc/logstash/conf.d/patterns"]
                        match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{TEXCUST2:3rd_integration},%{TEXCUST:element},%{GREEDYDATA:fl},%{GREEDYDATA:msg}"}
                        add_tag => ["count-grok"]
             }
        mutate {
                        add_tag => [ "count" ]
                }
        }

in kibana

In this case, the tag count-grok will only be added if that specific grok is successful.

Since you have many groks, first thing you should do is use a different failure tag for each one so you can know exactly where it is failing.

I'm trying to do this, for each grock a separate tag.
for that problematic message, I simplified and shortened the parsing of messages and everything went well ...
now I will add one value to the parsing and look at what errors there will be.

match => {"message" => "%{IP:ip}.%{IDCUST:id}.%{TEXCUST:app},%{TIMECUST:time},%{GREEDYDATA:msg4}"}

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.