Grok parse issue parsing previously parsed string



We are having trouble with a grok pattern. We are trying to extract a run of digits from the middle of a long string that continues on both sides of the digits we want. Our theory is that the problem lies in our use of %{GREEDYDATA} to separate the string. We have tried replacing %{GREEDYDATA} with %{NUMBER}, with the same result: a _grokparsefailure. The string we are handling has already been parsed once; could that have something to do with it? Any thoughts?

grok {
    match => { "[doc][Message]" => "%{GREEDYDATA:Bloat}	Case:	%{GREEDYDATA:Case}" }
}

mutate {
    add_field => [ "CommandID", "%{[doc][Message][1]}" ]
}

if ("CloseND" in [CommandID]) {
    grok {
        match => { "Bloat" => "%{GREEDYDATA:Bloat2}Time (ms):	%{GREEDYDATA:SessionLength}" }
    }
}

We receive CommandID from elsewhere with no issues.

(Magnus Bäck) #2

What does an example event look like? Use a stdout { codec => rubydebug } output to dump the raw event produced by Logstash.
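
For anyone following along, a minimal sketch of where that output would sit in the pipeline config. Only the stdout/rubydebug part comes from the post above; placing it in its own output section next to any existing outputs is an assumption.

```
output {
    # Print every event to the console as a pretty-printed Ruby hash,
    # so the exact field names and values can be inspected.
    stdout { codec => rubydebug }
}
```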

{
           "offset" => 80223,
       "prospector" => {
        "type" => "log"
    },
              "doc" => {
              "Time" => "2018-03-03T10:38:06.2677861+00:00",
             "Level" => "Info",
              "User" => "TaGer1",
            "Method" => "Utilities.DefaultLogs.CommandUsageLog::Info",
         "AppDomain" => "DefaultDomain [1]",
           "Message" => [
            [0] "3D3D3D3D-3D3D-3D3D-3D3D-3D3D3D3D3D3D",
            [1] "CloseND",
            [2] "Session GUID: f0817cea-63ee-4d6d-a7fa-ab47404b91aa",
            [3] "Time (ms):",
            [4] "135625",
            [5] "Case:",
            [6] "7515ebae-5918-4e9a-b59c-5c246028a0e9"
        ],
            "Thread" => "[1]",
              "Host" => "DESKTOP-S3T0CB1",
        "Categories" => "client.performance.commands, common.assembly.utilities"
    },
    "filetimestamp" => "",
             "host" => "LKP-C-SI-SET-1",
       "@timestamp" => 2018-05-02T11:47:29.200Z,
           "source" => "C:\\Users\\si-set\\Desktop\\CommandUsage\\Commandusage.w3wp.default.",
            "Bloat" => "3D3D3D3D-3D3D-3D3D-3D3D-3D3D3D3D3D3D\tCloseND\tSession GUID: f0817cea-63ee-4d6d-a7fa-ab47404b91aa\tTime (ms):\t135625",
             "path" => "C:\\Users\\si-set\\Desktop\\CommandUsage",
          "logtype" => "Commandusage",
         "@version" => "1",
             "beat" => {
            "name" => "LKP-C-SI-SET-1",
         "version" => "6.2.2",
        "hostname" => "LKP-C-SI-SET-1"
    },
             "Case" => "7515ebae-5918-4e9a-b59c-5c246028a0e9",
        "CommandID" => "CloseND",
             "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_grokparsefailure"
    ]
}

If the CommandID is CloseND, we want to grok the Time (ms): value out of Bloat, preferably as a number rather than a string, as long as that doesn't have a significant impact on performance.

(Magnus Bäck) #4

[doc][Message] is an array. I'm not sure the grok filter deals with that at all.


We split [doc][Message] into an array after the first grok match. The case is not always at the same array index, so we want to extract it before splitting [doc][Message]. After that, we want to match Time (ms) only if the CommandID is CloseND, so we try to take it from the Bloat field that the first grok match creates. We do the split so that we can extract and compare CommandID.

We can include the whole config if you want to see it.
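
In the meantime, the ordering described above could be sketched roughly as follows. This is a reconstruction, not our full config: the tab delimiter in the split is inferred from the Bloat field in the example event, and writing it as "\t" assumes config.support_escapes: true in logstash.yml (otherwise a literal tab character would be needed in the split string).

```
filter {
    # 1. Pull Case out while [doc][Message] is still one tab-separated string.
    #    In a grok pattern, \t is a regex escape that matches a tab.
    grok {
        match => { "[doc][Message]" => "%{GREEDYDATA:Bloat}\tCase:\t%{GREEDYDATA:Case}" }
    }

    # 2. Split the message into an array (delimiter is an assumption, see above).
    mutate {
        split => { "[doc][Message]" => "\t" }
    }

    # 3. Copy the command name; index 1 matches the example event.
    mutate {
        add_field => { "CommandID" => "%{[doc][Message][1]}" }
    }
}
```

The split and the add_field are kept in separate mutate blocks so that the order in which they run is explicit.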


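It was this grok match that was failing: we used parentheses as part of the delimiter without escaping them with \. Unescaped, they are treated as a regex group, so the pattern looks for "Time ms:" rather than the literal text "Time (ms):".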

match => { "Bloat" => "%{GREEDYDATA:Bloat2}Time (ms):	%{GREEDYDATA:SessionLength}"}

This worked.

match => { "Bloat" => "%{GREEDYDATA:Bloat2}Time \(ms\):	%{GREEDYDATA:SessionLength}"}
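
Since the original question also asked for the value as a number, here is a possible variation, offered as an untested sketch. It uses grok's type suffix (:int) on the capture, assumes the text after the tab is always a plain number, and drops the leading %{GREEDYDATA:Bloat2} on the assumption that Bloat2 is not needed elsewhere (grok patterns are not anchored unless ^ or $ is added).

```
if ("CloseND" in [CommandID]) {
    grok {
        # \( and \) match literal parentheses; \t matches the tab delimiter.
        # The :int suffix stores SessionLength as an integer instead of a string.
        match => { "Bloat" => "Time \(ms\):\t%{NUMBER:SessionLength:int}" }
    }
}
```

If the field has to stay a string at capture time for some reason, a mutate filter with convert => { "SessionLength" => "integer" } after the grok should give the same end result.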
