KV filter and fields separator

Here is another way to do it without grok at all.

FIXED

filter {

  # Without GROK
  mutate {
    gsub => [
      # Clean Up Beginning
      "message", '^\"', '',
      # Clean Up End
      "message", '}\"', '}',
      "message", ';;', '',
      # Clean up bad characters
      "message", '\n', '',
      "message", '""', '"'
    ]
  }

  json {
    source => "message"
  }

#   # OR WITH GROK
#   grok {
#     match => { "message" => "\"%{DATA:message_detail}\";;" }
#   }

#   mutate {
#     gsub => [
#       "message_detail", '\n', '',
#       "message_detail", '""', '"'
#     ]
#   }

#   json {
#     source => "message_detail"
#   }

}
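
If you want to try it by itself, a minimal end-to-end test could look something like this (just a sketch: it reads sample lines from stdin and prints each parsed event with rubydebug; if your raw message spans several lines you would need a multiline codec on the input instead):

input {
  stdin { }
}

filter {
  # same cleanup + json parsing as above
  mutate {
    gsub => [
      "message", '^\"', '',
      "message", '}\"', '}',
      "message", ';;', '',
      "message", '\n', '',
      "message", '""', '"'
    ]
  }
  json {
    source => "message"
  }
}

output {
  # print every field of the event so you can see what the filters produced
  stdout { codec => rubydebug }
}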

When you write \n, what do you mean? What part of the log does it indicate? Same thing for '\"\"'... is that a way to indicate the double quotation marks?

Take out the json block and the mutate block and look at what the raw message looks like after the grok.

  # mutate {
  #   gsub => [
  #     "message_detail", '\n', '',
  #     "message_detail", '\"\"', '"'
  #   ]
  # }
  # json {
  #       source => "message_detail"
  # }

For example, the raw message looks like this after the grok (sorry, I pasted the wrong one the first time):

"message_detail" => "{\n \"\"message\"\": \"\"Throw: Il prodotto estratto non è presente tra quelli del menu a tendina in fase di emissione della proposta\"\",\n \"\"level\"\": \"\"Error\"\",\n \"\"logType\"\": \"\"Default\"\",\n \"\"timeStamp\"\": \"\"2021-01-01T01:03:35.1546269+01:00\"\",\n \"\"fingerprint\"\": \"\"5f7b1c28-8f81-4cd2-afff-37a6f893ea4a\"\",\n \"\"windowsIdentity\"\": \"\"GANIT\\\\RVDI001\"\",\n \"\"machineName\"\": \"\"CL-W10RBT-003\"\",\n \"\"processName\"\": \"\"CorrettaAssunzioneANIA_Worker_Win10Produzione\"\",\n \"\"processVersion\"\": \"\"1.0.86\"\",\n \"\"jobId\"\": \"\"389178ee-a728-44be-8ef1-511060465a5e\"\",\n \"\"robotName\"\": \"\"001-VDI-Produzione\"\",\n \"\"machineId\"\": 22,\n \"\"fileName\"\": \"\"3.Emissione_Nuova_Proposta\"\",\n \"\"transactionId\"\": \"\"1fad05cb-9588-484b-b5de-a52c5e9fd29c\"\",\n \"\"queueName\"\": \"\"CorrettaAssunzioneANIA_PROD_INPUT\"\"\n}"

So I am just going through and cleaning it up with gsub.... exactly as it looks.

I gave you 2 ways; play with it... Just go through taking parts in and out and you will learn.

I noticed this worked as well without the \s

mutate {
  gsub => [
    # strip the embedded newline characters
    "message_detail", '\n', '',
    # collapse the doubled quotes "" back into single quotes "
    "message_detail", '""', '"'
  ]
}

Good Luck!

Good Morning @stephenb. I've understood your config file without the GROK.
About the one containing the GROK, I didn't understand what the DATA pattern really does and why it contains the ;; between the quotation marks (I've never seen a syntax like this). I tried the GROK pattern in the GROK DEBUGGER but it gives me an error. Plus, why does it contain backslashes, while in the grok debugger they are not needed?

Now that the JSON logs are well parsed (I thank you again for it) I would like to import into my Elasticsearch a CSV in which the JSON logs represent the content of only one column. Here is an example of the raw text that I need to import:

ID;a;b;DATE;c;MACHINE;INFO;KEY;NAME;INFO2;LOG;MachineID;;
52449999;1;1;2021-01-01 00:00:38.707;4;GANIT\RVDI001;CorrettaAssunzioneANIA_Worker_Win10Produzione;389178EE-A728-44BE-8EF1-511060465A5E;001-VDI-Produzione;Throw: Il prodotto estratto non è presente tra quelli del menu a tendina in fase di emissione della proposta;"{
  ""message"": ""Throw: Il prodotto estratto non è presente tra quelli del menu a tendina in fase di emissione della proposta"",
  ""level"": ""Error"",
  ""logType"": ""Default"",
  ""timeStamp"": ""2021-01-01T01:00:38.706009+01:00"",
  ""fingerprint"": ""bf929041-9964-496c-8eee-e1ba7ce4e42a"",
  ""windowsIdentity"": ""GANIT\\RVDI001"",
  ""machineName"": ""CL-W10RBT-003"",
  ""processName"": ""CorrettaAssunzioneANIA_Worker_Win10Produzione"",
  ""processVersion"": ""1.0.86"",
  ""jobId"": ""389178ee-a728-44be-8ef1-511060465a5e"",
  ""robotName"": ""001-VDI-Produzione"",
  ""machineId"": 22,
  ""fileName"": ""3.Emissione_Nuova_Proposta"",
  ""transactionId"": ""81152118-8b17-4a3f-aa96-a3e92de402f5"",
  ""queueName"": ""CorrettaAssunzioneANIA_PROD_INPUT""
}";22;;

The first line provides the names of the columns. Is there a way to import all these fields and even split the content of the log column as we already did?
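
Something like the sketch below is what I had in mind, but I am not sure it is right (it assumes every record reaches Logstash as one single event, so the newlines inside the quoted LOG column would have to be joined first, for example with a multiline codec on the input; the column names are just the ones from the header line):

filter {
  # split the ;-separated line into the named columns from the header
  csv {
    separator => ";"
    columns => ["ID","a","b","DATE","c","MACHINE","INFO","KEY","NAME","INFO2","LOG","MachineID"]
    # skip_header => true   # if the header line itself is fed in and your version supports it
  }

  # then clean up and parse the JSON column the same way as before
  # (depending on how the CSV quoting is handled, the "" gsub may not even be needed)
  mutate {
    gsub => [
      "LOG", '\n', '',
      "LOG", '""', '"'
    ]
  }

  json {
    source => "LOG"
  }
}

Would that be the right direction?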
