Stop grok from automatically adding a backslash to escape \

I have the following log:
2018-02-27 04:54 +00:00: {"name":"sails","hostname":"ip-111-11-11-111","pid":256213,"level":30,"res":{"statusCode":200,"header":"HTTP/1.1 200 OK\r\nX-Powered-By: Sails <sailsjs.org>\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 625\r\nETag: W/\"271-sZ/4mGWWq9dsamzfgSqtaVpiw\"\r\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwdasqnX.Js%2FpjHdQst0OVdsankNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\r\nDate: Wed, 27 Jan 2016 04:54:40 GMT\r\nConnection: close\r\n\r\n"},"msg":"finished","time":"2018-02-27T04:54:40.920Z","v":0}

and I am trying to isolate the JSON so I can use the json filter on it. Here's the grok construction:

`(?<time>.{24}) (?<json>.*)`

which gives:
{ "time": [ "2018-02-27 04:54 +00:00:" ], "json": [ "{"name":"sails","hostname":"ip-172-31-22-118","pid":25619,"level":30,"res":{"statusCode":200,"header":"HTTP/1.1 200 OK\\r\\nX-Powered-By: Sails <sailsjs.org>\\r\\nContent-Type: text/html; charset=utf-8\\r\\nContent-Length: 625\\r\\nETag: W/\\"271-sZ/4mGWWq9mzfgSqtaVpiw\\"\\r\\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwqnX.Js%2FpjHQst0OVnkNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\\r\\nDate: Tue, 27 Feb 2018 04:54:40 GMT\\r\\nConnection: close\\r\\n\\r\\n"},"msg":"finished","time":"2018-02-27T04:54:40.920Z","v":0}" ] }

The problem is that this JSON can't be parsed, because grok appears to be automatically adding a \ to each \ that is already present.

So the json field contains an array of JSON strings? In that case you're okay. You have two levels of JSON escaping, so it's natural that a newline character becomes \\n.
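(For illustration, a minimal Python sketch of those two escaping levels; the header value here is made up, not the actual log data:)

```python
import json

# Inner value: an HTTP header blob containing real CR/LF characters and a
# quote (illustrative data, not the actual log).
header = 'HTTP/1.1 200 OK\r\nETag: W/"271-abc"\r\n\r\n'

# First level: serializing to JSON encodes the CR/LF characters as the
# two-character escapes \r and \n inside the JSON text.
inner = json.dumps({"header": header})
assert '\\r\\n' in inner  # the JSON text literally contains backslash-r-backslash-n

# Second level: displaying that JSON text as a quoted string (which is what
# a debugger or rubydebug output does) escapes each backslash again, so the
# same bytes are shown as \\r\\n.
displayed = json.dumps(inner)

# Parsing removes exactly one level each time; the data itself never changed.
assert json.loads(inner)["header"] == header
```

The doubled backslashes are a display artifact of the second level, not characters added to the data.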

It's not really an array of JSON strings. Yes, that works for \n, but in this case the strings in the fields are already escaped (e.g. \'), so when grok adds another backslash to an escaped character it becomes \\', which makes the JSON unparsable.

The grok filter doesn't add backslashes to the data it captures.

The data you posted above can't be parsed because

"json": [ "{"name":"sails"

isn't valid. Therefore I don't even know what your events currently look like, so I can't help out.

No, the json you are pointing out is just the name given to the field before parsing it with the json filter. So below is the JSON which is not parsable:

{
   "name":"sails",
   "hostname":"ip-XXX-XX-XX-XXX",
   "pid":25619,
   "level":330,
   "res":{
       "statusCode":200,
       "header":"HTTP/1.1 200 OK\\r\\nX-Powered-By: Sails <sailsjs.org>\\r\\nContent-Type: text/html; charset=utf-8\\r\\nContent-Length: 625\\r\\nETag: W/\\"271-sZ/4mGWWq9mzfgSqtaVpiw\\"\\r\\nset-cookie:sails.sid=s%3AxAU5sxGlztcbOJB312C79xWoUuzNtqkwqnX.Js%2FpjHQst0OqeVnk13NN4vMj6IY5Rm9xbCI43120K7KLoJWJiI; Path=/; HttpOnly\\r\\nDate: Tue, 27 Feb 2018 04:54:40 GMT\\r\\nConnection: close\\r\\n\\r\\n"
     },
    "msg":"finished",
    "time":"2018-02-27T04:54:40.920Z",
    "v":0
}

So the actual problem is in the header field: a \ is being added to each escape in the header value "header":"HTTP/1.1 200 OK\\r\\nX-Powered-By: Sails <sailsjs.org>\\r\\nContent-Type: text/html; charset=utf-8\\r\\nContent-Length: 625\\r\\nETag: W/\\"271-sZ/4mGWWq9mzfgSqtaVpiw\\"\\r\\nset-cookie:sails.sid=s%3AxAU5sxGlztcbOJB312C79xWoUuzNtqkwqnX.Js%2FpjHQst0OqeVnk13NN4vMj6IY5Rm9xbCI43120K7KLoJWJiI; Path=/; HttpOnly\\r\\nDate: Tue, 27 Feb 2018 04:54:40 GMT\\r\\nConnection: close\\r\\n\\r\\n"

So below is the JSON which is not parsable:

That's not what I get.

$ cat test.config 
input { stdin { } }
output { stdout { codec => rubydebug } }
filter {
  grok {
    match => ["message", "(?<time>.{24}) (?<json>.*)"]
  }
  json {
    source => "json"
  }
}
$ cat data 
2018-02-27 04:54 +00:00: {"name":"sails","hostname":"ip-111-11-11-111","pid":256213,"level":30,"res":{"statusCode":200,"header":"HTTP/1.1 200 OK\r\nX-Powered-By: Sails <sailsjs.org>\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 625\r\nETag: W/\"271-sZ/4mGWWq9dsamzfgSqtaVpiw\"\r\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwdasqnX.Js%2FpjHdQst0OVdsankNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\r\nDate: Wed, 27 Jan 2016 04:54:40 GMT\r\nConnection: close\r\n\r\n"},"msg":"finished","time":"2018-02-27T04:54:40.920Z","v":0}
$ /opt/logstash/bin/logstash -f test.config < data
Settings: Default pipeline workers: 8
Pipeline main started
{
       "message" => "2018-02-27 04:54 +00:00: {\"name\":\"sails\",\"hostname\":\"ip-111-11-11-111\",\"pid\":256213,\"level\":30,\"res\":{\"statusCode\":200,\"header\":\"HTTP/1.1 200 OK\\r\\nX-Powered-By: Sails <sailsjs.org>\\r\\nContent-Type: text/html; charset=utf-8\\r\\nContent-Length: 625\\r\\nETag: W/\\\"271-sZ/4mGWWq9dsamzfgSqtaVpiw\\\"\\r\\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwdasqnX.Js%2FpjHdQst0OVdsankNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\\r\\nDate: Wed, 27 Jan 2016 04:54:40 GMT\\r\\nConnection: close\\r\\n\\r\\n\"},\"msg\":\"finished\",\"time\":\"2018-02-27T04:54:40.920Z\",\"v\":0}",
      "@version" => "1",
    "@timestamp" => "2018-02-28T07:39:36.769Z",
          "host" => "lnxolofon",
          "time" => "2018-02-27T04:54:40.920Z",
          "json" => "{\"name\":\"sails\",\"hostname\":\"ip-111-11-11-111\",\"pid\":256213,\"level\":30,\"res\":{\"statusCode\":200,\"header\":\"HTTP/1.1 200 OK\\r\\nX-Powered-By: Sails <sailsjs.org>\\r\\nContent-Type: text/html; charset=utf-8\\r\\nContent-Length: 625\\r\\nETag: W/\\\"271-sZ/4mGWWq9dsamzfgSqtaVpiw\\\"\\r\\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwdasqnX.Js%2FpjHdQst0OVdsankNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\\r\\nDate: Wed, 27 Jan 2016 04:54:40 GMT\\r\\nConnection: close\\r\\n\\r\\n\"},\"msg\":\"finished\",\"time\":\"2018-02-27T04:54:40.920Z\",\"v\":0}",
          "name" => "sails",
      "hostname" => "ip-111-11-11-111",
           "pid" => 256213,
         "level" => 30,
           "res" => {
        "statusCode" => 200,
            "header" => "HTTP/1.1 200 OK\r\nX-Powered-By: Sails <sailsjs.org>\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 625\r\nETag: W/\"271-sZ/4mGWWq9dsamzfgSqtaVpiw\"\r\nset-cookie: sails.sid=s%3AxAU5sxGlztcbOJBC79xWoUuzNtqkwdasqnX.Js%2FpjHdQst0OVdsankNN4vMj6IY5Rm9xbCI40K7KLoJWJiI; Path=/; HttpOnly\r\nDate: Wed, 27 Jan 2016 04:54:40 GMT\r\nConnection: close\r\n\r\n"
    },
           "msg" => "finished",
             "v" => 0
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}

The [res][header] field looks fine.

I have run your example through jsonlint.com and it is valid JSON (after you strip off 2018-02-27 04:54 +00:00: ). In this case I would pull it through a json filter after using dissect to remove the first fields. So something like this:

filter {
    dissect {
      mapping => [ "message", "%{} %{} %{}: %{message}"]
    }
    json {
      source => 'message'
    }
}

I have not tested this, and it also depends on which version of Logstash you are using.
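(To illustrate roughly what that dissect mapping does, here is a Python sketch using a shortened, made-up log line; dissect itself is more capable than a plain split:)

```python
import json

# Shortened, made-up version of the original log line.
line = '2018-02-27 04:54 +00:00: {"name":"sails","pid":256213,"v":0}'

# The mapping "%{} %{} %{}: %{message}" discards the first three
# space-delimited tokens (date, time, offset) and keeps the remainder
# as the payload to feed into the json filter.
_, _, _, payload = line.split(' ', 3)

event = json.loads(payload)
assert event["name"] == "sails"
```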

Regards,
Paul.

Sorry guys, mistake on my side. I was using the Grok debugger (https://grokdebug.herokuapp.com) and it was not giving parsable JSON, but when using Logstash it actually is parsable. Thank you
