Add exception for special characters from JSON log

Hey Guys,

I have been having a few frustrating days trying to crack this.

Please, I need a solution: either a fix for the Ruby code below, or another method to add an exception for the escape sequence '\x'.

Logstash filter:

filter {
  if [type] == "apache" {
    json {
      source => "message"
    }
    ruby {
      code => "event['message'] = event['message'].gsub('\\x','blah')"
    }
  }
}

Example of a Logstash error log:

{:timestamp=>"2016-06-29T14:02:16.547000+0100", :message=>"Error parsing json", :source=>"message", :raw=>"{ "@version": "1", "@timestamp": "2016-06-29T02:36:49.000+0100", "message": "GET /meizu/q-sa%C4%BAe/?page=47 HTTP/1.1", "via": "123.123.123.123", "client-ip": "12.12.12.12", "remote-logname": "-", "remote-user": "-", "recv-time": "[29/Jun/2016:02:36:49 +0100]", "serve-time-microsec": "102880", "request": "GET /meizu/q-sa%C4%BAe/?page=47 HTTP/1.1", "status": "200", "size": "191150", "referer": "-", "user-agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", "url": "/meizu/q-sa\xc4\xbae/", "query": "?page=47", "method": "GET", "protocol": "HTTP/1.1", "vhost": "www.site.com" }", :exception=>#<LogStash::Json::ParserError: Unrecognized character escape 'x' (code 120)

Example of an Apache log:

{ "@version": "1", "@timestamp": "2016-06-29T00:58:20.000+0100", "message": "GET /steel/q-sa%C4%BAe/?page=337 HTTP/1.1", "via": "123.123.123.123", "client-ip": "12.12.12.12", "remote-logname": "-", "remote-user": "-", "recv-time": "[29/Jun/2016:00:58:20 +0100]", "serve-time-microsec": "405295", "request": "GET /steel/q-sa%C4%BAe/?page=337 HTTP/1.1", "status": "200", "size": "196565", "referer": "-", "user-agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", "url": "/steel/q-sa\xc4\xbae/", "query": "?page=337", "method": "GET", "protocol": "HTTP/1.1", "vhost": "www.site.com" }

Any help would be greatly appreciated!

The problem is that those `\x` escape sequences are not valid JSON, so the json filter cannot parse the message. There may be work done in the future to handle these cases more gracefully, but there is no out-of-the-box fix for them right now.

You might try using a gsub before running the json filter, as filters are applied in the order in which they appear.
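For instance, the replacement could also be done with a mutate filter instead of ruby. This is only a sketch: mutate's gsub patterns are regular expressions, so a literal backslash has to be escaped twice (once for the config string, once for the regex), and the exact escaping can vary between Logstash versions.

```
filter {
  if [type] == "apache" {
    mutate {
      # gsub takes [field, regex, replacement]; "\\\\x" is intended to be
      # a regex matching a literal backslash followed by "x"
      gsub => [ "message", "\\\\x", "x" ]
    }
    json {
      source => "message"
    }
  }
}
```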

I have tried the following, and then I don't get any output at all. As soon as I remove the ruby block, the logs show up in Kibana and in the log file again.

filter {
  if [type] == "apache" {

    ruby {
      code => "
        if event['message']
          event['message'] = event['message'].gsub('\x','Xx')
          event['message'] = event['message'].gsub('\\x','XXx')
        end
      "
    }

    json {
      source => "message"
    }

  }
}

I just tested your filter like this:

##Test.conf
input {
  file {
    path => [ "/root/input.txt" ]
    start_position => "beginning"
  }
}

filter {
  ruby {
    code => "
      if event['message']
        event['message'] = event['message'].gsub('\x','Xx')
        event['message'] = event['message'].gsub('\x','XXx')
      end
    "
  }

  json {
    source => "message"
  }
}

output {
  stdout { codec => rubydebug }
}

And the output looks OK:

##Output

root@loganalyse3:~# /opt/logstash/bin/logstash -f test.conf
Settings: Default pipeline workers: 4
Pipeline main started
{
                "message" => "GET /steel/q-sa%C4%BAe/?page=337 HTTP/1.1",
               "@version" => "1",
             "@timestamp" => "2016-06-28T23:58:20.000Z",
                   "path" => "/root/input.txt",
                   "host" => "loganalyse3",
                    "via" => "123.123.123.123",
              "client-ip" => "12.12.12.12",
         "remote-logname" => "-",
            "remote-user" => "-",
              "recv-time" => "[29/Jun/2016:00:58:20 +0100]",
    "serve-time-microsec" => "405295",
                "request" => "GET /steel/q-sa%C4%BAe/?page=337 HTTP/1.1",
                 "status" => "200",
                   "size" => "196565",
                "referer" => "-",
             "user-agent" => "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
                    "url" => "/steel/q-saXxc4Xxbae/",
                  "query" => "?page=337",
                 "method" => "GET",
               "protocol" => "HTTP/1.1",
                  "vhost" => "www.site.com"
}

It is working, thank you very much!!
I just had to wait a few minutes for my logs to populate in Kibana.

In the end, this method did not work for me. I reformatted my logs to use the standard combined Apache log format instead.