Logstash double quote escape

I have a JSON message that fails to parse because of an unescaped double quote at

"<?xml

as shown below:

[2017-08-21T23:59:59,133][WARN ][logstash.codecs.jsonlines] JSON parse error, original data now in message field {:error=>#<LogStash::Json::ParserError: Unexpected character ('<' (code 60)): was expecting comma to separate OBJECT entries
 at [Source: { "type":"syslog","host":"host1","role":"ingest_log","message":" [HFB] 2017-08-22 00:00:04,260 INFO  [MuleScheduler1_Worker-10] outgoing.Manager (::): Converting Xml into object for xml: "<?xml version="1.0" ...""}; line: 1, column: 209]>, :data=>"{ \"type\":\"syslog\",\"host\":\"host1\",\"role\":\"ingest_log\",\"message\":\" [HFB] 2017-08-22 00:00:04,260 INFO  [MuleScheduler1_Worker-10] outgoing.Manager (::): Converting Xml into object for xml: \"<?xml version=\"1.0\" ...\"\"}"}

I tried the filter below.

filter {
  if [role] == "ingest_log" {
    mutate {
      gsub => ["message", '"', '\"']
    }
    grok {
      match => ["message", " INFO  %{TIMESTAMP_ISO8601:timestamp_match} %{GREEDYDATA:message}"]
      match => ["message", ' INFO  %{TIMESTAMP_ISO8601:timestamp_match} %{GREEDYDATA:message}']
      overwrite => ["message"]
    }
    date {
      match => ["timestamp_match", "yyyy-MM-dd HH:mm:ss,SSS"]
      target => "@timestamp"
      remove_field => ["timestamp_match"]
    }
  }
}

The mutate doesn't seem to do anything. Am I doing this incorrectly?

The log you're trying to parse appears to be invalid JSON. That's hard to fix. Please see if you can fix the real problem.

The json_lines codec runs before any filters. If you want to fix the broken JSON using a filter, you need to do that before you use a json filter to parse the JSON blob.
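To illustrate the ordering, here is a rough sketch rather than a tested configuration. It assumes the events arrive over a tcp input on a made-up port (the thread does not say which input is in use) and leaves the actual repair step as a placeholder. Note also that when the codec fails to parse a line, the role field is presumably never set, so the `if [role] == "ingest_log"` condition would not match and the mutate would be skipped.

```
input {
  tcp {
    port  => 5514      # hypothetical port, adjust to your setup
    codec => line      # keep each line as raw text in "message", no JSON parsing yet
  }
}

filter {
  # 1. Repair the raw string here (for example with mutate/gsub).
  #    The exact substitution depends on how the sender breaks the JSON.

  # 2. Only then parse the repaired string as JSON.
  json {
    source => "message"
  }
}
```

Events that still fail to parse get the json filter's failure tag (`_jsonparsefailure` by default), so they can be routed somewhere for inspection.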

Damn. Is there a best practice to do this? I am sending the logs via rsyslog.

Here is my template:
$template ls_json,"{ "type":"syslog","host":"%HOSTNAME%","role":"%app-name%","message":" %msg:::json%"}"

If you can't get rsyslog to serialize to JSON properly, I suggest you avoid JSON. Since the message is presumably the only "free form" token (i.e. a token that can contain spaces, quotes, and so on), perhaps you can just send a simple key=value list that's easy to parse on the Logstash side?
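A minimal sketch of what that could look like on the Logstash side, assuming rsyslog emits the fixed fields first and the free-form message last. The line layout, the `ls_kv` template name in the comment, and the field order are assumptions for illustration, not something taken from a working setup.

```
# Assumed line format sent by rsyslog, e.g. from a template along the lines of
#   $template ls_kv,"type=syslog host=%HOSTNAME% role=%app-name% message=%msg%\n"
# (hypothetical template; verify the syntax against your rsyslog version)

filter {
  grok {
    # The fixed tokens contain no spaces, so NOTSPACE is enough for them.
    # Everything after "message=" is captured verbatim, quotes included.
    match => {
      "message" => "^type=%{NOTSPACE:type} host=%{NOTSPACE:host} role=%{NOTSPACE:role} message=%{GREEDYDATA:message}"
    }
    overwrite => ["message", "host"]
  }
}
```

An anchored grok pattern is used here instead of the kv filter because the message itself may contain spaces and `=` characters, which a generic key=value splitter could misread as further pairs.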


Thanks, took your suggestion.
