Logstash flags XML header in log message as _jsonparsefailure?

I've narrowed one of my _jsonparsefailure events down to an XML header, shown below. The XML is "part" of the log that our application sends out.

2017-02-27 14:01:30,215 INFO  :SimpleThreadPoolWorker-4 [jdbc.sqlonly]  batching 1 statements:
1:  insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('<?xml version="1.0"?>
', '1000h6c14b2m', 0)

This is the JSON event that rsyslog sends out; it parses correctly in online JSON parsers. I show it here across multiple lines for readability, but it's really a single-line event.

{
  "type":"trm-system",
  "host":"ip-10-22-2-188",
  "timestamp":"2017-03-08T20:19:01.234577+00:00",
  "@version":"1",
  "customer":"test",
  "role":"app1",
  "sourcefile":"/tmp/elk/trm-system.log",
  "message":"2017-02-27 14:01:30,215 INFO  :SimpleThreadPoolWorker-4 [jdbc.sqlonly]  batching 1 statements:\n1:  insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version=\"1.0\"\n', '1000h6c14b2m', 0)"
}
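A sanity check of my own (not from the original post): the difference between the escaped `\n` shown above and a *literal* newline inside a JSON string matters a lot here, because a raw control character inside a string makes the payload invalid JSON — a common failure mode when logs pass through a shipper.

```python
import json

# The "\n" in the event above is an escape sequence inside the JSON
# string, which is valid and parses fine.
escaped = '{"message": "batching 1 statements:\\n1: second line"}'
print(json.loads(escaped)["message"])

# If a shipper emits a literal newline (a raw control character) inside
# the string instead, the payload is no longer valid JSON.
literal = '{"message": "batching 1 statements:\n1: second line"}'
try:
    json.loads(literal)
except json.JSONDecodeError as exc:
    print("parse failed:", exc)
```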

This is what I see in logstash.stdout:

{
       "message" => "{ \"type\":\"trm-system\", \"host\":\"ip-10-22-2-188\", \"timestamp\":\"2017-03-08T20:16:38.422895+00:00\", \"@version\":\"1\", \"customer\":\"test\", \"role\":\"app1\", \"sourcefile\":\"/tmp/elk/trm-system.log\", \"message\":\"2017-02-27 14:01:30,215 INFO  :SimpleThreadPoolWorker-4 [jdbc.sqlonly]  batching 1 statements:1:  insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version=\"1.0\"', '1000h6c14b2m', 0)\"}\n",
      "@version" => "1",
    "@timestamp" => "2017-03-08T20:16:38.423Z",
          "host" => "0:0:0:0:0:0:0:1",
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}

This is the error I see in logstash.log:

{:timestamp=>"2017-03-08T20:16:39.063000+0000", :message=>"Error parsing json", :source=>"message", :raw=>"{ \"type\":\"trm-system\", \"host\":\"ip-10-22-2-188\", \"timestamp\":\"2017-03-08T20:16:38.422895+00:00\", \"@version\":\"1\", \"customer\":\"test\", \"role\":\"app1\", \"sourcefile\":\"/tmp/elk/trm-system.log\", \"message\":\"2017-02-27 14:01:30,215 INFO  :SimpleThreadPoolWorker-4 [jdbc.sqlonly]  batching 1 statements:1:  insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version=\"1.0\"', '1000h6c14b2m', 0)\"}\n", :exception=>#<LogStash::Json::ParserError: Unexpected character ('1' (code 49)): was expecting comma to separate OBJECT entries at [Source: [B@2eb78005; line: 1, column: 394]>, :level=>:warn, :file=>"logstash/filters/json.rb", :line=>"76", :method=>"filter"}
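A side observation (my own reproduction, not from the thread): in the :raw string above, the quotes around 1.0 appear to be unescaped inside the message value, and an unescaped inner quote terminates the JSON string early — which produces exactly this error shape, an unexpected `1` where the parser wanted a comma. A minimal sketch in Python:

```python
import json

# Escapes intact: the inner \" stay inside the "message" string. Valid.
good = '{"message": "values (\'<?xml version=\\"1.0\\"?>\', 0)"}'
json.loads(good)  # parses fine

# Escapes lost in transit: the bare " after version= closes the string
# early, and the parser then hits the 1 of 1.0 where it expects , or }.
bad = '{"message": "values (\'<?xml version="1.0"?>\', 0)"}'
try:
    json.loads(bad)
except json.JSONDecodeError as exc:
    print(exc)  # e.g. Expecting ',' delimiter: ...
```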

Why is logstash flagging a valid JSON file as a _jsonparsefailure?

Could you paste the configuration you are using?

Regards,

@RodrigoMontoro thanks for the reply!

Logstash v2.4.1

input {
  udp {
    port => 5555
  }
}

filter {
  json {
    source => "message"
    target => "parsedJson"
  }
  if "_jsonparsefailure" not in [tags] {
    mutate {
      add_field => {
               "type" => "%{[parsedJson][type]}"
           "hostname" => "%{[parsedJson][host]}"
          "timestamp" => "%{[parsedJson][timestamp]}"
           "customer" => "%{[parsedJson][customer]}"
               "role" => "%{[parsedJson][role]}"
         "sourcefile" => "%{[parsedJson][sourcefile]}"
        "log_message" => "%{[parsedJson][message]}"
      }
    }
    if [type] == "trm-error" {
      grok {
          # Grok filters go here
      }
    }
    mutate {
      remove_field => ["parsedJson", "message"]
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}
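Not from the thread, but a handy way to isolate the problem given the config above: a minimal Python sender that pushes a single-line test event at the udp input (the 127.0.0.1:5555 target is an assumption taken from the config, and the event fields are a hypothetical mirror of the original post).

```python
import json
import socket

# Hypothetical test event mirroring the fields in the original post.
event = {
    "type": "trm-system",
    "host": "ip-10-22-2-188",
    "customer": "test",
    "role": "app1",
    "sourcefile": "/tmp/elk/trm-system.log",
    "message": "batching 1 statements:\n1: insert into CR_TEXT_STORAGE_DATA",
}

# json.dumps escapes the newline as \n, so the datagram is one line of
# valid JSON -- exactly what the json filter expects.
payload = json.dumps(event).encode("utf-8")
assert b"\n" not in payload

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 5555))
sock.close()
```

If an event sent this way parses cleanly while the rsyslog-forwarded one does not, the escaping is being mangled upstream of Logstash.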

If logs are being sent in this format

{
  "type":"trm-system",
  "host":"ip-10-22-2-188",
  "timestamp":"2017-03-08T20:19:01.234577+00:00",
  "@version":"1",
  "customer":"test",
  "role":"app1",
  "sourcefile":"/tmp/elk/trm-system.log",
  "message":"2017-02-27 14:01:30,215 INFO :SimpleThreadPoolWorker-4 [jdbc.sqlonly] batching 1 statements:\n1: insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version="1.0"\n', '1000h6c14b2m', 0)"
}

you probably need a multiline filter before the json filter.

I can't test anything right now, but a config like the one below should help:

multiline {
  pattern => "}"
  negate => "true"
  what => "next"
}

json {
  source => "message"
  target => "parsedJson"
}

Could you test it?

Regards

Sorry, I just showed the log event as multiple lines for easy viewing. It's really just one single line, like this. I've updated my OP.

{"type":"trm-system", "host":"ip-10-22-2-188", "timestamp":"2017-03-08T20:19:01.234577+00:00", "@version":"1", "customer":"test", "role":"app1", "sourcefile":"/tmp/elk/trm-system.log", "message":"2017-02-27 14:01:30,215 INFO :SimpleThreadPoolWorker-4 [jdbc.sqlonly] batching 1 statements:\n1: insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version=\"1.0\"\n', '1000h6c14b2m', 0)"}
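A quick check of my own outside Logstash (not part of the thread): the single-line event above is indeed valid JSON when the `\"` and `\n` escape sequences arrive intact.

```python
import json

# The single-line event as posted; the raw string keeps the \" and \n
# escape sequences intact inside the JSON text.
event = r'''{"type":"trm-system", "host":"ip-10-22-2-188", "timestamp":"2017-03-08T20:19:01.234577+00:00", "@version":"1", "customer":"test", "role":"app1", "sourcefile":"/tmp/elk/trm-system.log", "message":"2017-02-27 14:01:30,215 INFO :SimpleThreadPoolWorker-4 [jdbc.sqlonly] batching 1 statements:\n1: insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version=\"1.0\"\n', '1000h6c14b2m', 0)"}'''

parsed = json.loads(event)
print(parsed["type"])        # trm-system
print(parsed["sourcefile"])  # /tmp/elk/trm-system.log
```

So the failure has to come from how the datagram looks by the time it reaches the json filter, not from the event as written.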

It's supposed to work; I tested it here quickly.

{
      "hostname" => "ip-10-22-2-188",
    "@timestamp" => 2017-03-09T15:10:35.656Z,
          "role" => "app1",
      "@version" => "1",
          "host" => "LABS",
   "log_message" => "2017-02-27 14:01:30,215 INFO :SimpleThreadPoolWorker-4 [jdbc.sqlonly] batching 1 statements:\n1: insert into CR_TEXT_STORAGE_DATA (TEXT_VAL, TEXT_STORAGE_ID, SEQUENCE_ORD) values ('xml version="1.0"\n', '1000h6c14b2m', 0)",
          "type" => "trm-system",
     "timestamp" => "2017-03-08T20:19:01.234577+00:00",
      "customer" => "test",
    "sourcefile" => "/tmp/elk/trm-system.log"
}

What I think is happening: you need to remove the "syslog" part (grok) and use the json filter from there. Could you remove the filter and just output the raw event? Reading the file directly seems fine, but that's different from something sent by a syslog server.

Regards,


??? I don't have a syslog filter, only a json filter. What do you mean by "syslog part (grok)"? Also, did you test in v2.4.1 or v5.2?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.