Parsed JSON object/hash requires a target configuration option

I'm seeing these messages in my logstash.log file:

{:timestamp=>"2016-08-30T11:36:51.811000-0700", :message=>"Parsed JSON object/hash requires a target configuration option", :source=>"message", :raw=>"", :level=>:warn}
{:timestamp=>"2016-08-30T11:36:57.453000-0700", :message=>"Parsed JSON object/hash requires a target configuration option", :source=>"message", :raw=>"", :level=>:warn}

According to https://www.elastic.co/guide/en/logstash/current/plugins-filters-json.html the target setting is not required.

So, why am I getting messages saying that the target setting IS required?

I believe these messages are triggered by log lines like:

{"key":"ldap:\/\/1.1.1.1", "foo": "bar"}

jsonlint says:

    jsonlint-py -vf jsontest.json 
    jsontest.json:1:20: Warning: String escape code is not allowed in strict JSON: u'\\/'
    |  At line 1, column 20, offset 20
    |  String started at line 1, column 14, offset 14
    jsontest.json:1:22: Warning: String escape code is not allowed in strict JSON: u'\\/'
    |  At line 1, column 22, offset 22
    |  String started at line 1, column 14, offset 14

So, is Logstash erroring on the syntax but logging the wrong message?

Logstash is version 2.3.4 on Ubuntu 14.04.

Here's the gist of the relevant code in the json filter (paraphrased from logstash-filter-json, not the verbatim source):
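
    # Paraphrase of the json filter's behavior, not the exact plugin code:
    parsed = LogStash::Json.load(source)   # an empty string parses to nil, not a Hash

    if @target
      # With a target, any parsed value can be stored under that field.
      event[@target] = parsed
    else
      # Without a target, the result must be a hash that can be merged into
      # the event root. nil is not a Hash, hence this warning for "".
      unless parsed.is_a?(Hash)
        @logger.warn("Parsed JSON object/hash requires a target configuration option",
                     :source => @source, :raw => source)
      end
    end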

Judging by the :raw=>"" in the log message, I'd say you're trying to parse an empty string.

As in something like:

{"blah":""}

?

No, just an empty string, not JSON with an empty value. I think your message field itself was empty.

As in Filebeat is sending an empty line?

What if just the source field was empty? As in, I grokked a "jsonfieldfoo" field out of a line that is not just JSON, but that field doesn't always exist. Would grok create an empty "jsonfieldfoo" field in that case?

For example, for gitlab production logs, I do this:

filebeat prospector:
    - paths:
        - /var/log/gitlab/gitlab-rails/production.log
      encoding: plain
      document_type: gitlab_production
      multiline:
        pattern: ^Started
        negate: true
        match: after

logstash filter:

grok {
  match => ["message", "Started\s%{WORD:verb}\s\"%{NOTSPACE:request}\"\sfor\s%{IPORHOST:clientip}\s(at)\s.*(?<timestamp>%{TIMESTAMP_ISO8601}\s[-+]{1}\d{4})((?<main_message>.*)Parameters:\s(?<project_parameters>\{.*\})(?<main_message_two>.*\n)*(Completed?\s?%{NUMBER:response}(?<response_message>.*))?)?"]
  add_field => { "combinestuff" => "blah" }
}
mutate {
  gsub => ["project_parameters", "=>", ":"]
}
json {
  source => "project_parameters"
  target => "project_parameters"
}
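
For reference, a production.log entry has roughly this shape (an invented example; "Started" marks the first line of each multiline event, and "Parameters:" carries the Ruby-style hash that the gsub turns into JSON):

    Started GET "/group/project/issues" for 10.1.2.3 at 2016-08-30 11:36:51 -0700
    Processing by Projects::IssuesController#index as HTML
      Parameters: {"namespace_id"=>"group", "project_id"=>"project"}
    Completed 200 OK in 120ms (Views: 80.0ms | ActiveRecord: 20.0ms)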

project_parameters does not always exist. So would the json filter throw that error if that happened?

I did try wrapping the json filter in if [project_parameters] { }, but that did not stop the error messages from showing up.

Also, just to double check, when Filebeat sends a line to Logstash using the plain encoding, the actual log line is in the message field, correct?

As in Filebeat is sending an empty line?

I don't know how your message field is populated so I can't answer, but I suppose that's a possibility.

What if just the source field was empty?

That's exactly what I'm talking about. According to the log message (:source=>"message"), the message field is your source field. This is, however, not consistent with the configuration you posted.

project_parameters does not always exist. So would the json filter throw that error if that happened?

No, only if the field is empty.
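
If a field can be present but empty, a conditional can skip the json filter entirely. A minimal sketch (using message as the source, like in your ldapwrangler filter):

    # Skip the json filter when the source field is missing or empty.
    if [message] and [message] != "" {
      json {
        source => "message"
      }
    }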

Also, just to double check, when Filebeat sends a line to Logstash using the plain encoding, the actual log line is in the message field, correct?

Yes.

The only other json filters that use message as the source should never see an empty line. Each line is a { "foo": "bar", ... , "bar": "foo" } object. So the filter looks like:

json { source => "message" }

Well, I was trying to avoid posting a bunch of code like this, but here is the full config for the only json logs I have:

if [source] == "/var/log/apache2/access_json.log" {
  json {
    source => "message"
  }
  # X-Forwarded-For may contain a comma-separated list; drop loopback
  # entries and separators, then trim whitespace.
  mutate {
    gsub => [
      "forwardedip", "127.0.0.1", "",
      "forwardedip", ",", ""
    ]
    strip => ["forwardedip"]
  }
  mutate { convert => { "bytes" => "integer" } }
  # Prefer the forwarded address as the real client IP when the request
  # came through a proxy and the forwarded address is public.
  if [forwardedip] !~ /(^127\.0\.0\.1)|(^10\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.168\.)/ {
    if (([clientip] == "-") or ([clientip] == [proxyip])) and ([forwardedip] != "-") {
      mutate {
        replace => { "clientip" => "%{[forwardedip]}" }
      }
    }
  }
  if [proxyip] == [clientip] { mutate { replace => { "proxyip" => "-" } } }
  # Only geo-locate public client addresses.
  if [clientip] !~ /(^127\.0\.0\.1)|(^10\.)|(^172\.1[6-9]\.)|(^172\.2[0-9]\.)|(^172\.3[0-1]\.)|(^192\.168\.)/ {
    geoip {
      source => "clientip"
    }
  }
}


My apache log format:

LogFormat "{\"timestamp\": \"%t\",\"forwardedip\": \"%{X-Forwarded-For}i\",\"clientip\": \"%a\",\"proxyip\": \"%{c}a\",\"serverip\": \"%A\",\"servername\": \"%v\",\"request_log_id\": \"%L\",\"port\": \"%p\",\"ident\": \"%l\",\"auth\": \"%u\",\"verb\": \"%m\",\"request\": \"%U\",\"query\": \"%q\",\"httpversion\": \"%H\",\"response\": \"%>s\",\"bytes\": \"%b\",\"referrer\": \"%{Referer}i\",\"agent\": \"%{User-Agent}i\"}" access_json
if [type] == "ldapwrangler" {
          json {
            source => "message"
          }
          mutate {
            convert => { "timestamp" => "integer" }
          }
          date {
            match => [ "timestamp", "UNIX" ]
            target => "@timestamp"
          }
        }

The ldapwrangler logs are a PHP array encoded to JSON and written to a text file, one object per line.
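
Each line looks roughly like this (keys and values invented for illustration, except the message subfield that comes up again further down):

    {"timestamp": 1472664651, "message": "synced user jdoe", "result": "ok"}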


if [type] == "gitlab_production" {
         grok {
           match => ["message", "Started\s%{WORD:verb}\s\"%{NOTSPACE:request}\"\sfor\s%{IPORHOST:clientip}\s(at)\s.*(?<timestamp>%{TIMESTAMP_ISO8601}\s[-+]{1}\d{4})((?<main_message>.*)Parameters:\s(?<project_parameters>\{.*\})(?<main_message_two>.*\n)*(Completed?\s?%{NUMBER:response}(?<response_message>.*))?)?"]
           add_field => {"combinestuff" => "blah" }
         }
         mutate {
           gsub => ["project_parameters", "=>",":"]
         }
         if [project_parameters]{
         json {
           source => "project_parameters"
           target => "project_parameters"
         }}
         if [main_message_two]{
           mutate {
             update => {"combinestuff" => "%{main_message} (project_parameters) %{main_message_two}"}
             remove_field => ["main_message_two", "main_message"]
           }
         }
         if [response_message] and [response] {
           mutate {
             update => {"combinestuff" => "%{combinestuff} Response %{response} %{response_message}"}
             remove_field => ["response_message"]
           }
         }
         if [combinestuff] != "blah"{
           mutate {
             update => {"message" => "%{combinestuff}"}
             remove_field => ["combinestuff"]
           }
         }
         if [combinestuff] == "blah"{
           mutate {
             remove_field => ["combinestuff"]
           }
         }
         date {
           match => [ "timestamp", "YYYY-MM-dd HH:mm:ss Z" ]
           target => "@timestamp"
         }
      }

See my previous post about the gitlab production logs.

I see nothing obviously wrong with your filters, but it all comes down to the inputs. Perhaps adding a stdout { codec => rubydebug } output for events with the _jsonparsefailure tag would be helpful?

Well, this is fun. I added:

      if  "_jsonparsefailure" in [tags] {
        file {
          path => "/var/log/logstash/jsonparsefailure.debug.log"
          codec => "rubydebug"
        }
      }

to my output. That let me know that the ldapwrangler logs are the issue.

The fun part is that I'm tailing both ldapwrangler files while also tailing my debug output. The debug file keeps getting entries added to it, but the actual log files are not adding new logs at the same time.

So either Filebeat is reading old entries very slowly, or it has just gone crazy...

Ah well, almost time to go home. I'll work on it more tomorrow.

Oh, a subfield in the ldapwrangler logs is called "message". That shouldn't cause any issues, right? It'd just be treated as message.message by Logstash?

I modified the ldapwrangler logs to use "msg" instead of "message". That has significantly reduced the number of JSON errors I see.

Is there a way to make the debug codec echo the actual log line? All I'm getting is:

{
       "message" => "",
      "@version" => "1",
    "@timestamp" => "2016-08-31T20:50:11.104Z",
    "input_type" => "log",
         "count" => 1,
          "beat" => {
        "name" => "node01"
    },
        "fields" => nil,
          "tags" => [
        [0] "filebeat",
        [1] "beats_input_codec_plain_applied",
        [2] "_jsonparsefailure"
    ],
        "source" => "/var/www/ldapwrangler/application/logs/cronUserSync.log",
        "offset" => 35208123,
          "type" => "ldapwrangler",
          "host" => "node01"
}
{
       "message" => "",
      "@version" => "1",
    "@timestamp" => "2016-08-31T20:50:11.108Z",
          "type" => "ldapwrangler",
    "input_type" => "log",
         "count" => 1,
        "fields" => nil,
          "tags" => [
        [0] "filebeat",
        [1] "beats_input_codec_plain_applied",
        [2] "_jsonparsefailure"
    ],
        "offset" => 35385677,
          "beat" => {
        "name" => "node01"
    },
        "source" => "/var/www/ldapwrangler/application/logs/cronUserSync.log",
          "host" => "node01"
}

Logstash doesn't maintain the original input separately. If you want to keep it, you just have to avoid overwriting it. Or introduce a clone filter as the very first filter to make a copy of each event and log that somewhere (and make sure you have conditionals in place so that the clones aren't processed by the other filters and outputs).
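
A minimal sketch of that idea (the clone name and the file path are just examples; the clone filter sets the type of each copy to the name given in clones):

    filter {
      # Copy every event before any other filter touches it.
      clone {
        clones => [ "original" ]
      }

      if [type] != "original" {
        # ... all the usual filters go here, so the copies skip them ...
      }
    }

    output {
      if [type] == "original" {
        # Write the untouched copy somewhere for inspection.
        file { path => "/var/log/logstash/original-events.log" }
      } else {
        # ... normal outputs ...
      }
    }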

Hi. I appreciate that this is an old thread, but I'm having the same issue. I've followed the same steps above to log to a file, but all I'm getting is the following in the log file:

{
    "@timestamp" => 2017-03-02T10:44:02.306Z,
      "@version" => "1",
          "tags" => [
        [0] "_jsonparsefailure"
    ]
}

Is there any further way I can find out what the original message is/was and why it's not parsing? We are reading data in from Kafka as opposed to Filebeat.

Thanks