Basic nested JSON parse failure

Hey, I am new to Logstash and surely I must be doing something wrong. When I send basic nested JSON from Python 3 it fails:

Message=json.dumps({'default': 'default', 'time': 'again', 'this': {'go': 'by'}})
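
For context, the sending side looks roughly like this (a sketch assuming boto3; the original only shows the Message= fragment, and the queue URL is a placeholder):

import json
import boto3

# Sketch of the sending side (boto3 is an assumption; the queue URL is a placeholder)
sqs = boto3.client('sqs', region_name='us-east-1')
message = json.dumps({'default': 'default', 'time': 'again', 'this': {'go': 'by'}})
sqs.send_message(
        QueueUrl='https://sqs.us-east-1.amazonaws.com/123456789012/QCB_logstash',
        MessageBody=message)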

My Logstash config is:

input {
        tcp {
                port => 5000
        }
        sqs {
                queue => "QCB_logstash"
                region => "us-east-1"
        }
}

output {
        elasticsearch {
                hosts => "elasticsearch:9200"
        }
}

Super simple, but it fails with:

{:timestamp=>"2016-02-04T22:54:03.980000+0000", :message=>"Failed action. ", :status=>400, :action=>["index", {:_id=>nil, :_index=>"logstash-2016.02.04", :_type=>"logs", :_routing=>nil}, #<LogStash::Event:0x7d705a75 @metadata_accessors=#<LogStash::Util::Accessors:0x18e4e6d2 @store={}, @lut={}>, @cancelled=false, @data={"default"=>"default", "this"=>{"go"=>"by"}, "time"=>"again", "@version"=>"1", "@timestamp"=>"2016-02-04T22:54:03.533Z"}, @metadata={}, @accessors=#<LogStash::Util::Accessors:0x34c898d2 @store={"default"=>"default", "this"=>{"go"=>"by"}, "time"=>"again", "@version"=>"1", "@timestamp"=>"2016-02-04T22:54:03.533Z"}, @lut={"type"=>[{"default"=>"default", "this"=>{"go"=>"by"}, "time"=>"again", "@version"=>"1", "@timestamp"=>"2016-02-04T22:54:03.533Z"}, "type"]}>>], :response=>{"create"=>{"_index"=>"logstash-2016.02.04", "_type"=>"logs", "_id"=>"AVKufhLIk85JQl8a5Oze", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [this]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"unknown property [go]"}}}}, :level=>:warn}

This seems so simple, but I've tried a bunch of configs and can't get a nested structure to work. Any help would be great, thanks.

Let's establish exactly what Logstash sees. Replace the elasticsearch output with stdout { codec => rubydebug } and show the results that you get to stdout.
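
That is, swap the output section for something like:

output {
        stdout {
                codec => rubydebug
        }
}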

Hey Magnus,

That was exactly where I was heading. Here's the output:

{
          "time" => "again",
       "default" => "default",
          "this" => {
        "go" => "by"
    },
      "@version" => "1",
    "@timestamp" => "2016-02-04T23:34:21.157Z"
}

Okay, that looks good. What does the index's mapping look like? I suspect that this has been mapped as some data type and now you're suddenly sending something else, hence the mapper_parsing_exception.

If you are referring to the index in ES, it doesn't appear to be indexed. If I change

"this" => { "go" => "by"}

to

json.dumps( "this" => { "go" => "by"})

It will make it to ES and Kibana looks like:

I don't understand what you mean; you were using json.dumps() all along, weren't you? I was referring to the mapping of the index, and the error message indicates that the current mapping isn't compatible with the document you were trying to index.

Hm, I guess I don't understand which index you are referring to or how to access it. In my example I was using a second, nested json.dumps for the nested structure. Sorry for the confusion, i.e.:
Message=json.dumps({'default': 'default', 'time': 'again', 'this': json.dumps({'go': 'by'})})

Your original Python code was fine. Your second attempt with a second json.dumps() doesn't make sense and results in, as you've seen, {"go": "by"} being indexed as a string, which I can't imagine is what you want.

The index I'm talking about is the Elasticsearch index, i.e. the datastore. It's similar to a table in a relational database. An ES index has one or more fields, and fields have mappings, which are similar to the data types of the columns (fields) in a relational database table. Unlike a relational database, the mapping of a field isn't necessarily declared beforehand but is detected by ES based on the input it receives.

My guess is that the this field at some point was mapped as a string since the first document you sent into the index had a string value in that field. Now you're trying to index another document where the this field isn't a string but an object (a dictionary in Python-land) containing a string field. Such an object can't be stored into a string field, hence the mapper_parsing_exception error.
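
To make that concrete (a hypothetical example; your actual first document may have looked different), the sequence that produces this error is roughly:

# Hypothetical sequence: the first document indexed decides the field's mapping
first  = {'default': 'default', 'time': 'again', 'this': '{"go": "by"}'}   # 'this' is a string, so ES maps the field as a string
second = {'default': 'default', 'time': 'again', 'this': {'go': 'by'}}     # 'this' is now an object -> mapper_parsing_exception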

Please consult the ES documentation to get the full story on how this works. You can use the get mapping API to obtain the current mappings of an index and thereby check how the this field is actually mapped.
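
For example, with the elasticsearch-py client it could look something like this (the host is the one from your Logstash config and the index name comes from the error message; adjust as needed):

from elasticsearch import Elasticsearch

# Print the current mappings for the daily index named in the error message
es = Elasticsearch(['http://elasticsearch:9200'])
print(es.indices.get_mapping(index='logstash-2016.02.04'))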

Mappings of fields can't be changed after the fact. The data needs to be reindexed.

Awesome, deleting the indexes worked. I thought you were referring to that index. I guess my confusion arose because when it fails while I'm not sending the output to ES, but rather just to stdout with the codec set to rubydebug, I assumed the failure was not connected to the ES indexes.

Either way, thanks for the help!

I have seen this before, but I'm still confused about the statement:

"Mappings of fields can't be changed after the fact. The data needs to be reindexed."

I have tried changing the mapping, but it doesn't work, as stated above. How would I go about reindexing existing data without deleting the index?

In short, errors are human, and that happens. If the type is somehow wrong once, is the whole index screwed?
I don't mind reindexing if that means I can reindex the existing data, but I do mind losing the data!

Can someone explain how this works in real life?
Thanks

There is no "reindexing" the way you're thinking. There is exporting the data and importing it into a new index with a template that defines the new mappings.
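
A rough sketch of that export/import using the elasticsearch-py scan and bulk helpers (the host, index names, and mapping below are placeholders, and the new mappings are set directly at index creation rather than via a template):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import scan, bulk

es = Elasticsearch(['http://elasticsearch:9200'])

# Create the new index with the mappings you actually want (placeholder names and mapping)
es.indices.create(index='logstash-2016.02.04-v2', body={
    'mappings': {
        'logs': {
            'properties': {
                'this': {'type': 'object'}
            }
        }
    }
})

# Export every document from the old index and import it into the new one
def actions():
    for hit in scan(es, index='logstash-2016.02.04'):
        yield {
            '_index': 'logstash-2016.02.04-v2',
            '_type': hit['_type'],
            '_id': hit['_id'],
            '_source': hit['_source'],
        }

bulk(es, actions())

Once the copy is verified you can delete the old index and point your searches at the new name.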

If I have a nested JSON object, do I need to define all layers as "object"?