Logstash Could not index event to Elasticsearch, mapper_parsing_exception, illegal_argument_exception


(Simon Becker) #1

Hi there,

I'm trying to get a JSON-formatted data stream from Kafka processed by Logstash and forwarded to Elasticsearch.
My JSON stream has multiple arrays filled with objects. All of the arrays except one are working perfectly fine, but one of them always fails with the following error:

[2018-09-25T11:09:41,384][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"index_test", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x41f204ec>], :response=>{"index"=>{"_index"=>"index_test", "_type"=>"doc", "_id"=>"iWYXEVx", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [HeaderVariables.Value.VALUE]", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"For input string: \"input_string\""}}}}}

The JSON string is generated by a Windows service I run on a server (a purchased tool), so the JSON should be in a valid format.

[screenshot: JSON structure of the event]

The other objects are all built with the same structure, so I really don't understand why Logstash parses some of them and not others.

My Logstash pipeline looks like this:

input {
  kafka {
    topics => ["topic_test"]
    bootstrap_servers => "localhost:9092"
  }
}

filter {
  json {
    source => "message"
  }
}

output {
  stdout { codec => rubydebug }

  elasticsearch {
    hosts => "localhost:9200"
    index => "index_test"
  }
}

I also tried experimenting with the split filter, but with no success. If I remove the HeaderVariables, the event gets forwarded to Elasticsearch and is viewable in Kibana, but I really need the HeaderVariables.

If someone has an idea how to fix this, please let me know; that would help me out very much.


(Ry Biesemeyer) #2

This error is coming back from Elasticsearch when Logstash attempts to insert the document, and is not generated within Logstash.

I see that you have debug output running to stdout; do you have an example of the event's structure there?


(Simon Becker) #3

Hi yaauie,

thank you for your reply. Unfortunately I can't give you the full event structure (it has over 400 lines), but I can show you the part of the event that is failing. The overall structure of the event is shown in my post above; it was formatted with the N++ JSON Viewer for an easier overview.

The JSON looks like this (before and after it are other fields; I've pasted here just the HeaderVariables objects that fail):

"HeaderVariables": [{
    "Identifier": "1062ea44",
    "Name": "eventHeader.ab",
    "DataType": "System.UInt32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77aasd",
    "Value": {
      "IS_NULL": false,
      "TYPE": "System.UInt32",
      "VALUE": 0
    },
    "Children": [],
    "TestId": "00000000-0000-0000-0000-000000000000"
  },
  {
    "Identifier": "fd9dfef9",
    "Name": "eventHeader.xy",
    "DataType": "System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77aasd",
    "Value": {
      "IS_NULL": false,
      "TYPE": "System.String",
      "VALUE": "1.0"
    },
    "Children": [],
    "TestId": "00000000-0000-0000-0000-000000000000"
  }]

The only difference between the objects is the type of the Value, which varies from object to object (it can be a string, an int, or anything else).
The error always seems to occur when it tries to parse the object's Value:
"failed to parse [HeaderVariables.Value.VALUE]"

Everything else seems to get parsed, but I still can't see the events in Kibana. If I remove the HeaderVariables with the mutate filter's remove_field option, everything gets parsed, except of course the HeaderVariables.


(Ry Biesemeyer) #4

In Elasticsearch, all fields with a given path must be type-consistent with each other -- both within each document and across the entire index. In your case, the first HeaderVariables entry supplies a numeric VALUE, so the field HeaderVariables.Value.VALUE gets mapped as a numeric type; when a later entry supplies a string value that can't be parsed as a number, Elasticsearch rejects the whole document with the mapper_parsing_exception you're seeing.

You may want to transpose this array-of-objects into a single key/value object (which will likely behave more like you're expecting once it gets to Elasticsearch).

I actually wrote something to do just this yesterday, but it doesn't quite work with all the extra metadata in your objects, so I'll tweak it and get back to you.


(Ry Biesemeyer) #5

I'll link to the script at the bottom.

If you want to transpose that array of objects into a single object containing key/value pairs, which lets each value have its own field name, you can use the transpose script attached at the end of this post.

With your pasted input, there are two possible ways to use it.

To produce a complex value that includes the "IS_NULL" and "TYPE" metadata:

{
  "HeaderVariables": {
    "eventHeader.ab": {
      "IS_NULL": false,
      "TYPE": "System.UInt32",
      "VALUE": 0
    },
    "eventHeader.xy": {
      "IS_NULL": false,
      "TYPE": "System.String",
      "VALUE": "1.0"
    }
  }
}

using this filter configuration:

filter {
  ruby {
    path => "/path/to/transpose.logstash-filter-ruby.rb"
    script_params => {
      "source" => "HeaderVariables"
      "field_name_key" => "Name"
      "field_value_key" => "Value"
    }
  }
}

Or, to include just the simple value:

{
  "HeaderVariables": {
    "eventHeader.ab": 0,
    "eventHeader.xy": "1.0"
  }
}

using:

filter {
  ruby {
    path => "/path/to/transpose.logstash-filter-ruby.rb"
    script_params => {
      "source" => "HeaderVariables"
      "field_name_key" => "Name"
      "field_value_key" => "[Value][VALUE]"
    }
  }
}
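For reference, the core idea of such a transpose script can be sketched in plain Ruby. This is an illustrative sketch, not the actual linked script; the function name and the hash shapes are assumed from the examples above:

```ruby
# Transpose an array of objects into a single hash, keyed by each
# object's name field. value_path is a list of keys to dig through
# to reach the desired value (e.g. ["Value", "VALUE"]).
def transpose(array, name_key, value_path)
  array.each_with_object({}) do |obj, result|
    result[obj[name_key]] = value_path.reduce(obj) { |acc, key| acc[key] }
  end
end

header_variables = [
  { "Name" => "eventHeader.ab", "Value" => { "VALUE" => 0 } },
  { "Name" => "eventHeader.xy", "Value" => { "VALUE" => "1.0" } }
]

# One key per variable name; values keep their original types.
p transpose(header_variables, "Name", ["Value", "VALUE"])
```

In an actual Logstash ruby filter script, the same logic would read the array with event.get and write the result back with event.set.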

Caveat: having dots in key names can have surprising results in Elasticsearch; you may need to further process the data to get it into a shape that is ideal for Elasticsearch.
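One way to handle that caveat (an assumption on my part: this uses the de_dot filter plugin, which is not bundled with Logstash by default and must be installed separately) is to expand the dotted names into nested fields:

```
filter {
  de_dot {
    # split field names on "." into nested sub-fields
    # instead of replacing the dots with a delimiter
    nested => true
  }
}
```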



(Simon Becker) #6

Thank you so much, your suggested solution works perfectly!


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.