Error indexing using xml plugin

Hi,

logstash 6.8.3
ES 6.3

I am getting constant errors in my Logstash log about mapping an XML field:
Mar 16 13:09:04 ip-10-152-3-140 logstash[16367]: [2020-03-16T13:09:04,614][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-xxx-2020.12", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x2297c725>], :response=>{"index"=>{"_index"=>"filebeat-xxx-2020.12", "_type"=>"doc", "_id"=>"pjZ243AB5TP9fEZrR9XM", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [xml_content.Request.TranslatedMessage] tried to parse field [null] as object, but found a concrete value"}}}}

I looked at the mapping for this index. When I checked another index's mapping, I can see it is set like this:
"xml_content": {
"properties": {
"AgentID": {
"Duration": {
"EndpointAddress": {
"ErrorDetails": {
"MessageName": {
"Request": {
"RawMessage": {
"Time": {
"TranslatedMessage": {
"type": "keyword",
"ignore_above": 1024
}
}
},

We are parsing API logs that contain the information each user is exchanging, so I was using dynamic templates because the format of the messages can change between calls. I am wondering if the first record into the index is creating the mapping, and then subsequent records that don't match the same format are being rejected?
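Something like this is what I suspect is happening (made-up example documents, not our real data): the first record to hit a fresh index has TranslatedMessage as nested XML, so dynamic mapping creates it as an object, and a later record where it is plain text then fails:

```
# first document indexed -- dynamic mapping defines TranslatedMessage as an object
{"xml_content": {"Request": {"TranslatedMessage": {"Text": "hello"}}}}

# later document -- TranslatedMessage is a plain string, rejected with a 400 mapper_parsing_exception
{"xml_content": {"Request": {"TranslatedMessage": "hello"}}}
```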

I am using the xml plugin because there are some fields we want to capture and work with in Logstash, like:
xml {
  source => "message"
  target => "xml_content"
}
grok {
  match => { "[xml_content][Duration]" => "%{HOUR:duration_hours}:%{MINUTE:duration_mins}:%{MINUTE:duration_secs}%{GREEDYDATA:duration_milli}" }
}

The only reason I really noticed all these errors is because users were complaining that not all the records seemed to be making it through to ES. I am stuck as to how to get ALL the information into ES now. Help?... :slight_smile:
Fiona

Your mapping says that [xml_content][Request][TranslatedMessage] should be an object containing multiple fields. In Elasticsearch a field can be either an object or a value; it cannot be an object in some documents and a value in others. So any event where [xml_content][Request][TranslatedMessage] is a concrete value rather than an object will get rejected. The solution is to detect that it is a value and rename it, so that the value is a field within it instead.
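As a sketch (untested, and assuming the field comes through as a string when it is not an object), a ruby filter can do the wrapping; "value" here is an arbitrary sub-field name:

```
ruby {
  code => '
    v = event.get("[xml_content][Request][TranslatedMessage]")
    # if the XML gave us a plain value rather than a hash, wrap it in an
    # object so it is consistent with the object mapping
    if v.is_a?(String)
      event.set("[xml_content][Request][TranslatedMessage]", { "value" => v })
    end
  '
}
```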

Thanks for that! I don't have any good examples to test with. I am wondering if I can just drop that whole field in Logstash, as I don't think we use its contents anywhere. Support just want to get all messages into Kibana so they can search on the original message field anyway.

Do you think I can drop the field?

I tried adding:
mutate {
  remove_field => [ "xml_content.Response.TranslatedMessage", "Request.TranslatedMessage", "TranslatedMessage" ]
}

But I still see the errors coming out. I don't understand why it is still trying to map the field if I told it to remove it?

My logstash code looks like:

} else if "incoming" in [tags] {
     mutate {
       add_tag => "integration_incoming"
       add_field => { "cust_name" => "%{[fields][customer]}" }
       add_field => { "Index_Type" => "%{[@metadata][beat]}" }
     }
     xml {
       source => "message"
       target => "xml_content"
       suppress_empty => true
     }
     mutate {
       remove_field => [ "xml_content.Response.TranslatedMessage", "Request.TranslatedMessage", "TranslatedMessage" ]
     }
     grok {
       match => { "[xml_content][Duration]" => "%{HOUR:duration_hours}:%{MINUTE:duration_mins}:%{MINUTE:duration_secs}%{GREEDYDATA:duration_milli}" }
     }

I kept the mutate outside of the xml filter, but the field is still going through.

In Elasticsearch a nested field is referenced using periods in the name, such as xml_content.Request.TranslatedMessage, whilst in Logstash that would be referred to as [xml_content][Request][TranslatedMessage].

Hah thanks again!

I changed logstash thus:

      xml {
        source => "message"
        target => "xml_content"
        suppress_empty => true
      }
      mutate {
        remove_field => [ "[xml_content][Response][TranslatedMessage]", "[xml_content][Request][TranslatedMessage]" ]
      }

But after restarting logstash I am still seeing these errors:

Mar 16 18:52:09 ip-10-152-3-140 logstash[17984]: [2020-03-16T18:52:09,406][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-xxx-2020.12", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x46851a47>], :response=>{"index"=>{"_index"=>"filebeat-xxx-2020.12", "_type"=>"doc", "_id"=>"jMKw5HAB5TP9fEZrYU0j", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [xml_content.Request.TranslatedMessage] tried to parse field [null] as object, but found a concrete value"}}}}
Mar 16 18:52:09 ip-10-152-3-140 logstash[17984]: [2020-03-16T18:52:09,407][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-xxx-2020.12", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x6ddc8540>], :response=>{"index"=>{"_index"=>"filebeat-xxx-2020.12", "_type"=>"doc", "_id"=>"nMKw5HAB5TP9fEZrYU0j", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [xml_content.Request.TranslatedMessage] tried to parse field [null] as object, but found a concrete value"}}}}

I thought removing the field would fix the issue, but it doesn't seem to be making a difference.

Oh I think I fixed it! Bloody hell, I amaze myself when I get something working half the time :slight_smile:

Changed the line to remove the whole Request/Response fields, and that seems to have stopped the errors. The data is still part of the message field, so users can still search for it.

      mutate {
        remove_field => [ "[xml_content][Response][TranslatedMessage]", "[xml_content][Request][TranslatedMessage]", "[xml_content][Request]", "[xml_content][Response]" ]
      }
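If anyone wants to double-check that the fields really are gone before events reach Elasticsearch, a temporary stdout output prints each event after the filters have run:

```
output {
  stdout { codec => rubydebug }
}
```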