XML filter causing ES mapping issues

arisbanach · April 2, 2018, 1:55am

At least, I think that is the problem.

I am getting a lot of rejected docs from Elasticsearch, and when I look at the reason in the dead letter queue file, it says "Can't get text on a START_OBJECT". The only thing I'm doing is running the XML filter and then sending the result to Elasticsearch.

I think what is happening is that at one point in my XML documents, some of them have one level of a text element:

<text>Here is some text.</text>

while others have nested levels:

<text>Here is some text.
  <text> Here is some more text.</text>
</text>

I think this means the first kind of document will cause Elasticsearch to map the text field as a text type, and later documents, where text is parsed into an object, will then be rejected because they conflict with that mapping.

Is that correct? If so, how can I handle this issue?

Thank you!

wwalker · April 2, 2018, 2:04am

What's your pipeline config?

arisbanach · April 2, 2018, 2:16am

input{
  s3 {
    bucket => "mybucketname"
    access_key_id => "removed"
    secret_access_key => "removed"
    exclude_pattern => "^((?!XML$).)*$"
    region => "us-east-2"
    sincedb_path => "/etc/logstash/conf.d/.sincedb_files"
    codec => multiline {
      pattern => "<rootElement>"
      negate => true
      what => "previous"
      max_lines => 10000
      max_bytes => "100 MiB"
    }
  }
}

filter {
  xml {
    source => "message"
    target => "message"
    force_array => false
  }
}

output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "removed"
    index => "index_pattern-%{+YYYY.MM.dd}"
  }
}

arisbanach · April 3, 2018, 12:29pm

Anyone have any ideas? I would imagine this would be a common issue if I'm correct about what's happening. No idea how to handle it though.

arisbanach · April 15, 2018, 10:44pm

For anyone else experiencing this: I stopped trying to parse the entire XML file and just went with selecting each section of it with xpaths.
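
A minimal sketch of that approach, assuming the text elements sit directly under rootElement. The XPath expressions and the destination field names text_top and text_nested are hypothetical, so adjust them to the real document layout. With store_xml set to false the filter does not write the whole parsed tree into the event, and the xpath option only stores string values, so the fields should keep the same mapping whether or not text is nested:

```
filter {
  xml {
    source => "message"
    # Do not store the full parsed document on the event.
    store_xml => false
    # Map each XPath expression to a destination field (names are hypothetical).
    xpath => {
      # Text directly inside the outer <text> element.
      "/rootElement/text/text()" => "text_top"
      # Text inside a nested <text> element, if one exists.
      "/rootElement/text/text/text()" => "text_nested"
    }
  }
}
```

If an expression matches nothing, the destination field is simply not set on that event, so documents with only one level of text just lack text_nested.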

