Your new example XML has six lines. A file input like
file { path => "/home/user/foo.txt" sincedb_path => "/dev/null" start_position => beginning }
will consume that as six separate events. For the second one:
<AV9APIDATA xmlns="av9api-platform-com">
the xml filter will complain ":exception=>#<REXML::ParseException: No close tag for /AV9APIDATA". That's because the closing /AV9APIDATA tag is in the sixth event, not the second.
You need to use a multiline codec to consume the entire XML document as a single event. For example, if you need to consume the entire file as one event you could use
file {
  path => "/home/user/foo.txt"
  sincedb_path => "/dev/null"
  start_position => beginning
  codec => multiline {
    pattern => "^Spalanzani"
    negate => true
    what => previous
    auto_flush_interval => 2
  }
}
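The pattern is just a string that never occurs in the file. With negate => true, every line that does not match is appended to the previous one, so the whole file accumulates into one event, and auto_flush_interval flushes it once no new lines have arrived for 2 seconds. If you do that then the xml filter will parse it just fine.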
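For reference, a minimal sketch of an xml filter that would pick up the joined event. It assumes the XML text is in the default "message" field; the target name "parsed" is only a placeholder, so keep whatever your existing filter already uses.

```logstash
filter {
  xml {
    source => "message"   # field that holds the whole document after the multiline codec
    target => "parsed"    # hypothetical field name for the parsed structure
  }
}
```

Events assembled by the multiline codec also get a "multiline" tag by default, which is a quick way to confirm that the lines were actually joined.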
Note that if you have two XML documents in one file, for example
<?xml version="1.0" encoding="utf-16"?>
<AV9APIDATA xmlns="av9api-platform-com"> <ORDER EngineID="2"> </ORDER>
</AV9APIDATA>
<?xml version="1.0" encoding="utf-16"?>
<AV9APIDATA xmlns="av9api-platform-com"> <ORDER EngineID="3"> </ORDER>
</AV9APIDATA>
then you will get a different exception: attempted adding second root element to document.
In that case, use a pattern that ends each event at a document's closing tag, so that each document becomes its own event:
codec => multiline {
  pattern => "^</"
  negate => true
  what => next # Note previous changed to next
  auto_flush_interval => 2
}
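That will work provided that your XML is pretty-printed with indentation, so that the only closing tag at the start of a line is the root's. If you have nested elements that are left-aligned then their closing tags also match "^</" and end the event early, so it will break and you may have to resort to something like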
codec => multiline {
  pattern => "^</AV9APIDATA"
  negate => true
  what => next # Note previous changed to next
  auto_flush_interval => 2
}
which provides very little flexibility, since it is tied to a single root element name.
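If every document in the file starts with an XML declaration on its own line, as in the two-document example above, another option is to start a new event at each declaration instead of ending one at the closing tag. This is only a sketch under that assumption and has not been run against your data:

```logstash
codec => multiline {
  pattern => "^<\?xml"        # assumes each document begins with <?xml ... ?> at the start of a line
  negate => true
  what => previous            # non-matching lines are appended to the document that is already open
  auto_flush_interval => 2    # needed to emit the last document, since no further declaration follows it
}
```

This does not depend on the root element name or on indentation, but it will merge documents if any of them lacks the declaration.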