Loading a file with multiple XML elements but no root element

I have not been able to find any posts, documentation, or blogs on loading data from an XML file that has no root element.
Here is my structure:
The rt-logs.gz archive contains a file of daily real-time messages, and each message has a predefined structure with namespaces. The only catch is that the file is just a sequence of elements, with no root element for the whole file.

< tns1:Message xmlns:tns1="...." timestamp=".." messageSize="..." >
< child 1>....< /child 1>
< child n>..< /child n>
< /tns1:Message>
< tns1:Message xmlns:tns1="...." timestamp=".." messageSize="..." >
< child 1>....< /child 1>
< child n>..< /child n>
< /tns1:Message>

How can I parse these files and load them into Elasticsearch?

Has anyone dealt with an XML file without a root element yet?

By definition, an XML document has exactly one root element. Your file therefore contains multiple documents. I suggest you use a multiline codec to join all the lines of each document into a single event. Perhaps the configuration could look something like this:

codec => multiline {
  pattern => "^<tns1:"
  negate => true
  what => "previous"
}
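
For context, here is a minimal sketch of where that codec would sit, assuming the decompressed file is read with the file input. The path is hypothetical, and auto_flush_interval is an addition of mine so that the last message in the file is emitted instead of waiting for another line that starts a new message:

```
input {
  file {
    # Hypothetical path to the already-decompressed log file
    path => "/var/log/rt/rt-logs.xml"
    start_position => "beginning"
    codec => multiline {
      # Any line that does NOT start a new message is appended to the previous event
      pattern => "^<tns1:"
      negate => true
      what => "previous"
      # Emit the pending event after 2 seconds without new lines
      auto_flush_interval => 2
    }
  }
}
```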

Thanks Magnus. Now the problem is that I can't get the attributes and child nodes populated.

filter {
  xml {
    source => "message"
    store_xml => "false"
    target => "Message"
  }
  mutate {
    add_field => { "msg_timestamp" => "%{[Message][timeStamp]}" }
    add_field => { "priority" => "%{[Message][priority]}" }
    add_field => { "id" => "%{[Message][Identifiers][msgID]}" }
    add_field => { "transid" => "%{[Message][Identifiers][TransID]}" }
  }
}

I am getting the literal reference, e.g. "transid" => "%{[Message][Identifiers][TransID]}", instead of the value.

I even tried adding a split filter, but it fails with an error saying the field needs to be a String or Array...

What does an event processed by the xml filter look like then? Use a stdout { codec => rubydebug } output.

Hi Magnus,

It looks the same as the last line I posted above, like this:

[WARN ][logstash.filters.split ] Only String and Array types are splittable. field:[Message] is of type = NilClass

{
  "path" => "C:/xmlTest/xmlnoroot.xml",
  "@timestamp" => 2017-06-21T14:26:38.063Z,
  "transid" => "%{[Message][Identifiers][TransID]}",
  "@version" => "1",
  "host" => "",
  "id" => "%{[Message][Identifiers][msgID]}",
  "message" => "< Message timeStamp="2016-05-02T03:11:39Z" priority="High" xmlns="........."> < Identifiers >\n < msgID>......< /msgID>\n < TransID>..........< /TransID>\n < /Identifiers>\n< /Message>",
  "type" => "xml_log",
  "priority" => "%{[Message][priority]}",
  "msg_timestamp" => "%{[Message][timeStamp]}",
  "tags" => [
    [0] "multiline",
    [1] "_split_type_failure"
  ]
}

If I remove the split filter, the split type failure goes away, but the output is otherwise the same.

If you look at your event you'll notice that there is no Message field to split.

The reason your xml filter does nothing is that you haven't set the xpath option and you've disabled store_xml.
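
A minimal sketch of the store_xml route, assuming you want the whole parsed document stored under one field. With store_xml enabled (the default), target names the field that receives the parsed structure. The force_array setting is my addition, so that single-valued elements are not wrapped in arrays:

```
filter {
  xml {
    source => "message"
    # Keep the parsed document and store it under [Message]
    store_xml => true
    target => "Message"
    # Optional: avoid one-element arrays for single values
    force_array => false
  }
}
```

With this in place I'd expect references such as %{[Message][priority]} and %{[Message][Identifiers][msgID]} to resolve, but check the rubydebug output to confirm the actual field layout.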

Hi Magnus,

I tried the xpath option, but no luck yet.

xpath =>
["/MessageTraceEvent/priority/text()", "testTs"]

Where's the MessageTraceEvent element in your document? The abbreviated XML sample you posted earlier doesn't contain one.

Typo, it's Message.

"priority" isn't a subelement but an attribute, so the correct XPath expression is probably /Message/@priority. Attribute nodes have no text() children, so appending /text() would select nothing.

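A sketch of what that could look like in the filter, under a few assumptions: the destination field names are made up for illustration, and remove_namespaces is enabled because the sample message declares a default namespace, which would otherwise keep un-prefixed paths such as /Message from matching. Attributes are selected with @name directly, while child elements use text() to get their content:

```
filter {
  xml {
    source => "message"
    store_xml => false
    # Strip namespaces so plain element names can be used in the paths
    remove_namespaces => true
    xpath => {
      # Attributes of the Message element
      "/Message/@priority" => "priority"
      "/Message/@timeStamp" => "msg_timestamp"
      # Text content of child elements
      "/Message/Identifiers/msgID/text()" => "id"
      "/Message/Identifiers/TransID/text()" => "transid"
    }
  }
}
```

As far as I know, values extracted through xpath are stored as arrays of matches, so a single value may show up as a one-element array in the rubydebug output.
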
Ah, my bad. Thanks for pointing that out. I will try it and let you know.

Regards
AB
