Hello again, I am trying to parse an XML string, received through the UDP input plugin, with the XML filter plugin. The structure of the XML is:
<ROOT>
<LEVEL1a>string
<LEVEL2a>
<LEVEL3a>string</LEVEL3a>
<LEVEL3b>string</LEVEL3b>
<LEVEL3c>string</LEVEL3c>
<LEVEL3d>string</LEVEL3d>
</LEVEL2a>
<LEVEL1b>string</LEVEL1f>
<LEVEL1c>string</LEVEL1e>
<LEVEL1d>string</LEVEL1d>
<LEVEL1e>string</LEVEL1c>
<LEVEL1f>string</LEVEL1b>
</ROOT>
It works perfectly, converting everything into a JSON structure, and in the case of LEVEL1a it creates a hash:
{
LEVEL1a: {
content: "string",
LEVEL2a: [
LEVEL3a: "string",
LEVEL3b: "string",
LEVEL3c: "string",
LEVEL3d: "string"
]
},
LEVEL1b: "string",
LEVEL1c: "string",
LEVEL1d: "string",
LEVEL1e: "string",
LEVEL1f: "string"
}
I need LEVEL1a to be kept as a string, like this:
{
LEVEL1a: "string
<LEVEL2a>
<LEVEL3a>string</LEVEL3a>
<LEVEL3b>string</LEVEL3b>
<LEVEL3c>string</LEVEL3c>
<LEVEL3d>string</LEVEL3d>
</LEVEL2a>",
LEVEL1b: "string",
LEVEL1c: "string",
LEVEL1d: "string",
LEVEL1e: "string",
LEVEL1f: "string"
}
I haven't found a way to leave every level below LEVEL1 as the original XML string. Is it even possible?
Topic corrected. To be precise, I am using the XML filter plugin to convert the XML to JSON.
XML structure corrected.
Just to be precise, I need every LEVEL1 element to keep its original content as a string, not just LEVEL1a.
Your XML is not valid. Nothing closes the LEVEL1a element, and the order of the opening and closing tags for the other LEVEL1 elements is wrong.
That said, you cannot tell the xml filter not to parse some of the XML. You could do something ugly to mangle the XML:
mutate { gsub => [ "message", "<(/?)LEVEL([2-9])", "xxx\1\2" ] }
xml { source => "message" target => "theXML" force_array => false }
mutate { gsub => [ "[theXML][LEVEL1a]", "xxx(/?)([2-9])", "<\1LEVEL\2" ] }
or you might be able to use a ruby filter to re-encode LEVEL1a into XML after parsing it.
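To see what the two gsub passes above do without running Logstash, here is a minimal Python sketch of the same mangle, parse, restore round trip. It is only an illustration under stated assumptions: Python's re.sub and xml.etree.ElementTree stand in for mutate/gsub and the xml filter, the sample message is a shortened, well-formed version of the XML in the question, and the helper name is made up.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a shortened, well-formed version of the sample XML from the question.
MESSAGE = (
    "<ROOT><LEVEL1a>string<LEVEL2a><LEVEL3a>string</LEVEL3a>"
    "<LEVEL3b>string</LEVEL3b></LEVEL2a></LEVEL1a>"
    "<LEVEL1b>string</LEVEL1b></ROOT>"
)


def parse_keeping_nested_xml(message):
    """Hypothetical helper: hide nested tags, parse, then restore them."""
    # Same pattern as the first gsub:
    # "<LEVEL2a>" becomes "xxx2a>" and "</LEVEL2a>" becomes "xxx/2a>".
    mangled = re.sub(r"<(/?)LEVEL([2-9])", r"xxx\1\2", message)
    root = ET.fromstring(mangled)
    fields = {}
    for child in root:
        # Same pattern as the second gsub: turn the placeholders back into tags.
        fields[child.tag] = re.sub(
            r"xxx(/?)([2-9])", r"<\1LEVEL\2", child.text or ""
        )
    return fields


fields = parse_keeping_nested_xml(MESSAGE)
print(fields["LEVEL1a"])
```

One caveat with this trick: any literal "xxx" followed by a digit from 2 to 9 in the real data would also be turned into a tag on the way back, so the placeholder should be something that cannot occur in the payload.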
@Badger, you are right, I built the XML by hand :( but I hope you get the idea.
The problem is that the fields are not named LEVEL...
What about wrapping the field in a CDATA section (<![CDATA[ ... ]]>) before parsing? Do you know whether it would still be converted?
In principle it will work, provided you can write a regexp that matches the right element names.
mutate { gsub => [ "message", "(<LEVEL1.>)", "\1<![CDATA[", "message", "(</LEVEL1.>)", "]]>\1" ] }
xml { source => "message" target => "theXML" force_array => false }
changes the message to
"message" => "<ROOT><LEVEL1a><![CDATA[string<LEVEL2a><LEVEL3a>string</LEVEL3a><LEVEL3b>string</LEVEL3b><LEVEL3c>string</LEVEL3c><LEVEL3d>string</LEVEL3d></LEVEL2a>]]></LEVEL1a><LEVEL1b><![CDATA[string]]></LEVEL1b><LEVEL1c><![CDATA[string]]></LEVEL1c><LEVEL1d><![CDATA[string]]></LEVEL1d><LEVEL1e><![CDATA[string]]></LEVEL1e><LEVEL1f><![CDATA[string]]></LEVEL1f></ROOT>",
and it gets parsed as
"theXML" => {
"LEVEL1f" => "string",
"LEVEL1c" => "string",
"LEVEL1a" => "string<LEVEL2a><LEVEL3a>string</LEVEL3a><LEVEL3b>string</LEVEL3b><LEVEL3c>string</LEVEL3c><LEVEL3d>string</LEVEL3d></LEVEL2a>",
"LEVEL1e" => "string",
"LEVEL1b" => "string",
"LEVEL1d" => "string"
},
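The same CDATA wrapping can be tried outside Logstash. The following is a minimal Python sketch, not the xml filter itself: re.sub and xml.etree.ElementTree are stand-ins for mutate/gsub and the xml filter, the message is a shortened, well-formed version of the sample, and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a shortened, well-formed version of the sample XML from the question.
MESSAGE = (
    "<ROOT><LEVEL1a>string<LEVEL2a><LEVEL3a>string</LEVEL3a>"
    "</LEVEL2a></LEVEL1a><LEVEL1b>string</LEVEL1b></ROOT>"
)


def wrap_level1_in_cdata(message):
    """Hypothetical helper mirroring the two gsub substitutions above."""
    # Open a CDATA section right after every opening LEVEL1 tag ...
    opened = re.sub(r"(<LEVEL1.>)", r"\1<![CDATA[", message)
    # ... and close it right before every closing LEVEL1 tag.
    return re.sub(r"(</LEVEL1.>)", r"]]>\1", opened)


wrapped = wrap_level1_in_cdata(MESSAGE)
# The parser treats CDATA content as plain text, so nested tags survive as a string.
parsed = {child.tag: child.text for child in ET.fromstring(wrapped)}
print(parsed["LEVEL1a"])
```

Note that this relies on the wrapped content never containing the sequence "]]>", which would end the CDATA section early and break the parse.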
Yes, I found the same solution. This is what I did:
mutate { gsub => [ message, "<LEVEL1a>", "<LEVEL1a><![CDATA[" ] }
mutate { gsub => [ message, "</LEVEL1a>", "]]></LEVEL1a>" ] }