Convert json to xml


#1

hi all,

I wonder if there is a way to convert json to xml. In my json data, one of the fields has type string. It actually contains xml content, but represented as a string in double quotes. Is there a best practice for extracting that field and processing it as xml?

Cheers


(Magnus Bäck) #2

It's not clear to me exactly what you want to do, but the xml filter can parse XML into Logstash fields.


#3

ok, so, I want to convert the json message into xml so that the parsing can work. Currently, in the json schema I have 3 fields: ID - int, XML - string, Modified - date. The issue is that the XML field has type string, but as far as I know the xml parser expects an XML type and not a string. This is why I want to convert the string to an xml type and only then apply the parsing.


(Magnus Bäck) #4

There is no such thing as an XML type. The xml filter parses strings that contain XML, i.e. exactly what you have.


#5

I'm trying to apply xpath to the message, but somehow it doesn't extract the desired fields. It looks like it ignores what is written in the xml part of the config file.


(Magnus Bäck) #6

If you want help you need to provide your configuration and an example event.


#7

ok, so here is the complete json message with 3 fields. I want to extract the XML field. You can see the content of the XML field as well. I created an example.

{"Schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":true,"field":"xml"},{"type":"int64","field":"modified"}],"optional":false},"payload":{"id":56,"xml":"Xmlns:test\"http://example.com\"<test:data1 type="55"><test:text test:content="222"/></test:data1><test:year test:content="1999"/></test:year>","modified":1507908845773}}

the filter in my config that I'm using is the following:

filter {
  xml {
    store_xml => "false"
    source => "message"
    remove_namespaces => "false"
    xpath => [
      "/data1/type/text()", "type",
      "/year/content/text()", "year",
    ]
  }
}

So, after applying the config I still see the same json message content. Also, is there any way in Logstash to remove the backslashes which are there for whitespaces in my xml?

Cheers


(Magnus Bäck) #8

You're attempting to parse the message field as XML but it appears the XML is actually in the [payload][xml] field.


#9

Is there any possibility to apply a filter on payload?


(Magnus Bäck) #10

It depends on what you want to do but the answer is probably yes.


#11

I actually want to extract some values from [payload][XML], for instance the type and year as mentioned in the previous example. I assume XPath won't be able to extract the data directly from payload, so perhaps I need other filters?

Cheers


#12

ok, I'm now able to extract the xml field from payload using the following filter in the config file:

filter {
  json {
    source => "message"
    target => "parsedMain"
  }
  json {
    source => "[parsedMain][payload][xml]"
    target => "parsedContent"
  }
}

So, now I'm trying to use the json output, which is in this case the target "parsedContent", as a source for xml.

xml {
  store_xml => "false"
  source => "parsedContent"
  remove_namespaces => "false"
  xpath => [
    "/data1/type/text()", "type",
    "/year/content/text()", "year",
  ]
}

It doesn't work though. I assume I'm doing something wrong?

Cheers


(Magnus Bäck) #13

Why are you trying to parse [parsedMain][payload][xml] as JSON? That's where the XML data is.
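For illustration, a minimal sketch of what that could look like. It assumes the raw JSON arrives in the message field and reuses the parsedMain target from the config above; parsedXml is a made-up field name, not something from this thread.

```
filter {
  # Decode the outer JSON document once.
  json {
    source => "message"
    target => "parsedMain"
  }
  # Parse the string in [parsedMain][payload][xml] directly as XML.
  # No second json filter is needed: the value is XML text, not JSON.
  xml {
    source => "[parsedMain][payload][xml]"
    store_xml => true
    # Hypothetical field name for the parsed result.
    target => "parsedXml"
  }
}
```

Decoding the JSON should also take care of the escaped quotes (\") in the XML string, since those backslashes are only JSON string escaping and are not part of the XML itself.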


#14

ok, I removed the json part and am now parsing [parsedMain][payload][xml] as XML, but somehow it ignores the xpath. I mean it doesn't extract the data from the XML.


(Magnus Bäck) #15

Perhaps the XPath query doesn't match the data? Or do you need to set remove_namespaces => true?
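To illustrate the namespace point: when namespaces are kept, an XPath expression has to use the prefix, and the filter has to know what the prefix maps to. A sketch under assumptions: the test prefix maps to http://example.com as in the example event, and the XML is well-formed with test:data1 as its root element.

```
filter {
  xml {
    source => "[parsedMain][payload][xml]"
    store_xml => false
    # Assumed prefix-to-URI mapping, taken from the example event.
    namespaces => {
      "test" => "http://example.com"
    }
    xpath => {
      "/test:data1/@type" => "type"
    }
  }
}
```

The alternative is remove_namespaces => true, after which the same element would be addressed without the prefix, as /data1.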


#16

It doesn't even remove the namespaces.


#17

ok, just to let you know, it is solved. The solution was to use @ instead of text(). So, in my example the xml filter should have been like this:

xml {
  store_xml => "false"
  source => "parsedContent"
  remove_namespaces => "false"
  xpath => [
    "/data1/@type", "type",
    "/year/@content", "year",
  ]
}
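
The reason @ works here is that type and content are attributes in the example XML, not child elements, so there are no text nodes for text() to select. Putting the thread together, a hedged end-to-end sketch follows. It assumes the XML string is well-formed with data1 as its root, reads it from [parsedMain][payload][xml] as suggested earlier in the thread rather than from parsedContent, and sets remove_namespaces to true so that the unprefixed path can match.

```
filter {
  # Decode the outer JSON so that payload.xml becomes an event field.
  json {
    source => "message"
    target => "parsedMain"
  }
  # Parse the XML string and pull out an attribute value with XPath.
  xml {
    source => "[parsedMain][payload][xml]"
    store_xml => false
    remove_namespaces => true
    xpath => {
      "/data1/@type" => "type"
    }
  }
}
```

As far as I know, the xml filter stores XPath matches as an array, so type would end up as a list with one entry per match.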

Cheers

