Please suggest a config file for loading XML data into Logstash.

Please don't post screenshots when you can use normal copy/paste of text.

Use a grok or csv filter to separate timestamp and other stuff from the XML payload, then use an xml filter to process the field containing the XML. The exact look of the configuration depends on what you want the end result to look like.
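In outline it could look something like this — a minimal sketch only; the column names are hypothetical and need to match your actual line format:

```
filter {
  # Split the line on "|" first; adjust the columns to your data.
  csv {
    separator => "|"
    columns => ["timestamp", "request", "response"]
  }
  # Then parse the field that actually contains XML.
  xml {
    source => "request"
    target => "request_parsed"
  }
}
```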

Thanks for the reply, Magnus, and sorry for the screenshot.

The XML payload contains 'request' and 'response' parts, which are "|" separated,

for which I have the config file shown below:

input {
  file {
    path => "/opt/test5/practice_new/data.xml"
    start_position => "beginning"
    codec => multiline {
      pattern => "^<soapenv:Envelope>|<soapenv:Envelope>"
      negate => true
      what => "next"
    }
  }
}
filter {
  xml {
    store_xml => false
    source => "req"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    index => "req_res"
    hosts => ["localhost:9200"]
  }
  stdout {}
}

I have not included the timestamp and other fields; I was first trying to load the 'request' and 'response' data into two separate fields.

With the config file above, the whole XML data gets loaded into one field.

My output should look like this :

id - 1499871540
timestamp - 2017-07-12 14:59:00.789398
success - 1
host - htintra.net
status - read

request -
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:web="http://webservices.lookup.sdp.bharti.ibm.com">
......
</soapenv:Envelope>

response -
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
......
</soapenv:Envelope>

Can anyone please reply?

Thanks in advance.


Your multiline configuration is wrong. The logic should be: unless the line begins with something like "1499871540|2017-07-12...", join it with the previous line.

Once that's working, look into using a csv filter to parse the line and split on "|" (and pray that the XML documents don't contain such a character).
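For example, something along these lines — a sketch, assuming every record starts with a numeric epoch id followed by a date:

```
input {
  file {
    path => "/opt/test5/practice_new/data.xml"
    start_position => "beginning"
    codec => multiline {
      # Join any line that does NOT start with "<epoch>|<date>"
      # onto the previous event.
      pattern => "^\d+\|\d{4}-\d{2}-\d{2}"
      negate => true
      what => "previous"
    }
  }
}
```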

Thanks for the reply.

Using the above pattern, the data gets loaded successfully,

but I'm not able to split the data on "|".

The configuration file used for loading is below:

> input {
>   file {
>     path => "/opt/test5/practice_new/data.xml"
>     start_position => "beginning"
>     codec => multiline {
>       pattern => "^1499871540|2017-07-1"
>       negate => true
>       what => "previous"
>     }
>   }
> }
> filter {
>   csv {
>     separator => "|"
>     columns => ["id","timestamp","success","host","status","request","response"]
>   }
> }
> output {
>   stdout { codec => rubydebug }
>   elasticsearch {
>     index => "req_res"
>     hosts => ["localhost:9200"]
>   }
>   stdout {}
> }


What's the problem? If there's a concern that "|" characters could exist inside the XML documents, you could use a grok filter.
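A rough sketch of such a grok filter, assuming the only "|" inside the payload is the one between request and response (the `(?m)` prefix lets the patterns match across the embedded newlines):

```
filter {
  grok {
    match => {
      "message" => "(?m)^%{NUMBER:id}\|%{TIMESTAMP_ISO8601:timestamp}\|%{NUMBER:success}\|%{HOSTNAME:host}\|%{WORD:status}\|%{DATA:request}\|%{GREEDYDATA:response}$"
    }
  }
}
```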

We'll save time if you show us the result of your stdout { codec => rubydebug } output.

> {
>           "path" => "/opt/test5/practice_new/data.xml",
>     "@timestamp" => 2017-07-19T06:05:07.541Z,
>       "@version" => "1",
>           "host" => "monitor.htintra.net",
>        "message" => "1499871540|2017-07-12 14:59:00.789398|1|abinitiosrv.htintra.net|read| <soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:web=\"http://webservices.lookup.sdp.bharti.ibm.com\">\n<soapenv:Header/>\n<soapenv:Body>\n<web:getLookUpServiceDetails>\n<getLookUpService>\n<serviceRequester>iOBD</serviceRequester>\n<lineOfBusiness>mobility</lineOfBusiness>\n<lookupAttribute>\n<searchAttrValue>911425152231426</searchAttrValue>\n</lookupAttribute>\n</getLookUpService>\n</web:getLookUpServiceDetails>\n</soapenv:Body>\n</soapenv:Envelope> | <soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\">\n<soapenv:Body>\n<ns:getLookUpServiceDetailsResponse xmlns:ns=\"http://webservices.lookup.sdp.bharti.ibm.com\">\n<getLookUpServiceReturn>\n<errorInfo>\n<ErrorCode/>\n<ErrorMessage/>\n</errorInfo>\n<lookupResponseList>\n</soapenv:Envelope>",
>           "tags" => [
>         [0] "multiline",
>         [1] "_csvparsefailure"
>     ]
> }

I tried loading the data after removing the XML data from my file; with the same configuration mentioned above, it successfully split the data on "|",

but when I add the XML data back and run it again, the data gets loaded into Kibana but does not split on "|".

Are the special characters in my XML data the reason why it does not get split on "|"?

There is a "|" character in my XML document which divides the data into the request and response fields.

Please reply.

The csv filter can't cope with events that contain embedded newline characters; this is a known issue. You can use a grok filter instead, or possibly a mutate filter and its split option.
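For example, a minimal mutate-based sketch, with field positions assumed from the column list earlier in the thread:

```
filter {
  mutate {
    # Split the raw line into an array on "|". Any "|" inside the XML
    # payloads would also split here, so grok is safer in that case.
    split => { "message" => "|" }
  }
  mutate {
    add_field => {
      "id"       => "%{[message][0]}"
      "request"  => "%{[message][5]}"
      "response" => "%{[message][6]}"
    }
  }
}
```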
