Parsing an XML file returns chunks of the message instead of the whole message


(Aashish Chauhan) #1

Hi,

While parsing an XML file I'm getting the error "Only String and Array types are splittable. field:[drug] is of type = NilClass". As far as I understand, this error occurs because the XML parser returns the message in small chunks of the drug element rather than the whole element.
My file looks like:
<drugbank>
  <drug>
    <name>Lepirudin</name>
    <description> Some text </description>
    ..
    ..
    570 lines
    ..
  </drug>
  <drug>
    <name>Cetuximab</name>
    <description> Some text </description>
    ..
    ..
    2740 lines
    ..
  </drug>
</drugbank>

If I take a sample document with only two child elements, i.e. name and description, I get the correct result. If I parse the original document, I get the error, because the parser returns half of the element's data in one event and the other half in another event, like this:
{
message: "<drug><name>Lepirudin</name><description> Some text </description>
..
..
200 lines
..<some-attribute>"
},
{
message: "start with previous half data ..
..
370 lines
..
..</drug>"
}

My configuration file looks like:
input {
  file {
    path => "path/sampledrugbank.xml"
    type => "drugbank"
    start_position => beginning
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^<?drugbank.*>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    target => "xmldata"
    store_xml => "false"
    xpath => [ "/drugbank/drug", "drug" ]
  }
  mutate { remove_field => [ "message", "inxml", "xmldata" ] }
  split { field => "[drug]" }
  xml {
    source => "drug"
    store_xml => "false"
    xpath => [ "/drug/name/text()", "name" ]
    xpath => [ "/drug/description/text()", "description" ]
  }
  mutate {
    replace => {
      "name" => "%{[name][0]}"
      "description" => "%{[description][0]}"
    }
  }
  mutate { remove_field => [ "drug" ] }
}
output {
  stdout { codec => rubydebug }
}
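
The "Only String and Array types are splittable" error means `[drug]` was never set on the event, which is consistent with the first `xml` filter finding no `/drugbank/drug` node in a truncated chunk. A minimal sketch of guarding the split, assuming it is acceptable to tag such events and let them pass through; the tag name `drug_missing` is hypothetical:

```
filter {
  # Stands in for the bare `split { field => "[drug]" }` line above.
  if [drug] {
    # [drug] exists, so the XPath matched and the field can be split.
    split { field => "[drug]" }
  } else {
    # Hypothetical tag marking events where the XPath matched nothing.
    mutate { add_tag => [ "drug_missing" ] }
  }
}
```

This does not recover the missing data; it only keeps incomplete events from reaching `split`.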

With only two elements this works fine. With the original document the pipeline fails with the error above and also prints the following tags in the output:

       "tags" => [
    [0] "multiline",
    [1] "multiline_codec_max_lines_reached",
    [2] "_split_type_failure"
    ]
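
For context on these tags: `multiline_codec_max_lines_reached` is added by the multiline codec when an event is cut off at its `max_lines` limit (500 lines by default; `max_bytes` defaults to 10 MiB), which would explain a drug element arriving in pieces. A minimal sketch of the input section with raised limits, assuming the whole file fits comfortably in memory; the values `100000`, `"100 MiB"` and the 2-second `auto_flush_interval` are illustrative guesses, not tuned settings:

```
input {
  file {
    path => "path/sampledrugbank.xml"
    type => "drugbank"
    start_position => beginning
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "^<?drugbank.*>"
      negate => true
      what => "previous"
      # Assumed value: must exceed the total number of lines in the file.
      max_lines => 100000
      # Assumed value: must exceed the size of the file.
      max_bytes => "100 MiB"
      # Assumed value: flush the pending event after 2 seconds without new
      # lines, since no later line matches the pattern to close the event.
      auto_flush_interval => 2
    }
  }
}
```

With limits this high the entire document becomes a single event, so memory use grows with the file size.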

Any suggestions would be appreciated.

Thanks

