How to load an XML file into Elasticsearch using Logstash?


(Shankarananth) #1

Hi,

I'm very new to ELK.
Kindly find my data sample in the attachment.
How can I load that XML data using Logstash?
I used the configuration below to load it.

input {
  stdin {
    type => "stdin-type"
  }

  file {
    type => "file"
    path => "D:\splunk_shankar\02-02-2016\data2"
  }
}

output {
  elasticsearch {
    action => "index"
    protocol => "http"
    host => "localhost"
    index => "dataloadxmltest"
  }
  stdout {}
}


(Magnus Bäck) #2

Is the XML snippet on a line of its own? And should all other lines be ignored?


(Shankarananth) #3

Hi magnusbaeck,

Could you make that a bit clearer? I'm not getting your point.
What do I need to add in the filter section to extract the data from the above image?

I need the data as shown below:

MessageId | date_month | date_second | date_wday | timestartpos
H2H_SEND_MONEY_STORE_WGHHTWI800T_NNCI6UACJ01961_2016-02-02-053758628319 | february | 0 | tuesday | 123
Regards,
Shankar


(Magnus Bäck) #4

Oh, you've edited the question.

The difficulty here isn't parsing the XML itself; that's easily done with an xml filter. The hard part is getting the whole XML file into a single event. You need to set start_position => beginning for the file input, and you need a multiline codec to join the lines of the file.
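
For example, something roughly along these lines (the path is only a placeholder and the multiline pattern is just a guess; adjust it to match where a new document actually starts in your file):

input {
  file {
    # Placeholder path; forward slashes also work on Windows.
    path => "D:/path/to/your/file.xml"
    start_position => "beginning"
    codec => multiline {
      # Append every line that does not start a new <result> element to the
      # previous line, so the whole XML document becomes a single event.
      pattern => "^<result"
      negate => true
      what => "previous"
    }
  }
}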

This has been brought up here several times in the past so please search the archives.


(Shankarananth) #5

Hi magnusbaeck,

Thank you very much.
I will go through the archives.

Thanks,
Shankar


(Shankarananth) #6

Hi Magnusbaeck,

I'm going through the archives.
I have put together the configuration below, but sometimes it runs in Logstash and sometimes it doesn't.
Could you kindly check it and point out what the error in the code is?

input {
  file {
    path => "D:\splunk_shankar\02-02-2016\data22.xml"
    start_position => beginning
  }
}

filter {
  multiline {
    pattern => "^\s|</result>|^[A-Za-z0-1].*"
    what => "previous"
  }
  xml {
    store_xml => "false"
    source => "message"
    target => "doc"
    xpath => [
      "/result/@field/@value/@text", "result_text",
      "/result/@field/@value/@text", "result_text",
      "/result/@field/@value/@text", "result_text",
      "/result/@field/@value/@text", "result_text",
      "/result/@field/@value/@text", "result_text",
      "/result/@field/@value/@text", "result_text"
    ]
  }
}

output {
  elasticsearch {
    action => "index"
    protocol => "http"
    host => "localhost"
    index => "showme"
  }
  stdout {}
}


(Magnus Bäck) #7

That's most likely related to Logstash's tracking of the current position in a file via the sincedb file. start_position => beginning only matters the first time a particular file is seen. You can set sincedb_path => "/dev/null" to disable this feature.
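
For example (the path below is just a placeholder; on Windows, "NUL" is the usual stand-in for /dev/null):

input {
  file {
    # Placeholder path for illustration.
    path => "D:/path/to/your/file.xml"
    start_position => "beginning"
    # With the sincedb pointed at the null device, Logstash forgets its position
    # and re-reads the file from the beginning on every run.
    sincedb_path => "/dev/null"
  }
}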

