Continuously import XML files from a directory


(Sandro B.) #1

Hello everyone,

I am new to Elastic, so please forgive me if my questions are not well formed.
I set up a Kibana->Elasticsearch->Logstash stack to store and analyze the logs of a call server, which are sent to Logstash via syslog. This works pretty well and, even though I still have to figure out how to build useful queries in Kibana, I am OK for now.
This call server also produces Call Detail Records, which are basically small XML files, each one describing one call, that the call server drops in a directory at hangup.
I would like Logstash to continuously import these files into ES.
I tried a file input plugin (with a *.xml file pattern) and a

xml {
  source => "message"
}

filter, but this does not seem to work.
Can anyone point me to the right direction?

Thank you in advance,
SB


(Robert Toellner) #2

Begin by setting a target in the xml filter (i.e. target => "my_xml").


(Magnus Bäck) #3

Are the XML documents spread over multiple lines in each file? That's going to make it a bit tricky. Also, Logstash has no support for deleting or moving a file after all of it has been processed, which sounds like something that would be useful for you.


(Sandro B.) #4

Yes, unfortunately each XML file is formatted to be human-readable, which I understand is a problem in this case.
Could the fact that every XML file has a unique name be useful to me?

However, I understand that Logstash is "biased" toward logfiles, which are usually organized as rows in a single file. I wonder whether I am using the right tool for this job, or whether there is another way to import these files into an Elasticsearch DB.

Thanks for helping.


(Magnus Bäck) #5

I think you can make good use of Logstash, but perhaps not for the initial XML file parsing. I'd write a small script that reads these files and converts their contents to a format that suits Logstash better. To avoid double-processing of the files they would be deleted or renamed after they've been handed off to Logstash (possibly via another file).
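
A minimal sketch of such a script, in Python. Nothing here comes from the call server itself: the function names, the ".done" suffix and the flat record layout (one child element per field) are all assumptions for illustration. It writes one JSON line per CDR to a file that Logstash can tail, and renames each XML file once it has been handed off so it is not picked up twice.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def cdr_to_dict(xml_path):
    """Flatten the direct children of the root element into a dict.

    Assumes a flat CDR layout (hypothetical); nested elements are ignored.
    """
    root = ET.parse(xml_path).getroot()
    record = {child.tag: (child.text or "").strip() for child in root}
    record["source_file"] = Path(xml_path).name
    return record


def hand_off(cdr_dir, out_file):
    """Append one JSON line per CDR to out_file, then rename the CDR.

    Returns the number of files handed off in this run.
    """
    processed = 0
    with open(out_file, "a", encoding="utf-8") as out:
        for xml_path in sorted(Path(cdr_dir).glob("*.xml")):
            try:
                record = cdr_to_dict(xml_path)
            except ET.ParseError:
                # Possibly still being written; leave it for the next run.
                continue
            out.write(json.dumps(record) + "\n")
            out.flush()
            # After the rename the *.xml glob no longer matches this file.
            xml_path.rename(xml_path.with_name(xml_path.name + ".done"))
            processed += 1
    return processed
```

Run from cron, something like hand_off("/path/to/cdrs", "/path/to/cdr.log") would do (both paths are placeholders), with a Logstash file input reading the output file using a json codec.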


(Sandro B.) #6

Maybe a bash cron job that removes the newlines from each XML file and appends the one-line XML output to a log file, and then have Logstash work on that?

Converting the XML docs to CSV would be a lot more work...


(Magnus Bäck) #7

Maybe a bash cron job that removes the newlines from each XML file and appends the one-line XML output to a log file, and then have Logstash work on that?

Yes, something like that should be enough.
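
For illustration, the same idea sketched in Python rather than bash. The function names and the ".done" suffix are made up for this example, and the regex approach assumes simple pretty-printed documents; whitespace inside CDATA sections or mixed content would be altered as well. Each document ends up on one line of the log file, so the xml filter can parse it from "message".

```python
import re
from pathlib import Path


def flatten_xml(text):
    """Collapse a pretty-printed XML document onto a single line."""
    # Drop the whitespace between one tag's '>' and the next tag's '<'.
    text = re.sub(r">\s+<", "><", text.strip())
    # Any newline still left sits inside text content; make it a space.
    return re.sub(r"\s*[\r\n]+\s*", " ", text)


def append_flattened(cdr_dir, log_file):
    """Append every *.xml file in cdr_dir to log_file, one line each.

    Files are renamed afterwards so the next run skips them.
    Returns the number of files appended.
    """
    count = 0
    with open(log_file, "a", encoding="utf-8") as log:
        for xml_path in sorted(Path(cdr_dir).glob("*.xml")):
            log.write(flatten_xml(xml_path.read_text(encoding="utf-8")) + "\n")
            xml_path.rename(xml_path.with_name(xml_path.name + ".done"))
            count += 1
    return count
```

One thing to watch for: a file that the call server is still writing when the job runs would be appended incomplete, so some check for that (a minimum file age, for example) would be a sensible addition.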

