input {
  jdbc {
    jdbc_driver_library => "/home/brucewayne/software/sqljdbc_4.2/enu/jre8/sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://wayneenterprises:9999;databaseName=jokerfiles"
    jdbc_user => "bruce"
    jdbc_password => "rachel"
    statement => "SELECT t.ID AS id, t.ResultXML AS resultXML FROM blah blah"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
# If you want to add a filter, you can add one here
filter {
  xml {
    source => "%{resultXML}"
    target => "parsed"
  }
  #split {
  #  field => "parsed[SingleResult]"
  #}
}
output {
  elasticsearch {
    hosts => "myelastichost:9292"
    index => "testdatabase"
    document_id => "%{id}"
    document_type => "demo"
    manage_template => true
  }
  stdout { codec => rubydebug }
}
Data does get populated, but I think I've made a mistake in the xml part.
If I look at the data in JSON format in Kibana, the whole XML appears as a string under _source against the key resultXML. My understanding is that the xml filter takes a field that contains XML and expands it into an actual data structure. What I have here is the whole XML as a plain string.
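If that understanding is right, I would have expected a filter along these lines to do the expansion; this is just a sketch of what I was aiming for, with source naming the field directly rather than using a %{...} sprintf reference (I'm not sure which form the option expects):

filter {
  xml {
    # field that holds the raw XML string from the jdbc input
    source => "resultXML"
    # field under which the parsed structure should appear
    target => "parsed"
  }
}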
Where's the parsed field where the parsed XML should've been stored? Where's the id field that your jdbc input is also creating alongside resultXML? Are you really using the configuration you've posted?
Yes, but there was no id field in the Kibana screenshot you posted.
This is what I see in logstash output.
Yeah, there's no sign of the xml filter running at all. If the parsing fails it should add a tag. Reproducing your configuration but with another input works fine:
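(The exact reproduction isn't included in this thread; a minimal stand-in along these lines, using a generator input in place of jdbc and a made-up one-element XML document, is one way to test the filter in isolation. The field and document contents here are assumptions for illustration.)

input {
  generator {
    count => 1
    message => "<SingleResult><Name>test</Name></SingleResult>"
  }
}
filter {
  # copy the generated message into a resultXML field, mimicking the jdbc column
  mutate {
    add_field => { "resultXML" => "%{message}" }
  }
  xml {
    source => "resultXML"
    target => "parsed"
  }
}
output {
  stdout { codec => rubydebug }
}

With something like this, the rubydebug output should show a parsed field (or an _xmlparsefailure tag if parsing fails), which makes it easy to see whether the filter is running at all.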
Does it matter how the XML arrives? Straight from the field it comes as a single string, without any line breaks or formatting whatsoever, when I use the jdbc input.