mandar.raj
(Mandar Betageri)
February 27, 2019, 1:37pm
1
Hello,
I am new to Logstash and have the following requirement.
I want to index individual XML elements. The XML is stored in a column of a database table. I can already index the column that holds the XML, but the requirement is to index particular XML tags.
Could you please help me with an example?
The XML from the database table column looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<alert-header>
<elem name="alertDate">2019-01-10 01:56:43</elem>
<elem name="score">100</elem>
<elem name="alertEntityKey">1539912_029_07/01/2018 </elem>
<elem name="partyType">Entity</elem>
<elem name="partyYOB"/>
<elem name="partyBirthLocation"/>
<elem name="ahData">
<elem name="alertDate">2019-01-10 01:56:43</elem>
</elem>
<elem name="ahData">
<elem name="jobID">01-10-2019</elem>
</elem>
<elem name="ahData">
<elem name="jobName">TEST_PID</elem>
</elem>
<elem name="ahData">
<elem name="jobType">LARGEBATCH</elem>
</elem>
<elem name="ahData">
<elem name="score">100</elem>
</elem>
<elem name="ahData">
<elem name="numberOfHits">7</elem>
</elem>
<elem name="ahData">
<elem name="partyKey">1539912_029_07/01/2018</elem>
</elem>
<elem name="ahData">
<elem name="partySourceId"/>
</elem>
<elem name="ahData">
<elem name="partyName">ISIS IN THE ISLAMIC SAHEL</elem>
</elem>
<elem name="ahData">
<elem name="partyLName">ISIS IN THE ISLAMIC SAHEL</elem>
</elem>
<elem name="ahData">
<elem name="partyAliases"/>
</elem>
<elem name="ahData">
<elem name="alertType">Sanctions</elem>
</elem>
<partyIds/>
<elem name="partyNatCountries">
<elem name="countryCd"/>
</elem>
<elem name="partyAddresses">
<elem name="partyAddressLine1"/>
<elem name="partyAddressLine2"/>
<elem name="partyCity"/>
<elem name="partyPostalCd"/>
<elem name="partyStateProvince"/>
<elem name="countryCd"/>
</elem>
</alert-header>
I want to index on jobID, jobName, etc.
Badger
February 27, 2019, 2:59pm
2
You can parse the XML using
xml { source => "message" target => "[@metadata][XML]" store_xml => true }
The parsed result will look like this:
"XML" => {
  "elem" => [
    [ 0] {
      "name" => "alertDate",
      "content" => "2019-01-10 01:56:43"
    },
    [ 1] {
      "name" => "score",
      "content" => "100"
    },
    [...]
    [ 6] {
      "name" => "ahData",
      "elem" => [
        [0] {
          "name" => "alertDate",
          "content" => "2019-01-10 01:56:43"
        }
      ]
    },
You can use a ruby filter to iterate over the array. If an array entry has both name and content fields, use them to add a field to the event; if the entry has an elem field instead, apply the same check to each item inside it. Something like this:
ruby {
  code => '
    event.get("[@metadata][XML][elem]").each { |x|
      if x["name"] and x["content"]
        event.set(x["name"], x["content"])
      else
        if x["elem"].kind_of?(Array)
          x["elem"].each { |y|
            if y["name"] and y["content"]
              event.set(y["name"], y["content"])
            end
          }
        end
      end
    }
  '
}
Then you may need special handling for some of the fields, but this should get you started.
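To see what the loop does without running Logstash, the same logic can be sketched in plain Ruby. This is only an illustration: `flatten_elems` is a hypothetical helper, the sample array is hand-written to resemble the parsed structure shown above, and a plain Hash stands in for the event.

```ruby
# Collect name => content pairs from the parsed "elem" array.
# Top-level entries are used directly; entries that hold a nested
# "elem" array (the ahData wrappers) are searched one level down.
def flatten_elems(elems)
  fields = {}
  elems.each do |x|
    if x["name"] && x["content"]
      fields[x["name"]] = x["content"]
    elsif x["elem"].is_a?(Array)
      x["elem"].each do |y|
        fields[y["name"]] = y["content"] if y["name"] && y["content"]
      end
    end
  end
  fields
end

# Hand-written sample (an assumption, not captured Logstash output).
sample = [
  { "name" => "alertDate", "content" => "2019-01-10 01:56:43" },
  { "name" => "score", "content" => "100" },
  { "name" => "partyYOB" }, # empty element: no content, so it is skipped
  { "name" => "ahData", "elem" => [{ "name" => "jobName", "content" => "TEST_PID" }] }
]

fields = flatten_elems(sample)
puts fields.inspect
```

Entries without content, such as the empty partyYOB element, produce no field, which is one of the cases that may need the special handling mentioned above.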
mandar.raj
(Mandar Betageri)
February 27, 2019, 3:17pm
3
Thanks for the response.
My current logstash-config.conf is below:
input {
  jdbc {
    #input Configuration
    jdbc_connection_string => "jdbc:oracle:thin:@oraasgtd37-scan.nam.nsroot.net:8889/SID"
    jdbc_user => "admin"
    jdbc_password => "*****"
    jdbc_driver_library => "I:\Jars\ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    statement => "select html_file_key from alerts where deleted =0"
    #use_column_value => true
    #tracking_column => "alert_internal_id"
    #schedule => " * * * * *"
  }
}
output {
  elasticsearch {
    #output configuration
    hosts => "http://localhost:9200"
    index => "alert_index"
    document_type => "alert"
    #document_id => "%{alert_internal_id}"
  }
  stdout {
    codec => rubydebug
  }
}
What changes are required in this to parse the XML?
Note: HTML_FILE_KEY returns the XML.
Badger
February 27, 2019, 3:38pm
4
Add
filter {
  xml { source => "html_file_key" target => "[@metadata][XML]" store_xml => true }
  ruby {
    code => '
      event.get("[@metadata][XML][elem]").each { |x|
        if x["name"] and x["content"]
          event.set(x["name"], x["content"])
        else
          if x["elem"].kind_of?(Array)
            x["elem"].each { |y|
              if y["name"] and y["content"]
                event.set(y["name"], y["content"])
              end
            }
          end
        end
      }
    '
  }
}
mandar.raj
(Mandar Betageri)
February 28, 2019, 6:33am
5
I added this filter as is, but I am getting an error:
[ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `each' for nil:NilClass
Please help me with this; I am not familiar with Ruby.
Badger
February 28, 2019, 2:03pm
6
I suggest changing the output to be
stdout { codec => rubydebug { metadata => true } }
and seeing if the XML was successfully parsed to include [@metadata][XML][elem].
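The "undefined method `each' for nil:NilClass" message means that event.get returned nil, that is, [@metadata][XML][elem] did not exist on that event (for instance because the xml filter did not produce it). A guard in front of the loop avoids the exception. The sketch below runs outside Logstash, so it uses a hypothetical `FakeEvent` class with flat string keys in place of the real event; the tag name `_xml_elem_missing` is also made up for illustration.

```ruby
# Hypothetical stand-in for the Logstash event (flat string keys only),
# so the sketch runs outside Logstash. In a real ruby filter, `event`
# is supplied by Logstash.
class FakeEvent
  attr_reader :fields, :tags

  def initialize(fields = {})
    @fields = fields
    @tags = []
  end

  def get(key)
    @fields[key]
  end

  def set(key, value)
    @fields[key] = value
  end

  def tag(name)
    @tags << name
  end
end

# Same loop as the ruby filter, with a guard for a missing field.
def copy_elems(event)
  elems = event.get("[@metadata][XML][elem]")
  unless elems.is_a?(Array)
    event.tag("_xml_elem_missing") # hypothetical tag name
    return
  end
  elems.each do |x|
    if x["name"] && x["content"]
      event.set(x["name"], x["content"])
    elsif x["elem"].is_a?(Array)
      x["elem"].each do |y|
        event.set(y["name"], y["content"]) if y["name"] && y["content"]
      end
    end
  end
end

missing = FakeEvent.new # no parsed XML on this event
copy_elems(missing)

parsed = FakeEvent.new(
  "[@metadata][XML][elem]" => [
    { "name" => "score", "content" => "100" },
    { "name" => "ahData", "elem" => [{ "name" => "jobName", "content" => "TEST_PID" }] }
  ]
)
copy_elems(parsed)

puts missing.tags.inspect
puts parsed.fields["jobName"]
```

Tagging the event instead of raising keeps the pipeline running and makes the unparsed rows easy to find in the rubydebug output.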
mandar.raj
(Mandar Betageri)
February 28, 2019, 4:23pm
7
Thank you so much!
I am able to parse the XML now, but my requirement is to search by name. How can I write the URI to get alertDate or score from the parsed XML? Could you please help me with this?
Badger
February 28, 2019, 4:36pm
8
That is what the ruby filter does.
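Once the ruby filter has copied each name/content pair onto the event, fields such as alertDate, score and jobName are indexed as ordinary top-level fields, so they can be searched by name with Elasticsearch URI search (a `q=field:value` parameter on the `_search` endpoint). A minimal sketch, assuming the index alert_index and the host localhost:9200 from the config earlier in the thread; `search_uri` is a hypothetical helper that only builds the URL and does not send the request.

```ruby
require "uri"

# Build a URI-search URL of the form /<index>/_search?q=<field>:<value>.
# Host, port and index default to the values used in this thread.
def search_uri(field, value, host: "localhost", port: 9200, index: "alert_index")
  query = URI.encode_www_form(q: "#{field}:#{value}")
  URI::HTTP.build(host: host, port: port, path: "/#{index}/_search", query: query)
end

job_uri = search_uri("jobName", "TEST_PID")
score_uri = search_uri("score", "100")

puts job_uri
puts score_uri
```

The printed URL can be opened in a browser or passed to curl. Values that contain spaces or other query-syntax characters (alertDate, for example) would need quoting or escaping in the query string, which this sketch does not attempt.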
mandar.raj
(Mandar Betageri)
February 28, 2019, 4:50pm
9
But how do I get the content for a given name in Elasticsearch?
Badger
March 6, 2019, 12:52pm
11
What exactly do you not like about the events created by the ruby filter? Please show an event and what you want to change in it.
mandar.raj
(Mandar Betageri)
March 6, 2019, 1:06pm
12
Hi Badger, I am still struggling to add the fields (name => content) to Elasticsearch. I am not familiar with Ruby.
mandar.raj
(Mandar Betageri)
March 6, 2019, 2:09pm
13
Is there another way, using XPath?
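On the XPath question: the xml filter has an xpath option that maps an XPath expression to a destination field, so an expression that selects an elem by its name attribute could be used there instead of the ruby loop. The expressions themselves can be tried outside Logstash. The sketch below uses Ruby's REXML library on a shortened copy of the XML from the first post; REXML is only a convenient way to test the expressions and is not necessarily the XML library Logstash uses, and `text_at` is a hypothetical helper.

```ruby
require "rexml/document"

# Shortened copy of the alert XML from the first post.
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <alert-header>
    <elem name="score">100</elem>
    <elem name="ahData">
      <elem name="jobID">01-10-2019</elem>
    </elem>
    <elem name="ahData">
      <elem name="jobName">TEST_PID</elem>
    </elem>
  </alert-header>
XML

doc = REXML::Document.new(xml)

# Return the text of the first <elem> whose name attribute matches,
# at any depth, or nil when there is no such element.
def text_at(doc, name)
  node = REXML::XPath.first(doc, "//elem[@name='#{name}']")
  node && node.text
end

job_id = text_at(doc, "jobID")
job_name = text_at(doc, "jobName")
score = text_at(doc, "score")
absent = text_at(doc, "noSuchName")

puts [job_id, job_name, score].inspect
```

Because `//elem` matches at any depth, the same expression finds both the top-level entries and those wrapped in ahData. Names that occur more than once in the full document, such as alertDate, match several nodes, and this helper returns only the first.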