I'm using the Logstash xml filter to parse XML values into JSON, but indexing the result into Elasticsearch fails because an empty XML element, which should be a null or empty string, gets parsed into an empty object.
Source string snippet:
<Actions Context="Author">
<Exec>
<Command>%localappdata%\Microsoft\OneDrive\OneDriveStandaloneUpdater.exe</Command>
<Arguments />
</Exec>
</Actions>
Arguments here is a string when populated, but the xml filter plugin treats the empty element as an empty object. When this data is sent to Elasticsearch, the ignore_malformed index mapping setting does not ignore objects when the expected type is a string / keyword.
My current workaround is to strip nested empty objects out of the parsed XML with some Ruby code. I can't figure out how to turn this into a function that can be called to loop through all the key/value pairs in the hash, so any advice on how to write this code better would be greatly appreciated.
Ruby Code:
xml = event.get("[winlog][event_data][TaskContentXml]")
key_path = "[winlog][event_data][TaskContentXml]"
xml.each do |key,value|
  key_path_1 = key_path + "[" + key + "]"
  ## loop 1
  if !value.nil? && value.is_a?(Hash)
    if value.empty?
      logger.warn("TaskContentXml1: Empty Hash value at key: " + key_path_1)
      event.remove(key_path_1)
    else
      ## loop 2
      value.each do |key,value|
        key_path_2 = key_path_1 + "[" + key + "]"
        if !value.nil? && value.is_a?(Hash)
          if value.empty?
            logger.warn("TaskContentXml2: Empty Hash value at key: " + key_path_2)
            event.remove(key_path_2)
          else
            ## loop 3
            value.each do |key,value|
              key_path_3 = key_path_2 + "[" + key + "]"
              if !value.nil? && value.is_a?(Hash)
                if value.empty?
                  logger.warn("TaskContentXml3: Empty Hash value at key: " + key_path_3)
                  event.remove(key_path_3)
                else
                  ## loop 4
                  value.each do |key,value|
                    key_path_4 = key_path_3 + "[" + key + "]"
                    if !value.nil? && value.is_a?(Hash)
                      if value.empty?
                        logger.warn("TaskContentXml4: Empty Hash value at key: " + key_path_4)
                        event.remove(key_path_4)
                      else
                        ## loop again
                      end
                    end
                  end
                end
              end
            end
          end
        end
      end
    end
  end
end
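One way the nested loops above might collapse into a single recursive function is sketched below. It is only a sketch under assumptions: empty_hash_paths is a hypothetical helper name, not part of Logstash, and it works on a plain Ruby Hash so it can be tried outside a pipeline. Like the loops above, it does not descend into arrays and does not remove a parent that becomes empty once its children are gone.

```ruby
# Hypothetical helper (not a Logstash API): walks a nested Hash and returns
# the field-reference path of every empty Hash found at any depth.
def empty_hash_paths(hash, base_path)
  paths = []
  hash.each do |key, value|
    path = "#{base_path}[#{key}]"
    next unless value.is_a?(Hash)   # strings, numbers, arrays and nil are skipped
    if value.empty?
      paths << path                 # candidate for event.remove(path)
    else
      paths.concat(empty_hash_paths(value, path))  # recurse one level deeper
    end
  end
  paths
end

# Sample data shaped like the <Actions> snippet above (assumed parse result).
sample = {
  "Actions" => {
    "Context" => "Author",
    "Exec" => {
      "Command" => '%localappdata%\Microsoft\OneDrive\OneDriveStandaloneUpdater.exe',
      "Arguments" => {}
    }
  }
}

root = "[winlog][event_data][TaskContentXml]"
p empty_hash_paths(sample, root)
# prints ["[winlog][event_data][TaskContentXml][Actions][Exec][Arguments]"]
```

Inside the ruby filter, each returned path could then be passed to logger.warn and event.remove as in the code above. Where the helper gets defined (for example the filter's init option or a script file) is an assumption to check against your Logstash version.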
Related issue: "Invalid mapping case not handled by index.mapping.ignore_malformed", Issue #12366 in elastic/elasticsearch on GitHub