Removing fields with certain value (or if mapping is not found) from the output

(PB) #1

Hello Guys,

I have a Logstash pipeline that pulls XML files. Each XML file is split as suggested by @Badger in the topic "Turning one .xml into multiple events where not all values are always filled".
Each field produced by the split is then mapped to a separate field with mutate, for example:

add_field => { "TEST_ID" => "%{[theXML][PRODUCT][0][GROUP][TEST][ID]}" }
add_field => { "TEST_NAME" => "%{[theXML][PRODUCT][0][GROUP][TEST][NAME]}" }
add_field => { "TEST_VALUE" => "%{[theXML][PRODUCT][0][GROUP][TEST][VALUE]}" }
add_field => { "TEST_STATUS" => "%{[theXML][PRODUCT][0][GROUP][TEST][STATUS]}" }

Unfortunately, each of those fields is optional, which means that they are not always there. In that case I end up with something like this in my index:

"TEST_ID": "1"
"TEST_NAME": "%{[theXML][PRODUCT][0][GROUP][TEST][NAME]}"
"TEST_VALUE": "102.66"
"TEST_STATUS": "%{[theXML][PRODUCT][0][GROUP][TEST][STATUS]}"

Is it possible to set a rule saying that if a field's value starts with %{[theXML] (that is, the mapping was not found in the source), the field should be removed from the output? How could I handle this in the most efficient way?

Thanks in advance!

#2
ruby {
    code => '
        # Remove every top-level field whose value is still an
        # unresolved reference into [theXML].
        event.to_hash.each { |k, v|
            if v.to_s.start_with?("%{[theXML]")
                event.remove(k)
            end
        }
    '
}

Related topic: "Mutate add_field gives sprint f format for nil values. How to solve this?"

(PB) #3

Thank you @Badger. Seems to be working.
One more thing: is it possible to do the same with empty fields (those with the value "") within the same Ruby script? I'm not familiar with Ruby at all, hence the question.

#4

Yes, you could just change the test to

 if v == "" or v.to_s.start_with?("%{[theXML]")
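
Put together, the whole filter might look like the sketch below. It only inspects top-level fields, and it assumes every unresolved reference begins with %{[theXML], as in the add_field lines from the first post:

```
ruby {
    code => '
        # Remove top-level fields that are empty strings or that still
        # hold an unresolved reference into [theXML].
        event.to_hash.each { |k, v|
            if v == "" or v.to_s.start_with?("%{[theXML]")
                event.remove(k)
            end
        }
    '
}
```
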
(PB) #5

Perfect, thank you.

(PB) #6

@Badger, a slightly less related question: is it possible to run a Ruby script that removes fields based on a given string/regex? I'm asking because I would like to remove all the fields left over from the initial XML split, that is, all [theXML] fields, including nested ones. There are plenty of those.
I can filter them out using Kibana's Source Filters, but I think it would be more efficient not to send them at all. I tried prune, but it removes all the fields before the mapping, so I cannot use them.
Thanks in advance.

#7

If you want to remove [theXML] and all its sub-fields then just

mutate { remove_field => [ "theXML" ] }

Alternatively, when you are creating fields that you know you are going to remove later, just make them subfields of [@metadata]
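
A minimal sketch of the [@metadata] approach, assuming the XML is parsed by the xml filter from the message field (the source and target values here are illustrative, not taken from the pipeline above). Fields under [@metadata] are not included in output by default, so nothing needs to be removed afterwards:

```
filter {
    xml {
        source => "message"                # assumed input field
        target => "[@metadata][theXML]"    # parsed XML is kept under @metadata
    }
    mutate {
        add_field => { "TEST_ID" => "%{[@metadata][theXML][PRODUCT][0][GROUP][TEST][ID]}" }
    }
}
```

With this layout an unresolved reference starts with %{[@metadata][theXML], so the prefix tested in the ruby filter would have to change accordingly.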

(PB) #8

@Badger

I cannot thank you enough. I owe you more than one beer at this point :)