Using kv to produce fields with every word capitalized

I am using kv to parse a log that contains comma-separated key-value pairs. However, the keys may contain multiple space-separated words with inconsistent capitalization, and I'm trying to bring the resulting field names in line with a defined naming convention: no spaces, every word capitalized. For example:

Input:
Field One: value 1, Field two: value 2, Field three: "value 3"
Output:

"FieldOne": "value 1",
"FieldTwo": "value 2",
"FieldThree": "value 3"

Right now, I remove the spaces in the field names using kv, which leaves me with inconsistently capitalized field names. Whatever does not fit the convention, I rename using mutate:

    kv {
        source => "message"
        target => "processed_message"

        field_split => ","
        value_split_pattern => ": "
        # Most keys already capitalize every word, which is the chosen naming convention;
        # "capitalize" only works for the first word and breaks the rest.
        #transform_key => "capitalize"
        remove_char_key => "- "
        trim_value => "\""
    }

    mutate {
        rename => { ... }
    }

I've been trying to find a universal solution for this case, since new fields may appear, and I don't want to amend the config every time. Is there a right way to do this with available plugins, or can it only be done with Ruby code?
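
For reference, the behaviour described in the config comment above matches Ruby's own String#capitalize, which upcases the first character and downcases everything else. A quick sketch in plain Ruby, run outside Logstash, and assuming kv's "capitalize" transform applies the same method (the variable names are only for illustration):

```ruby
# String#capitalize upcases the first character and downcases the rest,
# so a key that already follows the convention loses its inner capitals.
already_ok = 'FieldTwo'.capitalize
with_space = 'Field two'.capitalize

puts already_ok   # Fieldtwo
puts with_space   # Field two
```

So capitalizing has to happen per word, before the words are joined together.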

Yes, I think that will require a ruby filter.

Dear future people,

This is the code that I came up with. Works well so far. Feel free to use, but do mind that I do not know Ruby at all.

filter {
    # Parse out comma-separated key-value pairs.
    # Commas are escaped by quoting the string; kv seems to handle this if trim_value is set, but this is not documented.
    # Space removal and capitalization are handled by the ruby code below.
    kv {
        source => "message"
        target => "processed_message"

        field_split => ", "
        value_split_pattern => ": "
        trim_value => "\""

        remove_field => [ "message" ]
    }

    # Enforce field naming convention for the target container:
    #  Alphanumeric characters only, every word capitalized.
    ruby {
        code => "
            # Explicit begin/rescue/end so the handler does not depend on how the
            # plugin wraps this code string.
            begin
                # Only work on target container fields
                target_container = '[processed_message]'

                keys = event.get(target_container).to_hash.keys
                keys.each { |key|
                    # Split on all non-alphanumeric characters - spaces, dashes, brackets, etc.
                    # Should skip the loop if the array is empty.
                    words = key.split(/[^[:alnum:]]+/)
                    words.each_index { |i|
                        words[i].capitalize!
                    }

                    # You should use words.flatten.join('') to account for nested fields,
                    #  but this should never happen, so I'd rather avoid the overhead.
                    new_key = words.join('')

                    # Nothing to rename if the key is unchanged or has no alphanumeric characters
                    next if new_key.empty? || new_key == key

                    # Rename the resulting field
                    event.set(
                        target_container + '[' + new_key + ']',
                        event.remove(target_container + '[' + key + ']')
                    )
                }
            # Exception handling (StandardError rather than Exception, so that
            # interrupts and other fatal errors are not swallowed)
            rescue StandardError => e
                event.set('logstash_ruby_exception', 'capitalize: ' + e.message)
            end
        "
    }
}
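
The renaming steps can also be tried outside Logstash. Below is a minimal sketch in plain Ruby that applies the same split, capitalize and join steps to an ordinary Hash. The helper names normalize_key and normalize_keys and the sample data are made up for illustration; they are not part of Logstash or of the config above.

```ruby
# Hypothetical helper (not a Logstash API): split a key on runs of
# non-alphanumeric characters, capitalize each word, and join the words.
def normalize_key(key)
  key.split(/[^[:alnum:]]+/).reject(&:empty?).map(&:capitalize).join
end

# Hypothetical helper: build a new Hash whose keys follow the convention.
# If two keys normalize to the same name, the later one wins.
def normalize_keys(hash)
  hash.each_with_object({}) do |(key, value), out|
    out[normalize_key(key)] = value
  end
end

# Sample data modelled on the example in the question.
sample = {
  'Field One'   => 'value 1',
  'Field two'   => 'value 2',
  'field-three' => 'value 3'
}

p normalize_keys(sample)
```

One thing to keep in mind: because capitalize lowercases everything after the first character of each word, a key that arrives without any separator, such as FieldOne, comes out as Fieldone. That should not matter as long as kv leaves the spaces in the keys, as it does in the config above.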
