How to dynamically replace - with _ in nested field names

I'm trying to perform a field name replacement dynamically.
I do not know ahead of time how many fields there will be, nor what they are.

I found this in an Elasticsearch pipeline, but I need to do exactly the same thing in Logstash:

# Converts all kebab-case key names to snake_case
  - script:
      lang: painless
      source: >-
        ctx.juniper.srx = ctx?.juniper?.srx.entrySet().stream().collect(Collectors.toMap(e -> e.getKey().replace('-', '_'), e -> e.getValue()));

Basically, I need to rename:

myfield.field-1 --> myfield.field_1
myfield.field-1-1 --> myfield.field_1_1
myfield.field-2 --> myfield.field_2
myfield.field-n --> myfield.field_n

etc

Hello, I faced a similar case where I had to change a field name, but using the reindex API. You can't rename a field in place once the index is created, but I can think of 3 approaches:

  1. Reindex into a new index with the appropriate field name. The script would be something like this (I didn't test it for your case):
POST _reindex
{
  "source": {
    "index": "sourceindex"
  },
  "dest": {
    "index": "new-index"
  },
  "script": {
    "source": """
            if (ctx._source.myfield == null) {
              ctx._source.myfield = new HashMap();
            }

            if (ctx._source.myfield.containsKey('field-1')) {
              ctx._source.myfield.field_1 = ctx._source.myfield.remove('field-1');
            } else {
              ctx._source.myfield.field_1 = "null";
            }
            """,
    "lang": "painless"
  }
}
  2. Use update_by_query (I tested it and it worked):
POST my_index/_update_by_query
{
  "script": {
    "source": """
      ctx._source.myfield.field_1 = ctx._source.myfield.remove('field-1');
      ctx._source.myfield.field_2 = ctx._source.myfield.remove('field-2'); // etc.
"""
  }
}

test:

# indexed a sample document
POST my_index/_doc
{
  "myfield": {
    "field-1": "value 1"
  }
}
# and use update by query
POST my_index/_update_by_query
{
  "script": {
    "source": "ctx._source.myfield.field_1 = ctx._source.myfield.remove('field-1');"
  }
}
  3. Update the mapping of your original index and re-index your data from the data source or the pipeline that feeds your index.

Hope it helps,
Marwane.

This is an interesting approach.
Unfortunately my output is not Elasticsearch but Microsoft Defender, so I need to do this within Logstash.

Furthermore, since I need to handle a high rate of traffic, that option would likely not scale in terms of speed.

I was thinking that a ruby filter would do the job, but I don't know Ruby :pensive:

Yes, you would need a ruby function that recursively modifies field names. You could try

    ruby {
        init => '
            def doSomething(object, name, keys, event)
                if object
                    if object.kind_of?(Hash) and object != {}
                        object.each { |k, v| doSomething(v, "#{name}[#{k}]", keys, event) }
                    elsif object.kind_of?(Array) and object != []
                        object.each_index { |i|
                            doSomething(object[i], "#{name}[#{i}]", keys, event)
                        }
                    else
                        if name =~ /-1\]$/
                            newName = name.gsub(/-1\]$/, "_1]")
                            event.set(newName, event.remove(name))
                        end
                    end
                end
            end
        '
        code => 'event.to_hash.each { |k, v| doSomething(v, "[#{k}]", @field, event) }'
    }
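To see the recursion on its own, outside Logstash, here is a minimal plain-Ruby sketch. It is not the filter above: it assumes the data is an ordinary nested Hash/Array, it replaces every "-" in a key (not only a trailing "-1"), and it returns a renamed copy instead of calling event.set and event.remove. The name rename_keys is hypothetical.

```ruby
# Hypothetical helper: returns a copy of a nested structure with every
# "-" in a Hash key replaced by "_". Array elements are walked as well.
def rename_keys(object)
  case object
  when Hash
    object.each_with_object({}) do |(key, value), renamed|
      renamed[key.to_s.tr("-", "_")] = rename_keys(value)
    end
  when Array
    object.map { |item| rename_keys(item) }
  else
    # Leaf value (string, number, nil, ...): returned unchanged.
    object
  end
end

sample = {
  "myfield" => {
    "field-1" => "a",
    "field-1-1" => "b",
    "list" => [{ "x-y" => 1 }]
  }
}
renamed = rename_keys(sample)
puts renamed.inspect
```

The original hash is left untouched, which makes this version easy to try in irb before wiring anything into a pipeline.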

I see. I don't know what the Microsoft Defender output looks like, but while you are processing the data with Logstash you can use the mutate filter plugin, like this:

input {
  # your input..
}

filter {
  mutate {
    # nested fields are referenced as [parent][child], not parent.child
    rename => {
      "[myfield][field-1]" => "[myfield][field_1]"
      "[myfield][field-2]" => "[myfield][field_2]"
    }
  }
}

output {
  # your output..
}

Keep in mind that mutate filter plugin modifies the event data in place, so it will overwrite the original field names. If you want to keep the original field names, you can use the clone filter plugin to create a copy of the event before modifying it.

To be honest, personally I try to avoid ruby as much as possible to keep the Logstash pipelines simple and easy to read, but since in your case it's only about updating field names, maybe do something like this:

filter {
  ruby {
    code => "
      event.set('[myfield][field_1]', event.get('[myfield][field-1]'))
      event.set('[myfield][field_2]', event.get('[myfield][field-2]'))
     # removing fields with '-'
      event.remove('[myfield][field-1]')
      event.remove('[myfield][field-2]')
    "
  }
}

That would absolutely be great, but as I stated, I don't know which fields or how many I will have, except that in the best case there will be 10 fields and in the worst case 60...

I'm receiving data as field1=value field2=value etc., which I break up with kv.

Thanks, let me try this out!

I see, in this case @Badger's answer is much better. I didn't think about such an approach :smile:

I'm trying to understand the script, but I'm missing where I define the root field the script looks for.

I.e. With

myfield.field-1

Where do I define "myfield"?

You do not have to. This iterates over every field in the event

    code => 'event.to_hash.each { |k, v| doSomething(v, "[#{k}]", @field, event) }'

If you only want to process [myfield] then you could try

    code => 'doSomething(event.get("myfield"), "[myfield]", "", event)'

Note that the third argument to doSomething() is not used.
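If it is unclear which field names the recursion ends up with, the sketch below mimics only the path-building part of doSomething on a plain Hash and collects the field references of the leaf values. No Logstash event is involved, and leaf_paths is a hypothetical name used for illustration.

```ruby
# Hypothetical helper mirroring how doSomething builds field references:
# each Hash key or Array index is appended as "[...]" to the parent name.
def leaf_paths(object, name, paths = [])
  # Like the filter's "if object" guard: nil/false values are skipped.
  return paths unless object

  if object.is_a?(Hash) && !object.empty?
    object.each { |k, v| leaf_paths(v, "#{name}[#{k}]", paths) }
  elsif object.is_a?(Array) && !object.empty?
    object.each_index { |i| leaf_paths(object[i], "#{name}[#{i}]", paths) }
  else
    # Leaf reached: this is the name the filter would test for "-".
    paths << name
  end
  paths
end

data = { "field-1" => "a", "nested" => { "field-2" => ["x", "y"] } }
paths = leaf_paths(data, "[myfield]")
puts paths
```

Starting the walk at "[myfield]" is what restricts the processing to that subtree; the root name is simply the prefix passed as the second argument.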


Perfect. The other change I needed to make was the following:

                      if name =~ /-/
                          newName = name.gsub(/-/, "_")

Now it works as expected. The full code:

  ruby {
    init => '
      def doSomething(object, name, keys, event)
        if object
          if object.kind_of?(Hash) and object != {}
            object.each { |k, v| doSomething(v, "#{name}[#{k}]", keys, event) }
          elsif object.kind_of?(Array) and object != []
            object.each_index { |i|
              doSomething(object[i], "#{name}[#{i}]", keys, event)
            }
          else
            if name =~ /-/
              newName = name.gsub(/-/, "_")
              event.set(newName, event.remove(name))
            end
          end
        end
      end
    '
    code => 'doSomething(event.get("[myfield]"), "[myfield]", "", event)'
  }
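For anyone who wants to try the rename logic without a running Logstash, here is a rough plain-Ruby sketch. FakeEvent is a hypothetical stand-in that only imitates get, set and remove for "[a][b]" paths through nested Hashes (no Array indexes); the real Logstash event behaves differently in its details. rename_hyphens is a hypothetical name for the same logic as the init block, with the unused third argument dropped.

```ruby
# Hypothetical stand-in for the Logstash event, just enough to exercise
# the rename logic. Only "[a][b]" paths through nested Hashes are supported.
class FakeEvent
  def initialize(data)
    @data = data
  end

  # Returns a deep copy so callers can iterate while @data is modified.
  def get(ref)
    value = keys(ref).reduce(@data) { |node, k| node.is_a?(Hash) ? node[k] : nil }
    Marshal.load(Marshal.dump(value))
  end

  def set(ref, value)
    *parents, last = keys(ref)
    node = parents.reduce(@data) { |n, k| n[k] ||= {} }
    node[last] = value
  end

  def remove(ref)
    *parents, last = keys(ref)
    node = parents.reduce(@data) { |n, k| n.is_a?(Hash) ? n[k] : nil }
    node.is_a?(Hash) ? node.delete(last) : nil
  end

  private

  # "[myfield][field-1]" -> ["myfield", "field-1"]
  def keys(ref)
    ref.scan(/\[([^\]]+)\]/).flatten
  end
end

# Same walk as the filter: recurse into Hashes and Arrays, and rename a
# leaf whenever its full field reference contains a "-".
def rename_hyphens(object, name, event)
  return unless object

  if object.is_a?(Hash) && !object.empty?
    object.each { |k, v| rename_hyphens(v, "#{name}[#{k}]", event) }
  elsif object.is_a?(Array) && !object.empty?
    object.each_index { |i| rename_hyphens(object[i], "#{name}[#{i}]", event) }
  elsif name.include?("-")
    event.set(name.tr("-", "_"), event.remove(name))
  end
end

event = FakeEvent.new({ "myfield" => { "field-1" => "a", "field-1-1" => "b", "field_ok" => "c" } })
rename_hyphens(event.get("[myfield]"), "[myfield]", event)
result = event.get("[myfield]")
puts result.inspect
```

Because the replacement is applied to the whole field reference, a hyphen in a parent name would be rewritten too, which is worth keeping in mind if the root field itself contains a "-".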
