Parse different time duration formats/units to common format

Hi there,

Part of my input json looks as follows:

  "phaseTimes": {
    "authorize": "28.201µs",
    "filter": "27.522068ms",
    "indexScan": "2.004642056s",
    "instantiate": "40.002µs",
    "run": "2.041368619s"
  }

Note the different units, which might be seconds (s), milliseconds (ms), microseconds (µs), and, potentially, minutes (m).

How can I parse these and convert them into a common format, say plain milliseconds or microseconds, without any unit?

So my expected output is something like this (here, formatted to milliseconds):

  "phaseTimes": {
    "authorize": 0.028201,
    "filter": 27.522068,
    "indexScan": 2004.642056,
    "instantiate": 0.040002,
    "run": 2041.368619
  }

You can use a ruby filter to convert each value to the requested format once it has been assigned to a field. As an example, try the filter below for the authorize field (it assumes the value is in microseconds):

ruby {
code => "event.set('authorize', event.get('authorize').to_f / 1000)"
}

The best way will be to use a ruby filter with some code to convert the data.

Can the units change on a per-document basis? For example, will phaseTimes.authorize always be in microseconds, or can it be in milliseconds or seconds as well?

If so, you will need to check the unit in your ruby code to convert it correctly.

Try

    ruby {
        init => '
            Factors = {
                "m" => 60000.0,
                "s" => 1000.0,
                "ms" => 1,
                "µs" => 0.001
            }
        '
        code => '
            times = event.get("phaseTimes")
            if times.is_a? Hash
                times.each { |k, v|
                    suffix = /[[:alpha:]]+$/.match(v).to_s
                    factor = Factors[suffix]
                    if factor
                        times[k] = factor * v.to_f
                    end
                }
                event.set("phaseTimes", times)
            end
        '
    }

which produces

"phaseTimes" => {
      "indexScan" => 2004.6420559999997,
            "run" => 2041.368619,
         "filter" => 27.522068,
      "authorize" => 0.028201,
    "instantiate" => 0.040002
},

.to_f ignores the alpha suffix, so you do not need to split the v into digits and suffix using the regexp, although stylistically that might be better.
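
To illustrate the point about .to_f, here is a small plain-Ruby sketch (not Logstash-specific) of how String#to_f treats trailing text. The compound value "1m30s" is a made-up example; the thread only shows single-unit values, so whether your source ever emits something like it is an assumption worth checking.

```ruby
# String#to_f parses the leading numeric part and ignores whatever follows.
puts "2.004642056s".to_f   # 2.004642056
puts "27.522068ms".to_f    # 27.522068
puts "28.201µs".to_f       # 28.201

# A string with no leading number converts to 0.0 instead of raising.
puts "ms".to_f             # 0.0

# Caveat: a compound value would be read only up to the first unit.
puts "1m30s".to_f          # 1.0

# Float() is strict and raises ArgumentError on the suffix.
begin
  Float("28.201µs")
rescue ArgumentError
  puts "Float() rejects the unit suffix"
end
```

So relying on .to_f is fine as long as every value is a single number followed by a single unit.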


Wow, this is brilliant. Thanks, all, for pointing me in the right direction, and thanks for the code, @Badger.

I ended up coding the following, but your solution seems more elegant.

# Convert a duration string such as "28.201µs" to milliseconds.
# Returns nil when the unit suffix is not recognized.
def str_to_ms(val)
  # Check "ms" and "µs" before "s", since both also end in "s".
  if val.end_with?("ms")
    val[0..-3].to_f
  elsif val.end_with?("µs")
    val[0..-3].to_f / 1000
  elsif val.end_with?("s")
    val[0..-2].to_f * 1000
  elsif val.end_with?("m")
    val[0..-2].to_f * 1000 * 60
  elsif val.end_with?("h")
    val[0..-2].to_f * 1000 * 60 * 60
  end
end

def filter(event)
  fields = ["authorize", "filter", "indexScan", "instantiate", "run"]
  fields.each do |key|
    val = event.get(key)
    # Skip missing or non-string values.
    next unless val.is_a?(String)

    ms = str_to_ms(val)
    # Leave the field untouched if the unit was not recognized.
    event.set(key, ms) unless ms.nil?
  end

  [event]
end
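
Either approach can be sanity-checked outside Logstash. Below is a minimal plain-Ruby sketch of the table-driven lookup, run against the sample values from the question. FACTORS_MS and to_ms are names made up for this illustration; they are not part of Logstash.

```ruby
# Milliseconds per unit, mirroring the Factors table from the ruby filter.
FACTORS_MS = {
  "m"  => 60_000.0,
  "s"  => 1_000.0,
  "ms" => 1.0,
  "µs" => 0.001
}.freeze

# Hypothetical helper: returns the duration in milliseconds,
# or nil when the trailing unit is not in FACTORS_MS.
def to_ms(value)
  suffix = value[/[[:alpha:]]+$/]   # trailing letters, e.g. "ms" or "µs"
  factor = FACTORS_MS[suffix]
  factor && factor * value.to_f     # to_f ignores the unit suffix
end

# Sample values copied from the question.
sample = {
  "authorize"   => "28.201µs",
  "filter"      => "27.522068ms",
  "indexScan"   => "2.004642056s",
  "instantiate" => "40.002µs",
  "run"         => "2.041368619s"
}

converted = sample.transform_values { |v| to_ms(v) }
converted.each { |key, ms| puts "#{key}: #{ms}" }
```

Expect small floating-point artifacts in the printed values, like the 2004.6420559999997 shown earlier for indexScan.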
