Parse ISO 8601 duration format (PT(n)H(n)M(n)S)

Hi!

I have fields in ISO 8601 duration format, P(n)Y(n)M(n)DT(n)H(n)M(n)S, such as

"duration"=>"PT0H0M0S"

I have to parse it to get the duration in seconds or in HH:MM:SS format.

How can I do this with a Logstash filter?

Thank You!

You can use grok to parse it

grok { match => { "duration" => "^P(%{NUMBER:[@metadata][duration][y]}Y)?(%{NUMBER:[@metadata][duration][m]}M)?(%{NUMBER:[@metadata][duration][d]}D)?(T(%{NUMBER:[@metadata][duration][h]}H)?(%{NUMBER:[@metadata][duration][min]}M)?(%{NUMBER:[@metadata][duration][sec]}S)?)?" } }

If a group is surrounded by ()? then it can occur zero or one times, i.e. it is optional. Each of the six fields is parsed by an expression like

(%{NUMBER:[@metadata][duration][y]}Y)?

Plus the Time part is optional, so there is an extra ()? around that.

Note that this pattern also accepts an invalid specification like "P" with no fields defined. Also, NUMBER does not allow a comma in numbers, so it will not parse the hours from "PT0,3H", but "PT0.333H" is OK. If you need to accept commas it gets more complicated.
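To see how the optional groups behave outside Logstash, here is a minimal plain-Ruby sketch of the same idea. The names `NUM`, `DURATION_RE` and `parse_fields` are hypothetical, and `NUM` is only a simplified stand-in for grok's NUMBER pattern, not its actual definition.

```ruby
# Simplified stand-in for grok's NUMBER (assumption: digits with an optional
# decimal part; the real grok pattern is more permissive).
NUM = /\d+(?:\.\d+)?/

# Same shape as the grok expression: each field is wrapped in (...)?,
# and the whole time part after T is wrapped in one more (...)?.
DURATION_RE = /^P(?:(?<y>#{NUM})Y)?(?:(?<m>#{NUM})M)?(?:(?<d>#{NUM})D)?(?:T(?:(?<h>#{NUM})H)?(?:(?<min>#{NUM})M)?(?:(?<sec>#{NUM})S)?)?/

# Hypothetical helper: returns the captured fields as a hash,
# or nil if the string does not start with P at all.
def parse_fields(text)
  match = DURATION_RE.match(text)
  return nil unless match
  match.named_captures.compact
end
```

With this sketch, a bare "P" still matches but yields an empty hash, and a value with a decimal comma matches without capturing the hours, which mirrors the two caveats above.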

No duration that includes years, months or days can be unambiguously converted to seconds (think leap seconds and leap years) unless it is anchored to form a time interval. To do the conversion for the T fields you can use

    ruby {
        code => '
            hours = event.get("[@metadata][duration][h]").to_f
            mins = event.get("[@metadata][duration][min]").to_f
            secs = event.get("[@metadata][duration][sec]").to_f
            event.set("durationInSecs", 3600 * hours + 60 * mins + 1 * secs)
        '
    }
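The question also asked for HH:MM:SS. As a rough sketch in plain Ruby (the helper names `duration_in_secs` and `to_hhmmss` are hypothetical, not part of Logstash), the same arithmetic can be followed by a formatting step. This assumes fractional seconds may be truncated and that hours should not wrap at 24.

```ruby
# Hypothetical helper: total seconds from the time fields only.
# nil.to_f is 0.0, so a field that was not captured counts as zero,
# just as in the filter above.
def duration_in_secs(hours, mins, secs)
  3600 * hours.to_f + 60 * mins.to_f + secs.to_f
end

# Hypothetical helper: format a number of seconds as HH:MM:SS.
# Fractional seconds are truncated; hours keep counting past 24.
def to_hhmmss(total_secs)
  total = total_secs.to_i
  format("%02d:%02d:%02d", total / 3600, (total % 3600) / 60, total % 60)
end
```

Inside the ruby filter, the formatted string could be stored with another `event.set` call under a field name of your choosing.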

I get "tags"=>["_grokparsefailure"] with the grok code.

My data structure looks like the example below (duration is a dynamic field):

{
    "@timestamp" => 2022-06-08T06:33:28.295820400Z,
      "id" => "xxxxx",
      "duration" => "PT0H0M0S",
        "complement" => "pl"
}
{
    "@timestamp" => 2022-06-08T06:33:28.295820400Z,
      "id" => "xxxxx",
      "end-position" => "PT0H0M1S",
        "complement" => "pl"
}
{
    "@timestamp" => 2022-06-08T06:33:28.295820400Z,
      "id" => "xxxxx",
      "position" => "PT0H0M0S",
        "complement" => "pm"
}
...

Even with the trace option, I get no indication of what causes this tag.

The problem seems to be the ^P and the T.
I used this modification:

grok { match => { "duration" => "(%{GREEDYDATA:[@metadata][duration][PT]}PT)?(%{NUMBER:[@metadata][duration][h]}H)?(%{NUMBER:[@metadata][duration][min]}M)?(%{NUMBER:[@metadata][duration][sec]}S)?" } }

and it works.

Thank you!!!

Not sure why you needed to change it. It works for me.

      "duration" => "PT0H0M1S",
     "@metadata" => {
    "duration" => {
          "h" => "0",
        "sec" => "1",
        "min" => "0"
    }
},
"durationInSecs" => 1.0
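One possible, unconfirmed reason for an anchored pattern failing while an unanchored one matches is an extra character before the P, such as whitespace or a byte-order mark. A minimal plain-Ruby sketch for checking a value follows; `suspicious_chars` is a hypothetical helper name.

```ruby
# Hypothetical helper: list every character that is not a visible ASCII
# character, together with its position and code point.
def suspicious_chars(text)
  text.each_char.with_index
      .reject { |ch, _| ch.match?(/[\x21-\x7E]/) }
      .map { |ch, i| format("index %d: U+%04X", i, ch.ord) }
end
```

An empty result means the value contains only visible ASCII characters, so the cause would have to be elsewhere.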

I just tried again with another field, but it's the same problem. Maybe it is something in my data.

Thank You very much!!!
