grok/KV filter for multiple values

For nginx extended logs, I am using the grok pattern and kv filter below. The values sometimes come as '-', sometimes as real values, and sometimes as a hyphen plus a value.

%{GREEDYDATA:[nginx][access][message]}'

Log lines (3 types):
rt=0.006 uct="-, 0.000" uht="-, 0.006" urt="0.001, 0.006" csi="-"
rt=0.016 uct="-" uht="-" urt="-" csi="-" ua="-" us="-"
rt=0.003 uct="0.000" uht="0.001" urt="0.001" csi="-" ua="-" us="304"

  kv {
    source => [ "[nginx][access][message]" ]
    remove_field => ["[nginx][access][message]" ]
    field_split => " "
    value_split => "="
    trim_key => " "
    trim_value => " "
    target => "kv_temp"
  }
  if [kv_temp] {
    mutate {
      merge => { "[nginx][access]" =>  "kv_temp" }
      remove_field => "kv_temp"
    }
  }

How do I drop any field whose value is a hyphen?
How do I separate the values when a field contains both a hyphen and real values, dropping the hyphen?
How do I convert them to integers or floats?

If you use trim_value => "-, " it will remove the hyphens. You can use mutate+split to get an array from urt="0.001, 0.006". You can use mutate+convert to convert the fields to integer or float. To remove the empty fields you can use ruby:

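    ruby {
        code => '
            # event.get returns nil if kv produced nothing, so fall back to an empty hash
            (event.get("kv_temp") || {}).each { |k, v|
                # after trimming, a value that was only "-" is left as an empty string
                if v == ""
                    event.remove("[kv_temp][#{k}]")
                end
            }
        '
    }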
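
For anyone who wants to see the three steps together (drop hyphens, split comma-separated values, convert to numbers), here is a rough sketch in plain Ruby. It works on an ordinary Hash rather than a Logstash event, and `clean_kv` is a made-up helper name, not something Logstash provides. It also assumes every value should become a float, so a status like 304 would come out as 304.0.

```ruby
# Hypothetical helper (not part of Logstash): cleans a hash of kv-parsed strings.
#  - drops "-" placeholders
#  - splits comma-separated values
#  - converts what is left to Float (assumption: all fields are numeric timings)
def clean_kv(fields)
  cleaned = {}
  fields.each do |key, raw|
    parts = raw.to_s.split(",").map(&:strip)
    parts = parts.reject { |part| part.empty? || part == "-" }
    next if parts.empty?                          # only hyphens: drop the field

    # Keep non-numeric text unchanged instead of raising
    values = parts.map { |part| Float(part) rescue part }
    cleaned[key] = values.length == 1 ? values.first : values
  end
  cleaned
end

sample = {
  "rt"  => "0.006",
  "uct" => "-, 0.000",
  "urt" => "0.001, 0.006",
  "csi" => "-"
}
p clean_kv(sample)
```

Inside a ruby filter the same logic would presumably read the hash with event.get("kv_temp") and write the result back with event.set, but that wiring is not shown or tested here.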

Thanks very much for the suggestion. I ended up using two patterns and a split rather than the kv filter, like this:

us="(%{INT:[nginx][access][upstream][status]:int}|-)"

and, on grok failure,

us="(%{DATA:[nginx][access][upstream][status]}|-)"

Then a split to separate the values

split => [ "[nginx][access][upstream][status]", "," ]

Then a convert to integer, inside the if block for the grok failure.

"[nginx][access][upstream][status]" => "integer"

Everything is working fine now, but could you please modify the ruby code to check all fields and remove any field whose value is a hyphen (-)?

If you make your grok pattern more specific than DATA then it simply will not capture - and you will not have to worry about removing it.
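
To illustrate the point, here is a small sketch using a plain Ruby regex as a stand-in for a grok pattern (grok patterns are compiled down to regular expressions). The names `STATUS` and `upstream_status` are invented for this example. Because the named group only accepts digits, a lone hyphen falls through to the other side of the alternation and nothing is captured for it.

```ruby
# Illustration only: a Ruby regex standing in for a grok pattern.
# The named group matches digits only; "-" matches the alternative
# branch, so the group stays empty and there is nothing to remove later.
STATUS = /us="(?:(?<status>\d+)|-)"/

# Hypothetical helper: returns the status as an Integer, or nil for "-"
def upstream_status(line)
  match = STATUS.match(line)
  return nil if match.nil? || match[:status].nil?

  Integer(match[:status])
end

puts upstream_status('csi="-" ua="-" us="304"').inspect
puts upstream_status('csi="-" ua="-" us="-"').inspect
```

The first call prints 304 and the second prints nil, which corresponds to the field simply not being created in Logstash when the capture does not match.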

How do I apply grok after the split, or to the split values?

What does your data look like and what does your configuration look like?

The data looks like the log lines above,

and the grok pattern for it is

us="(%{INT:[nginx][access][upstream][status]:int}|-)"

or

us="(%{DATA:[nginx][access][upstream][status]}|-)"
