Kv filter killing logstash performance (II) [Solved]

monitoring

(Javier Leal) #1

Hello. I have exactly the same problem as the one described in "KV filter killing logstash performance", but I am opening a new thread because I cannot reply to that one.

With the KV filter enabled, CPU usage climbs to 100% after a few hours. With it disabled, CPU usage goes back to normal, so disabling the KV filter alone is enough to make everything work fine.

This is my KV filter:

  kv {
    include_keys => [ "HTTP_REFERER", "REQUEST_URI" ]
    # Key and value may be separated by "=" or ":", e.g.
    # "HTTP_REFERER=the-referer.html" or "HTTP_REFERER:the-referer.html"
    value_split_pattern => "=|:"
  }

The same key can appear up to three times in one message, each time in a different format, as follows:

REQUEST_URI=/the/url.html
Request_Uri: /the/url.html
[REQUEST_URI] => /the/url.html

(Javier Leal) #2

Is there any way to replace the KV filter with a grok filter?


(Lewis Barclay) #3

There is, but it is not as nice. What are the specs of the machine?


(Javier Leal) #4

I'm trying this grok filter:

  grok {
    match => { 
      "message" => [
        "REQUEST_URI=%{URIPATHPARAM:request_uri}"
      ]
    }
  }

but it doesn't work either, for example on this message:

Stack trace:
#0 /var/www/framework/yii-1.1.19/yiilite.php(1695): CWebApplication->runController('articulos/cienc...')
#1 /var/www/framework/yii-1.1.19/yiilite.php(1212): CWebApplication->processRequest()
#2 /var/www/protected/components/WebApplicationEndBehavior.php(27): CApplication->run()
#3 /var/www/framework/yii-1.1.19/yiilite.php(709): WebApplicationEndBehavior->runEnd('front')
#4 /var/www/index.php(34): CComponent->__call('runEnd', Array)
#5 {main}
REQUEST_URI=/articulos/ciencia-y-tecnologia/t1326/test?
---
IP: 2a01:7e01:0:0:f03c:91ff:fefb:bsde

(Lewis Barclay) #5

You would need to post your full log lines.


(Javier Leal) #6

Thanks @Eniqmatic for your help, but it is already solved.
I'm posting the solution for future reference:

patterns_dir:

# Dates: "[2018-12-31 12:12:12]", "2018-12-31 12:12:12", "[2018/12/31 12:12:12]", "2018/12/31 12:12:12", each optionally prefixed with "----"
TIMESTAMP_1 ^(?:----)?(?:\[)?\d{4}[\/-](?:\d{2}|\w{3})[\/-]\d{2}\s\d{2}:\d{2}:\d{2}
# Date: [24-Apr-2018 12:12:12]
TIMESTAMP_2 ^(?:----)?\[\d{2}-\w{3}-\d{4}\s\d{2}:\d{2}:\d{2}\]
# Matches either TIMESTAMP_1 or TIMESTAMP_2
VARIOS_TIMESTAMPS %{TIMESTAMP_1}|%{TIMESTAMP_2}

filter:

  # Grok matches must be split into multiple entries. These grok filters replace the KV filter.
  grok {
    patterns_dir => ["/etc/logstash/conf.d/patrones-grok"]
    match => { 
       "message" => "%{VARIOS_TIMESTAMPS:tempLogTimestamp}"
    }
  }
  # These grok filters replace the KV filter, which was degrading CPU performance.
  if [message] =~ /(?i)\bREQUEST_URI\b/ {
    grok {
      match => { 
        "message" => "REQUEST_URI=%{URIPATHPARAM:request_uri}"
      }
    }  
  }
  if [message] =~ /(?i)\bIP\b/ {
    grok {
      match => { 
        "message" => "IP:\s?%{IP:ip}"
      }
    }  
  }
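
One thing to watch in the filter above: the conditional matches REQUEST_URI case-insensitively, but the grok pattern only matches the literal "REQUEST_URI=" form. The "Request_Uri: ..." and "[REQUEST_URI] => ..." variants from post #1 would pass the conditional and then fail the match, and grok tags such events with _grokparsefailure by default. A minimal sketch, not from the original solution, that lists one pattern per variant. The separators are assumed from the three sample lines in post #1, and it has not been run against the real logs:

```
filter {
  if [message] =~ /(?i)\bREQUEST_URI\b/ {
    grok {
      # Patterns are tried in order; by default grok stops at the first match.
      match => {
        "message" => [
          "REQUEST_URI=%{URIPATHPARAM:request_uri}",
          "(?i)Request_Uri:\s?%{URIPATHPARAM:request_uri}",
          "\[REQUEST_URI\] => %{URIPATHPARAM:request_uri}"
        ]
      }
    }
  }
}
```

Because grok stops at the first pattern that matches, a message containing several variants still yields a single request_uri value.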

(Javier Leal) #7

With the KV filter nested in an if conditional, CPU usage stayed normal:

  if [message] =~ /(?i)\bREQUEST_URI\b/ {
    grok {
      match => {
        "message" => "REQUEST_URI=%{URIPATHPARAM:request_uri}"
      }
    }
  }
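
The snippet above wraps the grok filter. For the KV filter from post #1, the same guard might look like the sketch below. It only combines the kv settings and the conditional already posted in this thread. The HTTP_REFERER alternative in the regex is an assumption, added so that either included key can trigger the filter, and the CPU effect of this exact variant was not measured here:

```
filter {
  # Run kv only on events that mention one of the wanted keys.
  if [message] =~ /(?i)\b(?:REQUEST_URI|HTTP_REFERER)\b/ {
    kv {
      include_keys => [ "HTTP_REFERER", "REQUEST_URI" ]
      value_split_pattern => "=|:"
    }
  }
}
```

The idea is that events without either key never reach the kv filter, so it is not asked to split messages (such as long stack traces) that contain nothing of interest.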