Grok different types of logs in one file


(kim) #1

Hello everyone, I have a file which contains 3 types of log lines:

1 )

X.X.X.X - - [04/May/2018:16:24:21 +0200] "GET /WebServiceRequestProxyCenter/api/request-url?country=fr&url=https%3A%2F%2Fwww.google.fr%2Fsearch%3Fq%3Disolation%2520thermique%26start%3D70%26ie%3DUTF-8%26oe%3DUTF8%26hl%3Dfr&service=new-rescan-rank%2C+siniat%2C+https%3A%2F%2Fwww.siniat.fr%2C+isolation+thermique HTTP/1.1" 200 92866

X.X.X.X - - [04/May/2018:15:57:26 +0200] "GET /WebServiceRequestProxyCenter/api/request-url?service=semantic_analyze_missing%2C+demo14%2C+friteuse+electrique+pas+cher&url=https%3A%2F%2Fwww.google.fr%2Fsearch%3Fq%3Dsite%253Awww.friteuses.info%2Bfriteuse%2Belectrique%2Bpas%2Bcher%26ie%3DUTF-8%26oe%3DUTF8%26hl%3Dfr%26start%3D0%26num%3D10&country=fr HTTP/1.1" 200 104563

X.X.X.X - - [04/May/2018:16:53:53 +0200] "GET /WebServiceRequestProxyCenter/monitor HTTP/1.1" 200 604

(X.X.X.X stands for an IP address.)

For now, I wrote this grok filter:

    filter {
      if [clientip] == "X.X.X.X" {
        grok {
          match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}\?%{DATA:request2}\&%{DATA:request3}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
        }
      }
      if [clientip] == "X.X.X.X" {
        grok {
          match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
        }
      }
      grok {
        match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}\?%{DATA:request2}\&%{DATA:request3}\&%{DATA:request4}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
      }
    }

But this grok is written by hand, and the results are not uniform. For example:

Result for log type 2):

Here we can see that country ends up in the field "request4".

Result for log type 1):

Here we can see that country ends up in the field "request2".

So my question is: is there an automatic way to extract and split the request from all my logs, so that I always get:
request1 = /WebServiceRequestProxyCenter/api/request-url
request2 = service
request3 = url
request4 = country
?


(kim) #2

Any help, please?


#4

For something as highly structured as web server logs I would use dissect rather than grok. Personally I would probably mutate+split the query string to get an array, but...

    dissect { mapping => { "message" => '%{ip} %{ident} %{auth} [%{ts}] "%{method} %{uriAndQs} %{protocol}" %{status} %{bytes}' } }
    if [uriAndQs] !~ /\?/ {
        mutate { rename => { "uriAndQs" => "uri" } }
    } else {
        dissect { mapping => { "uriAndQs" => "%{uri}?%{qs}" } }
        grok {
            match => { "qs" => [
                    "^(?<r1>[^&]+)&(?<r2>[^&]+)&(?<r3>[^&]+)$",
                    "^(?<r1>[^&]+)&(?<r2>[^&]+)$",
                    "^(?<r1>[^&]+)$"
                ]
            }
        }
    }
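
The grok patterns above are still positional: r1, r2 and r3 hold whichever parameter comes first, second and third. If the goal is fields named after the parameters themselves (service, url, country) regardless of their order, the kv filter may be a better fit. What follows is only a minimal sketch, assuming the dissect step above has already populated [qs]; the target name "params" is made up for illustration, and the values stay URL-encoded.

```
filter {
  # Assumption: [qs] was produced by the dissect step above, e.g.
  # "country=fr&url=https%3A%2F%2F...&service=new-rescan-rank%2C+..."
  if [qs] {
    kv {
      source       => "qs"       # field holding the raw query string
      field_split  => "&"        # parameters are separated by "&"
      value_split  => "="        # each parameter is name=value
      include_keys => ["service", "url", "country"]
      target       => "params"   # hypothetical name; results land in [params][...]
    }
  }
}
```

With this, [params][country] should contain "fr" for both of the request-url log lines, whether country comes first or last in the query string. Lines without a query string, such as the /monitor request, never get a [qs] field and skip the kv step.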
