Hello everyone, I have a file that contains 3 types of log lines:
1)
X.X.X.X - - [04/May/2018:16:24:21 +0200] "GET /WebServiceRequestProxyCenter/api/request-url?country=fr&url=https%3A%2F%2Fwww.google.fr%2Fsearch%3Fq%3Disolation%2520thermique%26start%3D70%26ie%3DUTF-8%26oe%3DUTF8%26hl%3Dfr&service=new-rescan-rank%2C+siniat%2C+https%3A%2F%2Fwww.siniat.fr%2C+isolation+thermique HTTP/1.1" 200 92866
2)
X.X.X.X - - [04/May/2018:15:57:26 +0200] "GET /WebServiceRequestProxyCenter/api/request-url?service=semantic_analyze_missing%2C+demo14%2C+friteuse+electrique+pas+cher&url=https%3A%2F%2Fwww.google.fr%2Fsearch%3Fq%3Dsite%253Awww.friteuses.info%2Bfriteuse%2Belectrique%2Bpas%2Bcher%26ie%3DUTF-8%26oe%3DUTF8%26hl%3Dfr%26start%3D0%26num%3D10&country=fr HTTP/1.1" 200 104563
3)
X.X.X.X - - [04/May/2018:16:53:53 +0200] "GET /WebServiceRequestProxyCenter/monitor HTTP/1.1" 200 604
(X.X.X.X stands for an IP address.)
So far, I have written this grok filter:
filter {
  # Attempt 1: path plus two query-string chunks
  if [clientip] == "X.X.X.X" {
    grok {
      match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}\?%{DATA:request2}\&%{DATA:request3}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
    }
  }
  # Attempt 2: path only, no query string
  if [clientip] == "X.X.X.X" {
    grok {
      match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
    }
  }
  # Attempt 3: path plus three query-string chunks
  grok {
    match => { "message" => ['%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request}\?%{DATA:request2}&%{DATA:request3}&%{DATA:request4}HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int})'] }
  }
}
But this grok is written by hand, and the result is not uniform because the fields are captured by position. For example:
Result for log type 2):
Here we can see that country is assigned to the field "request4".
Result for log type 1):
Here we can see that country is assigned to the field "request2".
So my question is: is there an automatic way to extract and split the request in all my logs so that, whatever the order of the parameters, I always get the following?
request1 = /WebServiceRequestProxyCenter/api/request-url
request2 = service
request3 = url
request4 = country
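
The desired result can be illustrated with a minimal Python sketch. This is only an illustration of the target output, not Logstash configuration: the helper name split_request and the shortened sample lines are hypothetical, and the splitting relies on Python's standard urllib.parse, which keys each parameter by name instead of by position.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches the quoted request part of an access-log line: "VERB target HTTP/x.y"
REQUEST_RE = re.compile(r'"(?P<verb>[A-Z]+) (?P<target>\S+) HTTP/(?P<version>[\d.]+)"')


def split_request(line):
    """Return the request path plus its query parameters keyed by name.

    Hypothetical helper for illustration; returns None if the line has no request part.
    """
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    parts = urlsplit(match.group("target"))
    # parse_qs decodes percent-escapes and '+'; keep the first value of each parameter
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return {"request1": parts.path, **params}


# Shortened sample lines modeled on the three log types (values are illustrative)
line_a = ('X.X.X.X - - [04/May/2018:16:24:21 +0200] "GET /WebServiceRequestProxyCenter/api/request-url'
          '?country=fr&url=https%3A%2F%2Fwww.google.fr&service=demo HTTP/1.1" 200 92866')
line_b = ('X.X.X.X - - [04/May/2018:15:57:26 +0200] "GET /WebServiceRequestProxyCenter/api/request-url'
          '?service=demo&url=https%3A%2F%2Fwww.google.fr&country=fr HTTP/1.1" 200 104563')
line_c = 'X.X.X.X - - [04/May/2018:16:53:53 +0200] "GET /WebServiceRequestProxyCenter/monitor HTTP/1.1" 200 604'

for line in (line_a, line_b, line_c):
    print(split_request(line))
```

Because each parameter is looked up by name, country lands in the same field whether it comes first or last in the query string, and a line without a query string simply yields the path alone.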