Multiple matches in Grok filter

Hello!

I am trying to parse search requests sent to one of my Elasticsearch databases, and I don't know what to do when a search with two or more words comes in.
Config:
input {
  beats {
    port => 5044
  }
}
filter {
  if "bulk" in [url][path] {
    drop { }
  }
  if "search" in [request] {
    grok {
      match => { '[http][request][body][content]' => '(?<search_pattern>(?<=query\": \")\w{1,})'}
    }
    grok {
      match => { "[url][path]" => "/(?.*)shared
.\d_.\d/"}
    }
    mutate {
      update => { "query_full" => "{%{query_body}" }
    }
  }
}
output {
  if "search" in [request] and "ignore_unmapped" not in [query_body]{
    elasticsearch {
      hosts => "rndtfsstgtools2:9200"
    }
  }
}

Request body:
POST /workitemsearchshared_0_2/workItemContract/_count?routing=f97e12fe-a1c6-41e6-b004-8bcec4a02473%2Cef42fcd5-a256-4df5-b832-ddc971a538ca&terminate_after=51 HTTP/1.1\r\nAccept: application/json\r\nContent-Type: application/json\r\nAuthorization: Basic ZWxhc3RpY3VzZXI6TGFtYTNsZXI3\r\nHost: rndtfsstgsearch:9200\r\nContent-Length: 4169\r\nConnection: Keep-Alive\r\n\r\n{"query":{\r\n "bool": {\r\n "must": {\r\n "bool": {\r\n "must": [\r\n {\r\n "multi_match": {\r\n "query": "credentials",\r\n "fields": [\r\n "fields.str","fields.str.stemmed","fields.html","fields.html.stemmed","fields.path","fields.str|title|system$","fields.str|title|system$.stemmed","fields.str|tags|system$","fields.str|tags|system$.stemmed","fields.path|areapath|system$","fields.path|areapath|system$.stemmed","fields.path|iterationpath|system$","fields.path|iterationpath|system$.stemmed","fields.str|state|system$","fields.str|state|system$.stemmed","fields.str|assignedto|system$","fields.str|assignedto|system$.stemmed","fields.str|createdby|system$","fields.str|createdby|system$.stemmed","fields.html|description|system$","fields.html|description|system$.stemmed","fields.html|history|system$","fields.html|history|system$.stemmed","fields.html|reprosteps|microsoft>vsts>tcm$","fields.html|reprosteps|microsoft>vsts>tcm$.stemmed","fields.html|steps|microsoft>vsts>tcm$","fields.html|steps|microsoft>vsts>tcm$.stemmed","fields.str|title|system$^10","fields.str|title|system$.stemmed^10","fields.html|description|system$^5","fields.html|description|system$.stemmed^5","fields.str|assignedto|system$^4"\r\n ],\r\n "type": "phrase"\r\n }\r\n },\r\n {\r\n "multi_match": {\r\n "query": "password",\r\n "fields": [\r\n 
"fields.str","fields.str.stemmed","fields.html","fields.html.stemmed","fields.path","fields.str|title|system$","fields.str|title|system$.stemmed","fields.str|tags|system$","fields.str|tags|system$.stemmed","fields.path|areapath|system$","fields.path|areapath|system$.stemmed","fields.path|iterationpath|system$","fields.path|iterationpath|system$.stemmed","fields.str|state|system$","fields.str|state|system$.stemmed","fields.str|assignedto|system$","fields.str|assignedto|system$.stemmed","fields.str|createdby|system$","fields.str|createdby|system$.stemmed","fields.html|description|system$","fields.html|description|system$.stemmed","fields.html|history|system$","fields.html|history|system$.stemmed","fields.html|reprosteps|microsoft>vsts>tcm$","fields.html|reprosteps|microsoft>vsts>tcm$.stemmed","fields.html|steps|microsoft>vsts>tcm$","fields.html|steps|microsoft>vsts>tcm$.stemmed","fields.str|title|system$^10","fields.str|title|system$.stemmed^10","fields.html|description|system$^5","fields.html|description|system$.stemmed^5","fields.str|assignedto|system$^4"\r\n ],\r\n "type": "phrase"\r\n }\r\n.....

Search query:

Matches:

How can I resolve this with a regex or in Logstash, or do I need a different approach to solve my issue? And please excuse my English.

So you are trying to parse the request body with Logstash?
As you can see, the query is JSON, beginning at "{"query":{". You can grok that part out and parse it as JSON. That would take a fair amount of logic, though, so you could also do it like this:

if [field_with_query] =~ /credentials/ and [field_with_query] =~ /password/ {
    # use whatever filter you like here, e.g. mutate, to add tags or fields
}
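
For illustration, here is a minimal sketch of that conditional filled in for the request shown above. It assumes the request body is in [http][request][body][content], as in the original config, and the tag name is made up for the example:

```
filter {
  # Both words must appear somewhere in the raw request body.
  if [http][request][body][content] =~ /credentials/ and [http][request][body][content] =~ /password/ {
    mutate {
      # Hypothetical tag name, pick whatever suits your dashboards.
      add_tag => [ "credentials_password_search" ]
    }
  }
}
```

This only works when the search words are known in advance, so it is a way to flag specific searches rather than to extract arbitrary ones.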

OK, maybe I misunderstood you.
In that case you can try using a ruby filter to iterate over the queries. I don't really know how to do it myself, but I know there is an example somewhere in this forum.
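
A rough sketch of what such a ruby filter might look like, not a tested solution. It assumes the body has already been parsed by a json filter into a field called true_json (a name chosen for the example) and that the search words sit in multi_match clauses under query.bool.must.bool.must, as in the request shown above. The output fields search_terms and search_phrase are made-up names:

```
filter {
  if "search" in [request] {
    json {
      source => "[http][request][body][content]"
      target => "true_json"
    }
    ruby {
      code => '
        # The "must" array holds one multi_match clause per search word.
        clauses = event.get("[true_json][query][bool][must][bool][must]")
        if clauses.is_a?(Array)
          # Keep the "query" string of every multi_match clause, skip anything else.
          terms = clauses.map { |clause| clause.is_a?(Hash) ? clause.dig("multi_match", "query") : nil }.compact
          event.set("search_terms", terms)
          event.set("search_phrase", terms.join(" "))
        end
      '
    }
  }
}
```

For the request above this should leave the individual words in search_terms and the words joined by spaces in search_phrase, regardless of how many words were searched for.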

Or you can write several grok patterns: one that captures 10 queries, then one for 9, then one for 8, and so on.
It is horrible and ugly, but it works as long as the number of queries has a fixed upper limit :wink:
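
A sketch of that idea for up to three search words, assuming the field name from the original config. The capture names term1 to term3 are made up for the example. Grok tries the patterns in order and, with the default break_on_match, stops at the first one that matches, so the longest pattern has to come first:

```
filter {
  grok {
    match => {
      "[http][request][body][content]" => [
        # (?m) lets "." match across the line breaks inside the body.
        '(?m)"query":\s*"(?<term1>[^"]+)".*?"query":\s*"(?<term2>[^"]+)".*?"query":\s*"(?<term3>[^"]+)"',
        '(?m)"query":\s*"(?<term1>[^"]+)".*?"query":\s*"(?<term2>[^"]+)"',
        '"query":\s*"(?<term1>[^"]+)"'
      ]
    }
  }
}
```

Every additional search word needs another, longer pattern at the top of the list, which is why this only suits a small, fixed maximum.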

Hello, and thank you for the very fast response!

I think you understood me correctly, and I tried what you suggested in your first answer.
I changed my filter:

filter {
  if "bulk" in [url][path] {
    drop { }
  }
  if "search" in [request] {
    grok {
      match => { "[url][path]" => "/(?.*)shared
.\d_.\d/"}
    }
    json {
      source => "[http][request][body][content]"
      target => "true_json"
    }
    mutate {
      add_field => { "Query_body" => "%{[true_json][query][bool][must][bool][must]}" }
    }
    mutate {
      add_field => { "Repositories" => "%{[true_json][query][bool][filter]}" }
    }
  }
}

And it works, just like you said.
It's not perfect, but I can work with it (screenshot 2019-05-28_10-06-57).

Thanks a lot for your help!
