Configure two types of data in a Logstash filter

This is my first log line:

IP Adress - - [01/Jul/2019:15:36:10 +0300] "POST /search?page=1&page_size=25 HTTP/1.1" 200 78 "https://try.com/search/?contains=pflichtteilauskunft" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0" "IP Adress" "DE"

This is my second log line:

IP Adress - - [01/Jul/2019:10:31:41 +0300] "POST /search?page=2&page_size=100 HTTP/1.1" 200 3163 "https://try.com/search/?on_sale=y&price_max=3000&sale_type=1,3&extension=com&length=1-15&hyphen=n&number=n&idn=n&cdate_min=19971105&sort=cdate_a&page_s=100&page=2" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" "IP Adress" "BD"

I want to separate these two kinds of lines into different fields. For example, for the first line I want to extract the values that come after "try.com/search/?": I want a "contains" field whose value is "pflichtteilauskunft".

For the second line I want to split the values that come after "try.com/search/?" on the "&" sign. I need an on_sale field with its value, a price_max field with its value, and so on.

This is my filter for now:

filter {
  if [method] == "POST" {
    grok {
      match => {
        "message" => '%{DATA:user_name} \[%{HTTPDATE:time_local}\] "%{WORD:method} %{DATA:type}?%{WORD:page}=%{DATA:page_size}&%{DATA: referrer_page_size}=%{DATA:total_page} HTTP/%{NUMBER:http_version}" %{NUMBER:response_code} (?:%{NUMBER:bytes}|-) "%{DATA:connection}//%{DATA:search}/%{DATA:try2}/\?%{DATA:typeof_query}=%{DATA:looked_for}" "%{DATA:agent}" "%{DATA:http_x_forwarded_for}" "%{DATA:country}"'
      }
    }
  }
}

First off, a disclaimer: I am very new to this, but I think I may have somewhere for you to start. My current project takes me into this very same territory, and I have had some luck using the "NOTSPACE" pattern within grok. In the Grok Debugger, I used the following pattern:

`\"%{NOTSPACE}contains=%{WORD:pflich}`

which returns:

{
  "NOTSPACE": [
    "https://try.com/search/?"
  ],
  "pflich": [
    "pflichtteilauskunft"
  ]
}

It's not exactly what you wanted, but I figured I would offer what I have learned so far.
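
To make the limitation concrete, here is a rough Python sketch of the same pattern. It is only an approximation, assuming `NOTSPACE` behaves like `\S+` and `WORD` like `\b\w+\b`; the function name `extract_contains` and the shortened sample lines are made up for illustration.

```python
import re

# Approximate regex equivalent of: \"%{NOTSPACE}contains=%{WORD:pflich}
# Assumption: NOTSPACE ~ \S+ and WORD ~ \b\w+\b.
PATTERN = re.compile(r'"(?P<prefix>\S+)contains=(?P<pflich>\b\w+\b)')


def extract_contains(line):
    """Return the value of the `contains` parameter, or None if it is absent."""
    match = PATTERN.search(line)
    return match.group("pflich") if match else None


# Shortened versions of the two log lines from the question.
first = '"POST /search?page=1&page_size=25 HTTP/1.1" 200 78 "https://try.com/search/?contains=pflichtteilauskunft"'
second = '"POST /search?page=2&page_size=100 HTTP/1.1" 200 3163 "https://try.com/search/?on_sale=y&price_max=3000"'

print(extract_contains(first))   # pflichtteilauskunft
print(extract_contains(second))  # None
```

The pattern only finds a hard-coded `contains` parameter, so it does not help with the second line, where the parameter names vary.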

Happy Groking!!

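I would suggest parsing the line with the stock pattern, capturing everything after the "?" in the referrer, and then splitting that with a kv filter: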

    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
    grok { match => { "referrer" => '\?%{GREEDYDATA:[@metadata][args]}"' } }
    kv { source => "[@metadata][args]" field_split => "&" }
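
As a rough illustration of what the second grok and the kv filter should produce, here is a small Python sketch of the same splitting logic. It is not how Logstash implements it; the function name `split_referrer_args` is hypothetical, and it assumes the referrer field still carries its surrounding double quotes.

```python
def split_referrer_args(referrer):
    """Mimic the grok + kv steps: keep what follows the first '?',
    then split on '&' into pairs and on '=' into key and value."""
    # Drop the surrounding quotes, then keep everything after the first '?'.
    args = referrer.strip('"').partition("?")[2]
    fields = {}
    for pair in args.split("&"):
        key, sep, value = pair.partition("=")
        if sep:  # ignore pieces that have no '=' in them
            fields[key] = value
    return fields


# Referrers from the two log lines in the question (the second one shortened).
first_referrer = '"https://try.com/search/?contains=pflichtteilauskunft"'
second_referrer = '"https://try.com/search/?on_sale=y&price_max=3000&sale_type=1,3&extension=com"'

print(split_referrer_args(first_referrer))   # {'contains': 'pflichtteilauskunft'}
print(split_referrer_args(second_referrer))
```

Each query parameter becomes its own field, so the same configuration covers both kinds of lines without listing parameter names in the pattern.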

If you do not want all the extra fields that COMBINEDAPACHELOG creates then you could modify the stock patterns and replace the first grok with

    grok {
        pattern_definitions => {
            CUSTOM_COMMONLOG => '%{IPORHOST} %{HTTPDUSER} %{HTTPDUSER} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE}(?: HTTP/%{NUMBER})?|%{DATA})" %{NUMBER} (?:%{NUMBER}|-)'
            CUSTOM_COMBINEDLOG => '%{CUSTOM_COMMONLOG} %{QS:referrer} %{QS}'
        }
        match => { "message" => "%{CUSTOM_COMBINEDLOG}" }
    }
