Nested grok filter or filtering a field using grok


(Pavan Kumar E) #1

Hello,

So I have Apache access logs in the Common Log Format, in the form below:

8.6.7.9 - - [10/Sep/2017:05:17:11 +0000] "POST /integration/servletloadserviceupdate HTTP/1.0" 200 9157

So I have used the filter below:

filter {
  grok {
    match => { "message" => ["%{COMMONAPACHELOG}"] }
  }

  mutate {
    remove_field => [ "ident", "auth" ]
  }
}

There is a field called [apache2][access][url], which contains this:

/integration/clearcase/listpatchtobedone.jsp

I need to split this and create a new field named "application" with the value "clearcase" (taken from the URL above).

The value I want is the second-to-last element of the URL, using / as the delimiter.

So my question is: is this possible? Maybe with a second grok right below the first one that matches the URL field? If so, how exactly is it done?

I am using the complete Elastic Stack at v5.5.

Regards,
pavan


(Magnus Bäck) #2

Untested:

grok {
  match => ["[apache2][access][url]", "/(?<application>[^/]+)/[^/]+$"]
}
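
The (?<application>[^/]+) is an inline named capture, and the trailing /[^/]+$ anchors the match at the end of the string, so the capture lands on the second-to-last path segment ("clearcase" in your example). Chained below your existing grok it would look something like this (still untested, and it assumes [apache2][access][url] is populated by the time the filter runs):

filter {
  grok {
    match => { "message" => ["%{COMMONAPACHELOG}"] }
  }

  # Pull the second-to-last path segment of the URL into "application".
  grok {
    match => ["[apache2][access][url]", "/(?<application>[^/]+)/[^/]+$"]
  }

  mutate {
    remove_field => [ "ident", "auth" ]
  }
}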

(Pavan Kumar E) #3

Hi Magnus,

Thank you very much for your quick reply.

So now I'm trying to change @timestamp to the timestamp present in my log.

My log is as below:

0.0.0.0 - - [20/Sep/2017:06:10:57 +0000] "GET /integration/spin/SWAT/scenario/launchShoot.jsp HTTP/1.0" 200 16184

This is the document I see in Kibana:

{
  "_index": "test",
  "_type": "log",
  "_id": "someid123456-olK",
  "_version": 1,
  "_score": null,
  "_source": {
    "request": "/integration/spin/SWAT/scenario/launchShoot.jsp",
    "offset": 73886,
    "input_type": "log",
    "verb": "GET",
    "source": "\\\\localhost\\ETV_G\\logs\\wls\\intranet\\intranetNode03\\system\\access.log",
    "message": "0.0.0.0 - - [20/Sep/2017:06:10:57 +0000] \"GET /integration/spin/SWAT/scenario/launchShoot.jsp HTTP/1.0\" 200 16184 ",
    "type": "log",
    "tags": [
      "beats_input_codec_plain_applied",
      "_dateparsefailure"
    ],
    "@timestamp": "2017-09-20T06:20:24.283Z",
    "response": "200",
    "bytes": "16184",
    "clientip": "0.0.0.0",
    "@version": "1",
    "beat": {
      "hostname": "localhost",
      "name": "localhost",
      "version": "5.5.2"
    },
    "host": "localhost",
    "httpversion": "1.0",
    "timestamp": "20/Sep/2017:06:10:57 +0000"
  },
  "fields": {
    "@timestamp": [
      1505888424283
    ]
  },
  "sort": [
    1505888424283
  ]
}

I tried the configuration below, but @timestamp is still different from the log timestamp.

input {
  beats {
    port => 5044
    host => "0.0.0.0"
  }
}

# The filter part of this file is commented out to indicate that it is
# optional.
filter {
  grok {
    match => { "message" => ["%{COMMONAPACHELOG}"] }
    #remove_field => [ "message", "@timestamp" ]
  }

  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss +Z" ]
    target => [ "@timestamp" ]
  }

  #geoip {
  #  source => "clientip"
  #}

  mutate {
    remove_field => [ "ident", "auth" ]
  }
}

output {
  #stdout { codec => rubydebug }
  elasticsearch {
    hosts => "localhost:9200"
    action => "index"
    index => "test"
  }
}

(Magnus Bäck) #4

The Kibana result above: is that with the date filter applied?


(Pavan Kumar E) #5

Hi again Magnus,

Sorry for the late reply, and sorry for my mistake.

I have now edited the post above with the exact configuration my server has and the output it is giving me.

It looks like Logstash is reporting a date parse failure. I'm not sure what I'm doing wrong.

Regards,
Pavan


(Pavan Kumar E) #6

Hi Magnus,

I think it is loading correctly after all. I hadn't accounted for the time difference between the server and the machine where I was viewing Kibana in the browser.

However, I see a _dateparsefailure tag being added to the resulting JSON. Can you please help me debug that?

Regards,
Pavan
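
The _dateparsefailure most likely comes from the date pattern itself. The date filter uses Joda-Time syntax, in which Z on its own already matches a signed offset such as +0000, so the literal + in "dd/MMM/yyyy:HH:mm:ss +Z" prevents the timestamp from matching. A corrected date filter would be (untested sketch):

date {
  match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  target => "@timestamp"
}

Once the pattern matches, @timestamp should take its value from the parsed timestamp field and the _dateparsefailure tag should no longer appear.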


(system) #7

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.