Use filters to extract info from access logs

Hi,

I want to know if it's possible to have filters that will extract info such as status code, IP, website, and time from an access log.

And is it better to do this in the Logs app or in Discover?

Or should I configure this in the Filebeat config (filebeat.yml)?
Here is a sample:

192.168.10.182 - - - 12/Nov/2019:00:00:00 -0800 "GET /xxxx/xxxx/xxx/xx/xxxxx HTTP/1.1" 200 285 
GET /xxxx/xxxx/xxx/xx/xxxx HTTP/1.1
User-Agent: Java/0000
Connection: keep-alive
Host: xxxxx:00000
Accept: application/json
Content-Type: application/json

HTTP/1.1 200 OK
Date: Tue, 12 Nov 2019 08:00:00 GMT
Content-Type: application/json

Hi @Mehak_Bhargava,

It sounds more like a Filebeat/Ingest-related question than a Kibana one, so I'll transfer this topic to the appropriate hub. Having said that, it seems you'll need to use an ingest node pipeline to parse your data.
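For example, a minimal sketch of such a pipeline for your sample line could look like this (the pipeline name access-log-pipeline and the field names are my assumptions, not something Filebeat ships with):

PUT _ingest/pipeline/access-log-pipeline
{
  "description": "Sketch: parse custom access log lines",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": [
          "%{IPORHOST:source.address} - - - %{HTTPDATE:access.time} \"%{WORD:http.request.method} %{NOTSPACE:url.path} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:long} %{NUMBER:http.response.body.bytes:long}"
        ]
      }
    },
    {
      "date": {
        "field": "access.time",
        "target_field": "@timestamp",
        "formats": ["dd/MMM/yyyy:HH:mm:ss Z"]
      }
    }
  ]
}

Once a pipeline like that exists, you can have Filebeat route events through it by setting pipeline: "access-log-pipeline" under output.elasticsearch in filebeat.yml.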

Best,
Oleg

Hi @azasypkin,

Thanks for updating the forum. I see I have to use the GET pipeline API for my task here. But I am not sure where this pipeline.json file is supposed to be stored. There is already a pipeline.json file in filebeat/module/apache/error/ingest.

https://www.elastic.co/guide/en/elasticsearch/reference/7.4/get-pipeline-api.html
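From that page, I understand I can check what is already loaded with something like this (I am guessing at the exact pipeline ID; it seems to depend on the Filebeat version and module):

GET _ingest/pipeline
GET _ingest/pipeline/filebeat-7.4.0-apache-error-pipeline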

@azasypkin

Here is part of the pipeline.json file already in the Filebeat folder:

{
    "description": "Pipeline for parsing apache error logs",
    "processors": [
        {
            "grok": {
                "field": "message",
                "patterns": [
                    "\\[%{APACHE_TIME:apache.error.timestamp}\\] \\[%{LOGLEVEL:log.level}\\]( \\[client %{IPORHOST:source.address}(:%{POSINT:source.port})?\\])? %{GREEDYDATA:message}",
                    "\\[%{APACHE_TIME:apache.error.timestamp}\\] \\[%{DATA:apache.error.module}:%{LOGLEVEL:log.level}\\] \\[pid %{NUMBER:process.pid:long}(:tid %{NUMBER:process.thread.id:long})?\\]( \\[client %{IPORHOST:source.address}(:%{POSINT:source.port})?\\])? %{GREEDYDATA:message}"
                ],
                "pattern_definitions": {
                    "APACHE_TIME": "%{DAY} %{MONTH} %{MONTHDAY} %{TIME} %{YEAR}"
                },
                "ignore_missing": true
            }
        },
        {
            "date": {
                "field": "apache.error.timestamp",
                "target_field": "@timestamp",
                "formats": [
                    "EEE MMM dd H:m:s yyyy",
                    "EEE MMM dd H:m:s.SSSSSS yyyy"
                ],
                "ignore_failure": true
            }

And I want to create a file that would look like this:

{
  "pipeline" : {
    "processors" : [
      {
        "grok" : {
          "field" : "message",
          "patterns" : ["%{COMMONAPACHELOG}"]
        }
      },
      {
        "date" : {
          "field" : "timestamp",
          "target_field" : "@timestamp",
          "formats" : ["dd/MMM/yyyy:HH:mm:ss Z"]
        }
      },
      {
        "remove" : {
          "field" : "message"
        }
      }
    ]
  },
  "docs" : [
    {
      "_source" : {
        "message" : "192.168.10.182 - - - 14/Nov/2019:00:00:05 -0800 \"GET /stats-data-server/statsdataserver/kpi/db/storage HTTP/1.1\" 200 285"
      }
    }
  ]
}

Where should I create and store this JSON file? Should I delete or reformat the original one?

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.