Is it possible to index only the matched log lines of grok in Logstash?


(Kulasangar Gowrisangar) #1

I have a log file that contains both INFO and ERROR entries, and I'm trying to match only the INFO lines I need using the grok filter. This is what a few of the log lines from the file look like.

And this is what my grok filter looks like in my Logstash conf:

grok {
	patterns_dir => ["D:/elk_stack_for_ideabiz/elk_from_chamith/ELK_stack/logstash-5.1.1/bin/patterns"]
	match => {
		"message" => [
			"^TID\: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}]%{REQUIREDDATAFORAPP:app_message}",
			"^TID\: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}]%{REQUIREDDATAFORRESPONSESTATUS:response_message}"
		]
	}
}

The patterns seem to be working fine. I can provide them if required.

I've got two questions. First, I want only the lines matched by grok to be sent to the index, and to prevent Logstash from indexing the non-matching ones. Second, I want to prevent Logstash from including the message field in every single ES record.

I tried adding overwrite under the match, but still no luck:

overwrite => [ "message" ]

All in all, what I need to see in my index are the messages (app_message and response_message from the above match) for lines that match one of those two expressions. Whereas right now, all the lines are getting indexed.

Is it possible to do something like this? Or does Logstash index all of them by default?

Where am I going wrong? Any help would be appreciated.


(Magnus Bäck) #2

Add remove_field => ["message"] to your grok filter to remove the message field if the filter is successful.

To drop events where grok failed:

if "_grokparsefailure" in [tags] {
  drop { }
}
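
Put together, the two suggestions might look like the sketch below. It assumes the custom patterns LOGTIMESTAMPTWO, REQUIREDDATAFORAPP and REQUIREDDATAFORRESPONSESTATUS are defined in a patterns directory; the path shown is a placeholder, not the real one.

```
filter {
  grok {
    # Placeholder path (assumption): point this at the directory holding the custom patterns.
    patterns_dir => ["./patterns"]
    match => {
      "message" => [
        "^TID: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}\]%{REQUIREDDATAFORAPP:app_message}",
        "^TID: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}\]%{REQUIREDDATAFORRESPONSESTATUS:response_message}"
      ]
    }
    # remove_field is only applied when one of the expressions matches.
    remove_field => ["message"]
  }

  # grok adds the _grokparsefailure tag (its default tag_on_failure value)
  # to events that none of the expressions match.
  if "_grokparsefailure" in [tags] {
    drop { }
  }
}
```

Note that the drop only helps if grok actually fails on the unwanted lines; an expression loose enough to match them will let them through.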

(Kulasangar Gowrisangar) #3

Thanks @magnusbaeck for the response :)

Well, removing the message field did work!

But I'm still getting the unwanted lines from the log. Let me post the pattern I'm using for each line:

The log lines and patterns respectively:

TID: [0] [AM] [2016-12-24 23:59:59,593] INFO {org.apache.synapse.mediators.builtin.LogMediator} - API Request URL = /subscription/v3/subscribe, Request ID = urn:uuid:70938535 {org.apache.synapse.mediators.builtin.LogMediator}

Pattern for the above:
REQUIREDDATAFORAPP (^.*API Request URL.*$) |[^\/]*\/([^\/]*)*\/[^\/]*\/

TID: [0] [AM] [2016-12-24 23:59:59,213] INFO {org.apache.synapse.mediators.builtin.LogMediator} - API Response Status = 200, Request ID = urn:uuid:a83a1760, Response Time(ms) = 436.0 {org.apache.synapse.mediators.builtin.LogMediator}

Pattern for the above:

REQUIREDDATAFORRESPONSESTATUS (^.*API Response Status.*$) |.+?(?=\s+\S*$)

The log line which shouldn't get matched or indexed:

TID: [0] [AM] [2016-12-24 23:59:59,593] INFO {org.wso2.carbon.apimgt.axiata.dialog.verifier.DialogAPIRequestHandler} - [2016-12-24 23:59:59] >>>>> API Request id 1482604199593MI584 {org.wso2.carbon.apimgt.axiata.dialog.verifier.DialogAPIRequestHandler}

I don't know what I'm missing here. I only want to see the matched lines from the log in my index. Is there something wrong with the regex?

Thanks again.


(Magnus Bäck) #4

Does the message that slipped through have a _grokparsefailure tag? If not, the grok filter was successful. Is it app_message or response_message that's populated with data? That'll tell us which expression matched.
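
One way to check this, assuming a temporary extra output can be added to the pipeline, is to print each event with the rubydebug codec. The tags field and any app_message or response_message fields then show up on the console.

```
output {
  # Temporary debugging output: prints every event with all of its fields and tags.
  stdout { codec => rubydebug }
}
```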


(Kulasangar Gowrisangar) #5

Thanks @magnusbaeck. :)

No, there's no _grokparsefailure tag.

And it was neither app_message nor response_message that got populated.

The data ended up in the message field. So what I did was:

grok {
	patterns_dir => ["D:/elk_stack_for_ideabiz/elk_from_chamith/ELK_stack/logstash-5.1.1/bin/patterns"]
	match => [
		"message", "^TID\: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}\]%{REQUIREDDATAFORAPP:message}",
		"message", "^TID\: \[0\] \[AM\] \[%{LOGTIMESTAMPTWO:logtimestamp}\]%{REQUIREDDATAFORRESPONSESTATUS:message}"
	]
	#remove_field => [ "message" ]
	overwrite => ["message"]
}

I couldn't remove the message field since I was using it for filtering.

So I had to add an if condition for each kind of message that shouldn't be indexed, and drop those events.

if "<<<<< API Request" in [message] {
	drop { }
}

This approach does what I need, but it has become a pain, since I now have too many ifs trying to match and drop the unwanted lines.

Am I going wrong somewhere? Or how can I make it more efficient?

Thanks again :)
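
A possible way to avoid the growing list of conditionals, offered as a sketch rather than a tested fix: drop everything that does not contain one of the two wanted phrases, using a single regex conditional placed before grok. The two phrases are taken from the sample log lines earlier in the thread. Going by the patterns as posted, the second alternative in REQUIREDDATAFORRESPONSESTATUS (.+?(?=\s+\S*$)) appears to match almost any line, which would explain why the unwanted lines never get a _grokparsefailure tag; that reading is an assumption.

```
filter {
  # Keep only lines that contain one of the two wanted phrases;
  # everything else is dropped by this one conditional.
  if [message] !~ /API Request URL|API Response Status/ {
    drop { }
  }

  # The existing grok filter goes here, unchanged.
}
```

With this in place, the list of wanted phrases lives in one regex, so adding a new kind of line to keep means extending the alternation instead of adding another if.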


(Kulasangar Gowrisangar) #6

@magnusbaeck Is there any way I could get around this?

