Logstash and Grok Matching

Hi

My question is regarding Logstash and Grok.

If I have two filters in two different files

filter {
	if [message] =~ /Regex/ {
		grok {
			match => { "message" => "PATTERNA" }
		}
	}
}

filter {
	if [message] =~ /Regex/ {
		grok {
			match => { "message" => "PATTERNB" }
		}
	}
}

If a message matches the regex for both filter 1 and filter 2, but it fails to match the grok on PATTERNA, will it fail and exit there, or will it try PATTERNB in the second file?

Your question is convoluted and I don't believe I understand it properly.

Why not have one config file with two different regex patterns?
Why do you need two config files for ingesting the same data?

Well, in terms of performance, which is better?

A single config file with 100 grok filters to match against?

Or four files with 25 grok filters each?

Well, you can use the break_on_match option in grok to stop processing your regular expressions after the first full match: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
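For example, grok tries an array of patterns in order and, with break_on_match left at its default of true, stops at the first one that matches (the patterns here are just illustrative; TIMESTAMP_ISO8601, SYSLOGTIMESTAMP and GREEDYDATA are standard grok patterns):

```
filter {
	grok {
		# break_on_match => true is the default: grok stops
		# after the first pattern in the list that matches.
		break_on_match => true
		match => {
			"message" => [
				"^%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:msg}$",
				"^%{SYSLOGTIMESTAMP:ts} %{GREEDYDATA:msg}$"
			]
		}
	}
}
```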

If you do not have insane EPS, just do this with one config. Make sure that you have proper anchoring (^ and $) in your patterns for efficient matching, and sort your match expressions so that the most frequently matching ones are placed at the top.
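A sketch of what that ordering might look like (the formats and field names are made up; the point is the anchors and putting the most common format first):

```
filter {
	grok {
		match => {
			"message" => [
				# Most frequent format first; ^ and $ anchors let
				# grok reject non-matching lines quickly instead of
				# scanning for a match anywhere in the line.
				"^%{IP:client} %{WORD:method} %{URIPATH:path}$",
				"^%{SYSLOGLINE}$"
			]
		}
	}
}
```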

If you have an insane number of events per second, you will get better performance by distributing your input.

Hi,

Thanks for the response. We have high EPS, so I think splitting the filters, as we are currently doing, is the best approach anyway.

The difficulty arises when a regex matches two types of log message.

For example, a syslog regex could match against a firewall but also against an Ubuntu box.

Here we will need to split them into the two filters. I guess I just need to invest some time into working out some complicated logic.
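One sketch of that kind of split, using lightweight conditionals on the message content before the heavy grok work (the prefilter regexes and patterns are hypothetical; CISCOTAG and SYSLOGLINE are standard grok patterns):

```
filter {
	if [message] =~ /^%ASA-/ {
		# Looks like Cisco ASA firewall syslog
		grok {
			match => { "message" => "^%{CISCOTAG:tag}: %{GREEDYDATA:fw_msg}$" }
		}
	} else if [message] =~ /sshd|systemd/ {
		# Looks like host (e.g. Ubuntu) syslog
		grok {
			match => { "message" => "^%{SYSLOGLINE}$" }
		}
	}
}
```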

Regards,

Jason

You can try the following pipeline-to-pipeline idea, then:

input {
	file {
	        path => "/your/file/log1.log"
	        start_position => "beginning"
	        codec => json
	}
}

filter {
	grok {
		# Do some quick regex prefiltering here to detect what is
		# coming from where, then distribute in the output below.
		match => { "message" => "LIGHT_PREFILTER_PATTERN" }
	}
}

output {
	if [log1_field]{
		pipeline { 
			id => "YOUR_LOG1_PROCESSING_PIPELINE"
			send_to => LOG1_PROCESSING
		}
	} else if [log2_field]{
		pipeline {
			id => "YOUR_LOG2_PROCESSING_PIPELINE"
			send_to => LOG2_PROCESSING
		}
	} else {
		pipeline {
			id => "YOUR_LOG3_PROCESSING_PIPELINE"
			send_to => LOG3_PROCESSING
		}
	}
}

You do the very light regex expressions in the distributor and your heavy processing in separate pipelines. I do not know any other approach besides this, other than running Kafka and multiple Logstash nodes reading from the same topics to do the load balancing.

@pastechecker

Where did you place that configuration?

In the pipelines.yml file?

No.
pipelines.yml is the file that tells Logstash where to load your configs from, how many workers to use, batch sizes, etc.
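For illustration, an entry in pipelines.yml can also carry those tuning settings alongside the config path (the values below are made up; pipeline.workers and pipeline.batch.size are real Logstash settings):

```
- pipeline.id: distributor
  path.config: "/etc/logstash/conf.d/distributor_pipeline.conf"
  pipeline.workers: 4
  pipeline.batch.size: 250
```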

In your example, you would need something similar to:

pipelines.yml:

- pipeline.id: distributor_pipeline.conf
  path.config: "/etc/logstash/conf.d/distributor_pipeline.conf"
- pipeline.id: file_processing_1.conf
  path.config: "/etc/logstash/conf.d/file_processing_1.conf"
- pipeline.id: file_processing_2.conf
  path.config: "/etc/logstash/conf.d/file_processing_2.conf"
- pipeline.id: file_processing_3.conf
  path.config: "/etc/logstash/conf.d/file_processing_3.conf"

Your distributor pipeline would be the config pasted above, saved as distributor_pipeline.conf.

The beginning of the file_processing_* configs would be:

input { 
	pipeline { 
		address => LOG1_PROCESSING 
	} 
}
filter {
#do your detailed stuff here
}

output {
#output wherever you want
}

Is it clearer now?

Ignore.

Look above.
Also a good reading: https://www.elastic.co/guide/en/logstash/current/pipeline-to-pipeline.html

@pastechecker

Sorry. Thanks, that makes much more sense.

I'll give it a go :slight_smile:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.