Number of grok filters per logstash.conf file

Hello everybody. I am trying to parse logs using multiple grok filters. I have noticed that I can use 8 filters so far, each one with its own pattern, like this:

 grok{
		tag_on_failure => []
		pattern_definitions => {"text2" => "([,][ ][a-zA-z ?]+)"}
		match => {"message" => "%{text2:response2}"}
	  }

However, when I add another one to the same logstash.conf file, I get "logstash exited with code 0".

Is there a limitation on the number of grok filters you can use?

I don't think there is a limitation. What is in the Logstash log when it exits? It could be something wrong in the configuration.

Can you share your full logstash config that is not working?

Wow, that was fast! Thank you very much for taking an interest in my question.

This is what I'm seeing in the logs (I am using docker-compose to start the ELK stack on my machine):

logstash         | [2020-11-10T13:10:38,859][INFO ][logstash.outputs.elasticsearchmonitoring][.monitoring-logstash] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearchMonitoring", :hosts=>["http://elasticsearch:9200"]}
logstash         | [2020-11-10T13:10:38,881][WARN ][logstash.javapipeline    ][.monitoring-logstash] 'pipeline.ordered' is enabled and is likely less efficient, consider disabling if preserving event order is not necessary
logstash         | [2020-11-10T13:10:39,116][INFO ][logstash.javapipeline    ][.monitoring-logstash] Starting pipeline {:pipeline_id=>".monitoring-logstash", "pipeline.workers"=>1, "pipeline.batch.size"=>2, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>2, "pipeline.sources"=>["monitoring pipeline"], :thread=>"#<Thread:0x4ff749e7 run>"}
logstash         | [2020-11-10T13:10:40,256][INFO ][logstash.javapipeline    ][.monitoring-logstash] Pipeline Java execution initialization time {"seconds"=>1.14}
logstash         | [2020-11-10T13:10:40,309][INFO ][logstash.javapipeline    ][.monitoring-logstash] Pipeline started {"pipeline.id"=>".monitoring-logstash"}
logstash         | [2020-11-10T13:10:40,718][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
logstash         | [2020-11-10T13:10:42,588][INFO ][logstash.javapipeline    ] Pipeline terminated {"pipeline.id"=>".monitoring-logstash"}
logstash         | [2020-11-10T13:10:42,640][INFO ][logstash.runner          ] Logstash shut down.

There is not enough information in these logs, but it seems that Logstash is not starting any pipeline other than the monitoring pipeline.

Share your configs, your pipelines, your logstash config and your docker-compose config.

Sure thing. This is logstash.conf:

input {
  tcp {
      port => 4560
      codec => json_lines
  }
}

filter {
    
	  grok {
		tag_on_failure => []
		pattern_definitions => {"event" => "(?<=Resolving event: )(.*)"}
		match => { "message" => "%{event:user_event}" }
	  }	
	  
	  grok {
		tag_on_failure => []
		pattern_definitions => {"query" => "(?<=Resolving query: )(.*)"}
		match => { "message" => "%{query:user_query}" }
	  }	
	  
	  grok {
		tag_on_failure => []
		pattern_definitions => {"session" => "(?<=Session: ).+?(?= -)"}
		match => { "message" => "%{session:session_id}"}
	  }
	  
	  grok {
		tag_on_failure => []
		pattern_definitions => {"requestId" => "(?<=RequestId: ).+?(?= -)"}
		match => { "message" => "%{requestId:request_id} " }
	  }
	  
	  grok {
		tag_on_failure => ["no full response"]
		pattern_definitions => {"full_response" => "(?<=responses=)(.*)(?=extraData)"}
		match => { "message" => "%{full_response:full_response} " }
	  }	
	  grok{
		tag_on_failure => []
		pattern_definitions => {"intent" => "(?<=intent=)(.+?)(?=,)|(?<=intent: )(.*)"}
		match => {"message" => "%{intent:intent}"}
	  }
	  grok{
		tag_on_failure => []
		pattern_definitions => {"text1" => "((?<=responses=)[\w ?]+(?=,))"}
		match => {"message" => "%{text1:response1}"}
	  }
	  grok{
		tag_on_failure => []
		pattern_definitions => {"text2" => "([,][ ][a-zA-z ?]+)"}
		match => {"message" => "%{text2:response2}"}
	  }
	  grok{
	       tag_on_failure = []
	       pattern_definitions => {"text3" => "(?<="title" :)(.+?)(?=,)"}
               match => {"message" => "%{text3:response3}"} 	  	
	  }

	  if("" in [response1]){
		if("" in [response2]){
			 mutate{
				add_field => {
				      "response_text" => "%{response1} %{response2}"
				}
				remove_field => ["response1", "response2"]
	  		}
			
		} else {
			mutate{
				add_field => {
				      "response_text" => "%{response1}"
				}
				remove_field => ["response1", "response2"]
	  		}
		}
	 }
	  
    }

output {
  elasticsearch {
    hosts => "http://elasticsearch:9200"
	}
}

OK, I found a typo in the last grok section: I was missing the > in tag_on_failure => []. However, I am still getting an error message:

logstash         | [2020-11-10T13:38:33,812][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"{\", \"}\" at line 56, column 51 (byte 1534) after filter {\r\n    \r\n\t  grok {\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"event\" => \"(?<=Resolving event: )(.*)\"}\r\n\t\tmatch => { \"message\" => \"%{event:user_event}\" }\r\n\t  }\t\r\n\t  \r\n\t  grok {\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"query\" => \"(?<=Resolving query: )(.*)\"}\r\n\t\tmatch => { \"message\" => \"%{query:user_query}\" }\r\n\t  }\t\r\n\t  \r\n\t  grok {\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"session\" => \"(?<=Session: ).+?(?= -)\"}\r\n\t\tmatch => { \"message\" => \"%{session:session_id}\"}\r\n\t  }\r\n\t  \r\n\t  grok {\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"requestId\" => \"(?<=RequestId: ).+?(?= -)\"}\r\n\t\tmatch => { \"message\" => \"%{requestId:request_id} \" }\r\n\t  }\r\n\t  \r\n\t  grok {\r\n\t\ttag_on_failure => [\"no full response\"]\r\n\t\tpattern_definitions => {\"full_response\" => \"(?<=responses=)(.*)(?=extraData)\"}\r\n\t\tmatch => { \"message\" => \"%{full_response:full_response} \" }\r\n\t  }\t\r\n\t  grok{\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"intent\" => \"(?<=intent=)(.+?)(?=,)|(?<=intent: )(.*)\"}\r\n\t\tmatch => {\"message\" => \"%{intent:intent}\"}\r\n\t  }\r\n\t  grok{\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"text1\" => \"((?<=responses=)[\\w ?]+(?=,))\"}\r\n\t\tmatch => {\"message\" => \"%{text1:response1}\"}\r\n\t  }\r\n\t  grok{\r\n\t\ttag_on_failure => []\r\n\t\tpattern_definitions => {\"text2\" => \"([,][ ][a-zA-z ?]+)\"}\r\n\t\tmatch => {\"message\" => \"%{text2:response2}\"}\r\n\t  }\r\n\t  grok{\r\n\t       tag_on_failure => []\r\n\t       pattern_definitions => {\"text3\" => 
\"((?<=\"", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:32:in `compile_imperative'", "org/logstash/execution/AbstractPipelineExt.java:183:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:69:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:44:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:357:in `block in converge_state'"]}

But line 56 is just:

 pattern_definitions => {"text3" => "((?<="title" :)(.+?)(?=,))"}

Hello, it looks like the double quote character has to be escaped if you want to use it inside a regex in logstash.conf:

pattern_definitions => {"text3" => "(?<=title\" :)(.+?)(?=,)"}

If you write the pattern inside the Logstash pipeline config, you need to escape the double quotes. If you put your patterns in an external file and use patterns_dir instead of pattern_definitions, I think there is no need to escape them, but you would need to test to confirm.
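Another option that might be worth trying, though I have not tested it against your setup: Logstash config strings can be written with single quotes as well as double quotes, so wrapping the regex in single quotes should let you keep the literal double quotes without a backslash. A sketch, reusing your text3 pattern with both quotes around title as in your original:

```
grok {
  tag_on_failure => []
  # Single-quoted value, so the double quotes inside the regex are plain characters.
  # Assumption: untested here; confirm that the field is extracted as expected.
  pattern_definitions => { "text3" => '(?<="title" :)(.+?)(?=,)' }
  match => { "message" => "%{text3:response3}" }
}
```

If the config still fails to parse, the reported line and column should again point at the character the parser could not handle.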

Looking at your pipeline I would suggest that you put all your custom patterns in a file inside a directory and use this directory in patterns_dir.

You would have a file with your patterns, like this:

event (?<=Resolving event: )(.*)
query (?<=Resolving query: )(.*)
session (?<=Session: ).+?(?= -)
requestId (?<=RequestId: ).+?(?= -)
full_response (?<=responses=)(.*)(?=extraData)
intent (?<=intent=)(.+?)(?=,)|(?<=intent: )(.*)
text1  ((?<=responses=)[\w ?]+(?=,))
text2  ([,][ ][a-zA-z ?]+)
text3  (?<=title" :)(.+?)(?=,)

And use patterns_dir in your config:

patterns_dir => ["/dir/with/the/patterns/file"]
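Put together, one of your grok blocks might then look like the sketch below. The directory is the same placeholder path as above, and I am assuming the file inside it contains the text3 line from the patterns list; adjust both to your setup. Since you are running under docker-compose, the directory also has to be mounted into the logstash container so that the path exists there.

```
filter {
  grok {
    # Placeholder path: directory containing the custom patterns file
    patterns_dir => ["/dir/with/the/patterns/file"]
    tag_on_failure => []
    # text3 now comes from the patterns file, so no pattern_definitions is needed
    match => { "message" => "%{text3:response3}" }
  }
}
```

The other grok blocks would follow the same shape, each keeping its own match and tag_on_failure settings.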
