I am not able to stop Logstash from processing further events once the grok pattern has matched. I just want to index the first occurrence in the entire file. Here is what I have:
File content
first line content...
second line content...
third line should match EXTRACT_THIS
fourth line content...
fifth line should not match EXTRACT_THIS
break_on_match controls the behaviour when you are matching a field against an array of patterns. If it is set to true (the default), grok stops at the first pattern in the array that matches and ignores the subsequent patterns. If it is set to false, grok tries every pattern in the array. You only have a single pattern, so it has no effect.
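To make the difference concrete, here is a rough model of that behaviour in plain Ruby. It is only a sketch of the semantics, not Logstash's actual implementation; the helper name apply_patterns and the two sample patterns are made up for illustration.

```ruby
# Plain-Ruby model of break_on_match (hypothetical helper, not Logstash code).
# Tries each pattern in order and collects the named captures.
def apply_patterns(message, patterns, break_on_match: true)
  captures = {}
  patterns.each do |pattern|
    m = pattern.match(message)
    next unless m                      # this pattern did not match, try the next one
    captures.merge!(m.named_captures)  # keep the fields this pattern extracted
    break if break_on_match            # stop at the first successful pattern
  end
  captures
end

patterns = [/(?<word>[a-z]+)/, /(?<number>\d+)/]

p apply_patterns("abc 123", patterns)                        # only "word" is set
p apply_patterns("abc 123", patterns, break_on_match: false) # "word" and "number" are set
```

With a single pattern in the array both calls return the same thing, which is why the option makes no difference in the original configuration.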
I would suggest consuming the entire file as a single event. Then run grok against that, or you might need to use ruby to scan the [message] field and pick out the first match from the array of matches that scan returns.
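As a minimal sketch of the scan idea, assuming the whole file has already arrived as one string: the message variable below stands in for the [message] field, and first_matching_line is a hypothetical helper name. Inside a Logstash ruby filter you would read the field with event.get("message") and store the result with event.set instead.

```ruby
# Stand-in for the [message] field when the entire file is a single event.
message = <<~TEXT
  first line content...
  second line content...
  third line should match EXTRACT_THIS
  fourth line content...
  fifth line should not match EXTRACT_THIS
TEXT

# Return the first line that contains the marker, or nil if none does.
def first_matching_line(text, marker)
  # Without capture groups, scan returns an array of matched strings.
  # ^ and $ anchor at line boundaries in Ruby, and . does not cross newlines.
  matches = text.scan(/^.*#{Regexp.escape(marker)}.*$/)
  matches.first
end

puts first_matching_line(message, "EXTRACT_THIS")
```

Every later line that also contains the marker is still in the array that scan returns; it is simply never picked.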
When you talk about an array of patterns, do you mean something like this?
filter {
  grok {
    break_on_match => false
    match => { message => "(?<extraction1>(?<=EXTRACT_THIS).*)" }
    match => { message => "(?<extraction2>(?<=EXTRACT_THIS).*)" }
    match => { message => "(?<extraction3>(?<=EXTRACT_THIS).*)" }
  }
}
So the message field is matched against several patterns, and if the first pattern matches and break_on_match => true, the other two are not processed. Do I understand that correctly?
So you suggest loading the entire file into "message" and matching different parts of it with several grok patterns, something like this:
Thank you Bagder. My problem now is that sometimes I need to match the first occurrence and other times the second or third occurrence. Is it possible to achieve this using the array of grok patterns?
first line content...
second line content...
EXTRACT_THIS 1234
fourth line content...
EXTRACT_THIS 5678
more content...
and more and more...
EXTRACT_THIS 456
This code extracts the first occurrence. How can I extract the second one? Or, when there are many occurrences, how can I select the one I want? I want to capture the values, and the values change in every file that enters the repository (a folder in this case).
scan finds the three occurrences of Foo: followed by a number. For each occurrence it returns an array of capture groups (the parenthesised parts of the regexp). In my case there is only one capture group for each occurrence.
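Applied to the EXTRACT_THIS sample from the question, the same idea might look like the sketch below. The helper name nth_value and the assumption that the value is a run of digits after "EXTRACT_THIS " are mine; adjust the regexp to the real data.

```ruby
# Stand-in for the [message] field holding the whole file.
content = <<~TEXT
  first line content...
  second line content...
  EXTRACT_THIS 1234
  fourth line content...
  EXTRACT_THIS 5678
  more content...
  and more and more...
  EXTRACT_THIS 456
TEXT

# Return the value captured by the n-th occurrence (1-based), or nil.
def nth_value(text, n)
  return nil if n < 1
  # With one capture group, scan returns one single-element array per occurrence.
  occurrences = text.scan(/EXTRACT_THIS (\d+)/)
  match = occurrences[n - 1]
  match && match.first
end

puts nth_value(content, 1)  # first occurrence
puts nth_value(content, 2)  # second occurrence
puts nth_value(content, 3)  # third occurrence
```

Because scan collects every occurrence up front, choosing the first, second or third one is just a matter of indexing into the result, and asking for an occurrence that does not exist gives nil.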