Hello everyone,
I have to handle big log files (around 50k lines each) with ELK and want to extract some information from them.
Long story short: I want to use filters to search for specific pieces of information, store them, and drop the rest. For that I use a Logstash configuration like this:
input {
  file {
    type => "test"
    path => "/usr/share/logstash/input/*/log"
    start_position => "beginning"
    # join all lines up to the "Finished: ..." line into a single event
    codec => multiline {
      pattern => "Finished: (SUCCESS|FAILURE)"
      negate => "true"
      what => "next"
      max_lines => 16000
      max_bytes => "30MiB"
    }
  }
}

filter {
  grok {
    patterns_dir => ["/usr/share/logstash/patterns"]
    match => { "message" => "Finished: %{WORD:build_state}" }
    match => { "message" => ".*checkout in %{NODE_NAME:nodes}.*" }
  }
  # once the fields are extracted, the raw multiline message is no longer needed
  mutate {
    remove_field => ["message"]
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch"
  }
}
The idea is: every file ends with either "Finished: SUCCESS" or "Finished: FAILURE", so I append every preceding line to one multiline event. Afterwards I match some pieces of information such as the id, the cluster node, and so on. This all works fine for small files: the configuration above works perfectly for files with fewer than 16k lines. However, when I raise max_lines to 32k, for example, the pattern ".*checkout in %{NODE_NAME:nodes}.*" no longer matches.
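To illustrate the structure, here is a schematic, hand-written excerpt (not a real log, and the node name is made up; the actual files run to roughly 50k lines):

  ... lots of build output ...
  ... checkout in node-42 ...
  ... more build output ...
  Finished: SUCCESS

With negate => "true" and what => "next", every line that does not match the "Finished:" pattern is attached to the lines that follow, so the whole file ends up as one event that is closed by the final "Finished:" line.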
What happens there, and how can I fix that issue? Is it also possible to filter the input along the lines of "only read lines matching ..." to reduce the amount of data? Something like the sketch below is what I have in mind.
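Just a rough sketch of the idea (the drop filter and the regex conditional do exist in Logstash, but I am not sure this even makes sense once the multiline codec has already merged a whole file into one event):

filter {
  # keep only events that contain one of the interesting markers, drop everything else
  if [message] !~ /Finished: (SUCCESS|FAILURE)/ and [message] !~ /checkout in/ {
    drop { }
  }
}

This would presumably only help if the events were per line, which conflicts with the multiline approach above, so maybe there is a better way to cut the data down before it reaches Elasticsearch?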
Kind regards
Philip