I'm parsing an IRC log and trying to set the event type based on which kind of log entry a given line is.
Currently I'm using N different match patterns in a single grok filter, since a given line should match at most one of them, and then setting the type with conditional mutates afterwards. This feels like I'm doing something wrong.
I could break them up into N separate grok filters, each setting the type, but then valid lines end up tagged as failures, since every line fails to match all but one of the expressions. I could override each filter so it does not set the failure tag, but it feels like I'm missing something going down this path.
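To make that concrete, here is a minimal sketch of the split-up variant, shortened to the chat message and action cases. It relies on grok's `add_field` only being applied when the match succeeds, and on `tag_on_failure => []` to suppress the failure tag. The `if ![type]` check at the end is my assumption for how unmatched lines would have to be dropped once the tag is gone.

```
filter {
  # normal chat message entry: the type is only added if this pattern matches
  grok {
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}<[@ +*]%{SPACE}(?<user>[^>]+)>%{SPACE}(?<messageText>.*)" }
    add_field => { "type" => "chatMessage" }
    # do not tag lines that simply belong to one of the other entry kinds
    tag_on_failure => []
  }
  # user did /me action
  grok {
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}\* (?<user>[^ ]+)%{SPACE}(?<action>.*)" }
    add_field => { "type" => "action" }
    tag_on_failure => []
  }
  # no grok matched (for example the "--- Log opened" lines), so no type was set
  if ![type] {
    drop { }
  }
}
```

With the failure tag suppressed everywhere, the `_grokparsefailure` check can no longer be used to drop unparseable lines, which is part of why this path feels off to me.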
I've also tried adding an add_field after each match inside the single grok filter, but then every add_field is applied whenever the filter matches, instead of just the one following the pattern that actually matched.
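Roughly, that attempt has the shape sketched below, shortened to two of the patterns. Treat it as an approximation: the `type` values are the ones I want to end up with, and the layout is illustrative rather than a copy of the exact file.

```
filter {
  grok {
    # normal chat message entry
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}<[@ +*]%{SPACE}(?<user>[^>]+)>%{SPACE}(?<messageText>.*)" }
    add_field => { "type" => "chatMessage" }
    # user did /me action
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}\* (?<user>[^ ]+)%{SPACE}(?<action>.*)" }
    add_field => { "type" => "action" }
    # add_field belongs to the grok filter as a whole, not to the match
    # written above it, so every add_field here is applied on any match
  }
}
```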
Version:
iMac:logstash james$ logstash --version
logstash 2.4.0
Test input:
iMac:logstash james$ cat test.log
--- Log opened Tue Sep 06 00:00:04 2016
--- Day changed Tue Sep 06 2016
00:00:04+0200 <+User1> Some message here
00:13:52+0200 -!- User2 [User2@some.site] has joined #somechannel
00:05:33+0200 -!- User3 [User3@some.other.site] has left #somechannel []
08:46:06+0200 * User4 does some action
Test configuration:
iMac:logstash james$ cat test.conf
input {
  stdin { }
}
filter {
  grok {
    # normal chat message entry
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}<[@ +*]%{SPACE}(?<user>[^>]+)>%{SPACE}(?<messageText>.*)" }
    # user did /me action
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}\* (?<user>[^ ]+)%{SPACE}(?<action>.*)" }
    # user joined entry
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}-!-%{SPACE}(?<user>[^ ]+) \[.*\] has joined .*" }
    # user left entry
    match => { "message" => "%{HOUR}:%{MINUTE}:%{SECOND}%{ISO8601_TIMEZONE}%{SPACE}-!-%{SPACE}(?<user>[^ ]+) \[.*\] has left .*" }
  }
  if [messageText] =~ "." {
    mutate { replace => { "type" => "chatMessage" } }
  } else if [action] =~ "." {
    mutate { replace => { "type" => "action" } }
  } else if [message] =~ "has joined" {
    mutate { replace => { "type" => "joined" } }
  } else if [message] =~ "has left" {
    mutate { replace => { "type" => "left" } }
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
}
output {
  stdout { codec => rubydebug }
}
Running it shows the output is correct, but that chain of conditional mutate calls makes me feel like I'm missing the Right Way(tm) to do this.
iMac:logstash james$ logstash -f test.conf < test.log
Settings: Default pipeline workers: 4
Pipeline main started
{
    "message" => "00:00:04+0200 <+User1> Some message here",
    "@version" => "1",
    "@timestamp" => "2016-09-13T15:06:31.485Z",
    "host" => "iMac.local",
    "user" => "User1",
    "messageText" => "Some message here",
    "type" => "chatMessage"
}
{
    "message" => "00:13:52+0200 -!- User2 [User2@some.site] has joined #somechannel",
    "@version" => "1",
    "@timestamp" => "2016-09-13T15:06:31.485Z",
    "host" => "iMac.local",
    "user" => "User2",
    "type" => "joined"
}
{
    "message" => "00:05:33+0200 -!- User3 [User3@some.other.site] has left #somechannel []",
    "@version" => "1",
    "@timestamp" => "2016-09-13T15:06:31.485Z",
    "host" => "iMac.local",
    "user" => "User3",
    "type" => "left"
}
{
    "message" => "08:46:06+0200 * User4 does some action",
    "@version" => "1",
    "@timestamp" => "2016-09-13T15:06:31.485Z",
    "host" => "iMac.local",
    "user" => "User4",
    "action" => "does some action",
    "type" => "action"
}
Pipeline main has been shutdown
stopping pipeline {:id=>"main"}
iMac:logstash james$