Hello,
I'm working on Exim email logs and I'm stuck on the multiline handling.
Here is an example from the log:
2017-04-03 02:19:58 H=(localhost) [117.0.54.236] F=test@earatt.net rejected RCPT smityrd@example.com: Rejected message because 117.0.54.236 is in a black list at huzkzg6n5flrulopcolvmnfhty.zen.dq.spamhaus.net
2017-04-03 02:19:58 unexpected disconnection while reading SMTP command from (localhost) [117.0.54.236] (error: Connection reset by peer)
2017-04-03 02:19:58 dovecot_login authenticator failed for (ylmf-pc) [104.247.196.7]: 535 Incorrect authentication data (set_id=@example.com)
2017-04-03 02:19:58 no host name found for IP address 58.187.167.240
2017-04-03 02:19:58 1cuy9W-000F1j-JS DKIM: d=stratr.com s=k1 c=relaxed/relaxed a=rsa-sha1 b=1024 i=noreply@stratfor.com [verification succeeded]
2017-04-03 02:19:58 1cuy9W-000F1j-JS <= bounce-mc.us4_7958185.306033-test=example.com@mail10.4.rsgsv.net H=mail10.atl11.rsgsv.net [205.201.133.10] P=esmtp S=21651 id=7478.rsgsv.net T="How Japan Got Baseball"
2017-04-03 02:19:58 no host name found for IP address 46.29.251.135
2017-04-03 02:19:58 1cuy9W-000F1j-JS => test test@example.com R=mysql_user T=mysql_delivery
2017-04-03 02:19:58 1cuy9W-000F1j-JS Completed
I was trying to use the following Logstash config:
filter {
  if [type] == "eximlog" {
    mutate {
      add_field => {
        "message_1" => "%{message}"
      }
    }
    multiline {
      patterns_dir => "/etc/logstash/patterns/"
      pattern => "%{DATE} %{TIME} %{HOSTNAME:exim_msg_id} (=>|Completed)"
      negate => false
      what => "previous"
    }
    grok {
      patterns_dir => "/etc/logstash/patterns/"
      break_on_match => false
      match => [
        "message_1", "%{DATE} %{TIME} %{HOSTNAME:exim_msg_id} %{GREEDYDATA}"
      ]
      match => [ 'message', '%{EXIM_ALL_RULES}']
    }
    # remove fields
    mutate {
      remove_field => [ 'host', 'offset' ]
    }
    # Remove the really, really dirty hack to work around a bug in grok code
    # which won't handle multiple matches on the same field
    mutate {
      remove_field => [ "message_1"]
    }
  }
}
Each message can span three or more log lines, and the only values those lines have in common are the date and the message ID.
The problem I've run into is that log entries unrelated to the message still get included in the event. For example, in the log above, the entry "2017-04-03 02:19:58 no host name found for IP address 46.29.251.135" sits between the rows for 1cuy9W-000F1j-JS and gets merged in with them.
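To make the goal concrete, here is a rough Python sketch (outside Logstash, so not a fix for the config) of the grouping I'm after: lines that carry a message ID are collected under that ID, and everything else is kept separate. The message ID regex is an assumption based on the IDs in my sample log, and group_by_message_id is just a made-up helper name.

```python
import re
from collections import defaultdict

# Assumption: message IDs look like 1cuy9W-000F1j-JS (6-6-2 alphanumeric
# characters), which is what the sample log above shows.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<msg_id>[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) (?P<rest>.*)$"
)


def group_by_message_id(lines):
    """Split log lines into per-message groups and unrelated lines."""
    grouped = defaultdict(list)  # message ID -> lines carrying that ID
    unrelated = []               # lines with no message ID at all
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            grouped[match.group("msg_id")].append(line)
        else:
            unrelated.append(line)
    return dict(grouped), unrelated


# A few lines taken from the sample log above.
sample = [
    "2017-04-03 02:19:58 1cuy9W-000F1j-JS DKIM: d=stratr.com s=k1 c=relaxed/relaxed a=rsa-sha1 b=1024 i=noreply@stratfor.com [verification succeeded]",
    "2017-04-03 02:19:58 no host name found for IP address 46.29.251.135",
    "2017-04-03 02:19:58 1cuy9W-000F1j-JS => test test@example.com R=mysql_user T=mysql_delivery",
    "2017-04-03 02:19:58 1cuy9W-000F1j-JS Completed",
]

if __name__ == "__main__":
    grouped, unrelated = group_by_message_id(sample)
    for msg_id, msg_lines in grouped.items():
        print(msg_id, len(msg_lines), "lines")
    print("unrelated:", len(unrelated), "lines")
```

In other words, the "no host name found" line should end up as its own event, and only the three 1cuy9W-000F1j-JS lines should be merged together.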
Please advise.