Help to refine a grok parser

Hi all,

Logstash v2.2.2

I'm not that strong with grok and regexes, but I have managed to create a working grok parser as follows:

Log excerpt:
2016-02-25 19:06:23 msg 3/3 (4575 bytes) msgid 0000178e55c5dc45 from <observium-bounces@observium.org> delivered to MDA_external command procmail (), deleted

Grok parser (part of logstash filter):
%{TIMESTAMP_ISO8601:syslog_timestamp} msg {1,2}%{NUMBER:mess_num}\/%{NUMBER:mess_count} \(%{NUMBER:mess_bytes} bytes\) msgid %{MSGID:mess_msgid} from %{EMAILADDRESS:mess_from} %{GREEDYDATA:syslog_message}

Patterns used:
MSGID [a-zA-Z0-9_.+-=:]+
EMAILADDRESSPART [a-zA-Z0-9_.+-=:]+
EMAILADDRESS \<%{EMAILADDRESSPART:email_local}@%{EMAILADDRESSPART:email_remote}\>

If anybody has the time, I'd like to know whether and how this could be improved, just to aid my learning.

Many thanks,

--
Roland

Looks reasonable. A few comments:

  • I wouldn't consider the angle brackets to be part of the email address.
  • EMAILADDRESSPART will definitely work for most email addresses, but not all. How strict does this expression need to be, really? I would've just used something like this:
 ... from <(?<mess_from>[^>]+)>
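
For illustration, here is a rough sketch of how that might slot into the original filter. It assumes the log line arrives in the "message" field and that the custom MSGID pattern is kept in a patterns directory; the path shown is hypothetical, so adjust it to wherever your pattern file lives.

```
filter {
  grok {
    # Hypothetical location of the file that defines MSGID
    patterns_dir => ["/etc/logstash/patterns"]
    # Same expression as the original, except that the sender is captured
    # with an inline named group instead of the EMAILADDRESS pattern
    match => { "message" => "%{TIMESTAMP_ISO8601:syslog_timestamp} msg {1,2}%{NUMBER:mess_num}\/%{NUMBER:mess_count} \(%{NUMBER:mess_bytes} bytes\) msgid %{MSGID:mess_msgid} from <(?<mess_from>[^>]+)> %{GREEDYDATA:syslog_message}" }
  }
}
```

With this variant mess_from holds the address without the angle brackets, and the email_local and email_remote fields are no longer created, so the EMAILADDRESS and EMAILADDRESSPART patterns are only needed if you still want those parts split out.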

Thanks for the feedback Magnus.

I'll play around with your suggestions. As for the email address extraction, I'm not set on how it should be done; I copied something I saw elsewhere to get what I currently have.

My system is just a home one which I use for teaching myself stuff, so my needs change as I learn :-).

Thanks again.