I'm looking to grok mails that were not sent. I would like the entire file to be read as one log with 5 fields: From, To, Date, Subject, and the rest as Message.
Here is what it looks like:
From: TEXT
To: TEXT
Date: Thu, 30 Jul 2020 19:00:25 +0200
TEXT
TEXT
TEXT
TEXT
TEXT
Subject: TEXT
TEXT
TEXT
TEXT
etc...
How can I write my grok to get the From field, then the To field, then the Date field, put everything between Date and Subject into Message, then get the Subject field, and append everything after it to Message?
I'm using a pipeline that has 10 filters, selected by a field I add in filebeat.yml. So my first question is: can I use the multiline codec with a condition in the input of my pipeline?
Right now it looks like:
input {
  beats {
    port => 5047
    client_inactivity_timeout => 1200
  }
}
To use multiline only with this log_type, I don't know whether I can use \n or something else.
I've looked at a lot of examples and other posts on this website and elsewhere, but I haven't found the solution yet. I'm still testing filters in both Filebeat and Logstash.
Thanks in advance for your time and your help, have a great day,
OK, thanks for your answer @Badger, so I have to put multiline.pattern and the other settings into my filebeat.yml.
But I can't find any pattern matching what I need. I can match the lines containing "From:", "To:", "Subject:" and "Date:", but I don't know how to express "take the rest of the document into a Message field". The number of lines and the content will be different for every mail.
Moreover, the mail can be a reply to another mail, so it can contain:
From:
To:
Date:
TExt ....
Subject:
text
text
...
...
from:
to:
etc....
Those fields can even be at the same level of indentation, but I only want to get the "From:", "To:", "Subject:" and "Date:" fields at the top of the mail. I understand, and I'm not asking you to write the regex filter for me; I only want to know whether it's possible.
From:
To:
Date:
TExt ....
Subject:
text
text
From:
To:
Date:
TExt ....
Subject:
text
text
If the second From is sometimes a new mail message and sometimes a quoted message, then no, I see no way for Logstash or Filebeat to determine which it is.
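For the simpler case where each file holds exactly one mail and only the header block at the top matters, the extraction logic can be sketched as a single regex anchored at the start of the text. The sketch below is in Python purely to illustrate the idea; it is not a Logstash or Filebeat configuration. It assumes the whole file has already been joined into one string, that lines end with \n, and that the top headers always appear in the order From, To, Date, free text, Subject. The names `MAIL_RE`, `parse_mail`, and the output keys are made up for this example.

```python
import re

# Assumption: the whole file is one string and the top headers appear in the
# order From, To, Date, <free text>, Subject. \A anchors at the very start,
# so later (quoted) From:/To: lines are never treated as headers.
MAIL_RE = re.compile(
    r"\AFrom: (?P<sender>[^\n]*)\n"
    r"To: (?P<recipient>[^\n]*)\n"
    r"Date: (?P<date>[^\n]*)\n"
    r"(?P<before_subject>.*?)"          # lazy: stop at the first Subject line
    r"^Subject: (?P<subject>[^\n]*)\n?"
    r"(?P<after_subject>.*)\Z",         # greedy: everything that is left
    re.DOTALL | re.MULTILINE,
)


def parse_mail(text):
    """Return the five fields as a dict, or None if the top headers are missing."""
    match = MAIL_RE.match(text)
    if match is None:
        return None
    # Message = text between Date and Subject, plus everything after Subject.
    message = match.group("before_subject") + match.group("after_subject")
    return {
        "from": match.group("sender"),
        "to": match.group("recipient"),
        "date": match.group("date"),
        "subject": match.group("subject"),
        "message": message,
    }
```

Because only the first header block is captured, any later From:/To: lines simply end up inside the message text. A grok pattern built on the same idea would probably need the `(?m)` flag so that `.` can match newlines, and it only helps once the whole file arrives as a single event.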