Handling queue ids vs message ids

hey mates,

maybe some of you could help me with a problem i'm acutally trying to solve.

we're using logstash in combination with logstash-forwarder and elasticsearch to store the data.

we're trying to analyze our internal mail tracking. so we've got about
5 different kinds of log types and so we have 5 different queue ids
(each system gives the incoming mail a different id) for one message id
which is equal over the different systems. so e.g. we have 10 entries
with the queue id 1234 but only one entry with the queue id 1234 and the
message id xyz. but we need to combine / match / map every of these
entries with the correct message id over all of these different log
types.

our problem now is to match the data correct to our dashboard.

i would be deeply grateful if some of you guys could show us a solution to solve this problem

-cheers

What type of matching are you looking for, specifically? And by dashboard, are you talking about Kibana?

here are some examples of our logs we're looking for. some kinds of postfix stuff and some self built logs

first step of a mail (postfix server):

Jun 25 09:20:29 progov33 postfix/smtpd[18999]: D920718033: client=pcm20w8.procilon.local[10.3.45.20]
Jun 25 09:20:29 progov33 postfix/cleanup[19004]: D920718033: message-id=558BABBD.7010601@procilon.de
Jun 25 09:20:29 progov33 postfix/qmgr[3759]: D920718033: from=steffen.renker@procilon.de, size=28828, nrcpt=1 (queue active)
Jun 25 09:20:30 progov33 postfix/pipe[19005]: D920718033: to=steffen.renker@procilon.de, relay=julia-g, delay=0.13, delays=0.07/0.01/0/0.06, dsn=2.0.0, status=sent (delivered via julia-g service)
Jun 25 09:20:30 progov33 postfix/qmgr[3759]: D920718033: removed

second step (other server, self built logs)

[[25/06/2015 09:20:29] 8346eaf11444]: Input mail size is 28411 bytes.
[[25/06/2015 09:20:29] 8346eaf11444]: Envelope Sender is: steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Message-ID: 558BABBD.7010601@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Recipient steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Command line Recipients (for keysearch, only SMTP-Envelope):
[[25/06/2015 09:20:29] 8346eaf11444]: Recipient steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: All email addresses for keysearch (may include sender):
[[25/06/2015 09:20:29] 8346eaf11444]: Recipient steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Connected SMTP client (-client) -127.0.0.2-
[[25/06/2015 09:20:29] 8346eaf11444]: Host 127.0.0.2 is not allowed to use the mailoffice.
[[25/06/2015 09:20:29] 8346eaf11444]: K-FALL is not active.
[[25/06/2015 09:20:29] 8346eaf11444]: command is empty
[[25/06/2015 09:20:29] 8346eaf11444]: Using SINGLE SMIME engine.
[[25/06/2015 09:20:29] 8346eaf11444]: This message is not in S/MIME format
[[25/06/2015 09:20:29] 8346eaf11444]: This message is not in CMS format
[[25/06/2015 09:20:29] 8346eaf11444]: S/MIME not detected.
[[25/06/2015 09:20:29] 8346eaf11444]: This message is not enveloped PKCS#7, so i can't check the signature.
[[25/06/2015 09:20:29] 8346eaf11444]: S/MIME Encrypted : 0 , S/MIME Signed : 0
[[25/06/2015 09:20:29] 8346eaf11444]: no recipients in list.
[[25/06/2015 09:20:29] 8346eaf11444]: To line is: steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: CC line is: undefined
[[25/06/2015 09:20:29] 8346eaf11444]: CC header line not rewritten.
[[25/06/2015 09:20:29] 8346eaf11444]: Now pipe mail to virus scanner (nexthop system).
[[25/06/2015 09:20:29] 8346eaf11444]: Deliver Mail to mailer /opt/julia/bin/sendmail -i with From: steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Pipe Mail to sendmail with From: steffen.renker@procilon.de
[[25/06/2015 09:20:29] 8346eaf11444]: Full Mailer: /opt/julia/bin/sendmail -i -fsteffen.renker@procilon.de steffen.renker@procilon.de
[[25/06/2015 09:20:30] 8346eaf11444]: Mail successfully piped to virus scanner (nexthop system).

third step (back to the postfix server):

Jun 25 09:20:30 progov33 postfix/pickup[17460]: 00B6F18039: uid=1000 from=steffen.renker@procilon.de
Jun 25 09:20:30 progov33 postfix/cleanup[19004]: 00B6F18039: message-id=558BABBD.7010601@procilon.de
Jun 25 09:20:30 progov33 postfix/qmgr[3759]: 00B6F18039: from=steffen.renker@procilon.de, size=28950, nrcpt=1 (queue active)
Jun 25 09:20:30 progov33 postfix/smtp[19009]: 00B6F18039: to=steffen.renker@procilon.de, relay=10.3.140.141[10.3.140.141]:10025, delay=0.37, delays=0.04/0.01/0.04/0.28, dsn=2.0.0, status=sent (250 OK)
Jun 25 09:20:30 progov33 postfix/qmgr[3759]: 00B6F18039: removed

forth step (other server, self built logs):

2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] --> Eingehende Verbindung via SMTP von 10.3.140.141
2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 FROM: steffen.renker@procilon.de
2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 RCPT TO: steffen.renker@procilon.de
2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 SUBJECT: signiertes PDF (Name passt)
2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 MIMEID: 558BABBD.7010601@procilon.de
2015-06-25 09:20:30 INFO [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 SIZE: 28274
2015-06-25 09:20:30 DEBUG [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 Speichere Mail in der Datenbank
2015-06-25 09:20:30 DEBUG [SMTPProcessor-1] [SMTPProcessor ] F36E4567E7684C518CECFCECF08E3738 Nachricht mit id 4202 gespeichert
2015-06-25 09:20:30 DEBUG [SMTPProcessor-1] [SMTPMessage ] F36E4567E7684C518CECFCECF08E3738 Verifiziere Mail in der Datenbank
2015-06-25 09:20:30 DEBUG [SMTPProcessor-1] [QueueUtils ] sendMessageDatasToQueue F36E4567E7684C518CECFCECF08E3738 >>>
2015-06-25 09:20:30 DEBUG [SMTPProcessor-1] [QueueUtils ] <<< sendMessageDatasToQueue

as you can see, there is a unique message id 558BABBD.7010601@procilon.de over all these different types, but i need to visualize all the data matching to this is, so i have to look up for all the different queue ids D920718033, 8346eaf11444, 00B6F18039, F36E4567E7684C518CECFCECF08E3738.

we're using a self built dashboard which is kinda like kibana.

any ideas? :smile:

do you see a possibility to solve this problem with logstash or with our requests for elasticsearch? my thought is to search for all message ids and note the queue id and with a second request print all the matching queue ids

my thought now is to solve this problem with the multiline filter, but i have no clue how to code it :frowning:

Multiline codec? But I don't see any multiline messages in the samples you've posted.

you're right, there are no multiline messages in ours logs, but my thougt was, to check all lines for a unique queueid and add a field to each line where no message-id field is given. to visualize our mailtracking, we must have the message-id for each matching entry where the unique queue-id is. do you see any solution to solve this in logstash or elasticsearch? or better to write another program which solves this problem?

You can use Logstash and Elasticsearch, but not them alone. What you can do is store email tracking information in ES and use Logstash to run a program for processing log entries as they arrive and create/update the tracking information in ES.

this is what we actually do. shipping logs with logstash-forwarder from our different systems to a central logstash server, parse the different types with different grok pattern and then store them in elasticsearch.

our normal way of mail is: postfix (in) -> julia (cryptographical actions) -> progov (cryptographical actions) -> postfix (out)

the log entrys are like

  1. Postfix:
    qid 123 mailFrom ...
    qid 123 mailTo...
    qid 123 Subject ...
    qid 123 MessageId ABC
    qid 123 ...

  2. Julia:
    qid 234 mailFrom...
    qid 234 mailTo...
    qid 234 Subject...
    qid 234 MessageId ABC
    qid 234 ...

  3. Progov:
    qid 345 mailFrom..
    qid 345 mailTo...
    qid 345 Subject ...
    qid 345 MessageId ABC
    qid 345 ....

  4. Postfix:
    qid 456 mailFrom
    qid 456 mailTo..
    qid 456 Subject..
    qid 456 MessageId ABC
    qid 456 ....

as you can see, there is only the messageid ABC over all the different systems the same. but we need to have the message-id in alle the same qid entrys like

qid 123 messageid abc mailFrom mailTo Subject ....

our last thinking for visualizing the mailtracking was (search in elasticsearch)

  1. search for a specific messageid
  2. show all entrys with the qid of the messageid
  3. built the matching graphs with the input of the qid entrys

could this be the right way or maybe built a script which updates the data in elasticsearch like shown above.

Well, I guess you have two options:

  • Save all entries as-is and make multiple lookups to find all data; one per system to map the message id to the system-specific queue id, and then one lookup per system to get all information for a queue id.
  • Process log entries as they arrive and update the only ES document for every message, effectively refining the document as the message progresses through the system.

The latter might require more work upfront but will definitely perform better when it's in place.

(Keep in mind that Postfix's queue ids aren't unique over time.)