Mail log event correlation

Apologies, I don't know whether this is the correct forum for this question, so please bear with me.

I have an Elasticsearch index containing mail logging events from syslog. As usual with Postfix, the complete transaction for an email is difficult to determine from an individual event; the events must be correlated using the queue ID across multiple log lines to piece the whole story together.

I'm used to doing this sort of log analysis manually with scripts I have developed, but I'm not even sure what terminology to use to search for documentation on something like this in the Elasticsearch ecosystem. What should I be looking for?

We're using td-agent/fluent to ship logs from our fleet to our Elasticsearch nodes. If this correlation needs to happen before Elasticsearch indexes the events and Kibana searches them, we can look at that. My assumption is that we might want to pull the relevant raw records from the index, do $magic, and then push the result back into the index to search on later.

Any pointers appreciated, thanks

Welcome to our community! :smiley:

Do you want to roll all events for related IDs into a single thing?

Mark - thanks for the reply, yes that would be ideal. There is a bit of extra work to tack on the initial connection as that is logged prior to the queue_id being generated, but if we can collect events based on queue_id that would be good enough for the moment.

Check out "Rolling up historical data" in the Elasticsearch Guide [7.14] then, it might be what you want.

Mark - looks interesting, will see how far I get there. Can I set those up from within the Kibana UI or can I only poke at the elasticsearch API?

Mark - apologies, I found it and am working through that now, thanks for the assist

Mark - I think this might not be what I'm after. If I'm understanding Rollup properly, it aggregates, whereas the intent is to collect events by their queue_id in a form that I can then search over. For example, in one line Postfix logs the queue_id and the sender of a message, and in another the queue_id and the recipient(s). The only way to tie those together is to join them via the queue_id value, but I'm looking to perform later searches based on sender or recipient.
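To make the join on queue_id concrete, here is a minimal Python sketch of folding per-line events into one searchable record per queue_id. The field names (`queue_id`, `sender`, `recipient`, `message`) and the sample events are assumptions for illustration, not the actual mapping of the index discussed here.

```python
from collections import defaultdict


def merge_by_queue_id(events):
    """Fold per-line events into one record per queue_id."""
    merged = defaultdict(lambda: {"sender": None, "recipients": [], "lines": []})
    for event in events:
        queue_id = event.get("queue_id")
        if not queue_id:
            # Lines such as the initial connect carry no queue_id yet; skip them.
            continue
        record = merged[queue_id]
        record["lines"].append(event.get("message", ""))
        if event.get("sender"):
            record["sender"] = event["sender"]
        if event.get("recipient"):
            record["recipients"].append(event["recipient"])
    # Add the key to each record so it can be indexed as a standalone document.
    return {qid: {"queue_id": qid, **rec} for qid, rec in merged.items()}


# Hypothetical, already-parsed events (field names are assumptions).
sample_events = [
    {"message": "connect from mail.example.org[192.0.2.10]"},
    {"queue_id": "3F2A91C0D5", "sender": "alice@example.com",
     "message": "from=<alice@example.com>, size=1234, nrcpt=2"},
    {"queue_id": "3F2A91C0D5", "recipient": "bob@example.net",
     "message": "to=<bob@example.net>, status=sent"},
    {"queue_id": "3F2A91C0D5", "recipient": "carol@example.net",
     "message": "to=<carol@example.net>, status=sent"},
    {"queue_id": "7B1C04E2A9", "sender": "dave@example.com",
     "message": "from=<dave@example.com>, size=512, nrcpt=1"},
]

merged = merge_by_queue_id(sample_events)
```

Each value in `merged` carries the sender and all recipients side by side, so it could be indexed as one document and searched on either field.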

No worries!

In that case keeping each event is the best step, then just do a general search on the ID. That way each step is an individual log event that you can visualise.

So really what you're suggesting is two searches: one to grab the queue_id by a known field (such as recipient), and then a second to grab all events with those queue_ids? There's no way to create multi-event objects and search across those?
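For reference, the two-search approach could look roughly like the sketch below, which only builds the request bodies and does not talk to a cluster. It assumes `recipient` and `queue_id` are keyword fields and that events carry an `@timestamp` field; the helper names and the sample response are hypothetical.

```python
def queue_ids_for_recipient(recipient, max_ids=1000):
    """Step 1: body for a search that lists queue_ids seen with a recipient."""
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"term": {"recipient": recipient}},
        "aggs": {"queue_ids": {"terms": {"field": "queue_id", "size": max_ids}}},
    }


def extract_queue_ids(response):
    """Pull the bucket keys out of the step 1 response."""
    return [b["key"] for b in response["aggregations"]["queue_ids"]["buckets"]]


def events_for_queue_ids(queue_ids):
    """Step 2: body for a search that fetches every event for those IDs."""
    return {
        "query": {"terms": {"queue_id": list(queue_ids)}},
        "sort": [{"@timestamp": "asc"}],
    }


# Hypothetical step 1 response, trimmed to the part that is used.
fake_response = {
    "aggregations": {
        "queue_ids": {
            "buckets": [
                {"key": "3F2A91C0D5", "doc_count": 1},
                {"key": "7B1C04E2A9", "doc_count": 1},
            ]
        }
    }
}

first_body = queue_ids_for_recipient("bob@example.net")
ids = extract_queue_ids(fake_response)
second_body = events_for_queue_ids(ids)
```

The two bodies would be sent as ordinary `_search` requests, with the output of the first feeding the second.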

Ahh sorry, that's my fault.

Take a look at "Tutorial: Transforming the eCommerce sample data" in the Elasticsearch Guide [7.14]; the concept there of pivoting on customer ID is the same as pivoting on mail ID. Is that better?
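As a rough illustration of pivoting on the mail ID, the sketch below builds a transform configuration body in Python and prints it as JSON. The index names (`maillog-*`, `maillog-by-queue-id`) and field names are assumptions. Whether the `terms` aggregations on `sender` and `recipient` are accepted in a pivot, and what shape they take in the destination index, depends on the Elasticsearch version, so treat that part as something to verify against the transform documentation.

```python
import json


def build_queue_id_transform(source_index, dest_index):
    """Body for a pivot transform that groups mail events by queue_id."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            # One destination document per queue_id.
            "group_by": {"queue_id": {"terms": {"field": "queue_id"}}},
            "aggregations": {
                "first_seen": {"min": {"field": "@timestamp"}},
                "last_seen": {"max": {"field": "@timestamp"}},
                "event_count": {"value_count": {"field": "queue_id"}},
                # Assumption: terms is usable here; check your version's docs.
                "senders": {"terms": {"field": "sender"}},
                "recipients": {"terms": {"field": "recipient"}},
            },
        },
    }


# Hypothetical index names for illustration only.
transform_body = build_queue_id_transform("maillog-*", "maillog-by-queue-id")
transform_json = json.dumps(transform_body, indent=2)
print(transform_json)
```

The printed JSON is the kind of body that would be submitted when creating a transform; the destination index can then be searched by sender or recipient instead of the raw events.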

nono, I don't know what words to use to describe the thing I'm after, but that does sound more like what I'm wanting ... will give that a shot, thanks for your patience

Yeah I get the challenge, I did stuff up with my original suggestions.

KQL should be able to do that with chaining, i.e. "queue_id: whatever | mail_id: whatever". But you still need an initial query to get the queue_id. I am not 100% sure there is something that can automatically tie those two together in one query.