Exchange logs

Hello everyone !

I am opening this topic because I have an issue with message tracking in Exchange logs. I want to build a top 10 of senders, for example, but when I look at the Exchange logs there are many event-ids for a single e-mail. Look at the following picture:

I sent 1 e-mail but there are 10 events. So if I look for the sender d.xxxx@xxx.xx by filtering on sender-address.keyword, it tells me that this user sent 10 e-mails, when it was only one.

Moreover, the number of events is not the same for external and internal e-mails. Sometimes, for notifications from a server, there can be 6 events, for example, with no "submit" event-id.

I think there is something I didn't understand about Exchange message tracking; maybe I can't get the exact number of e-mails from these logs.

Thanks for reading and have a nice week ! :slight_smile:

Hi, I am a Kibana developer, so I can't explain why your data looks that way, but I can suggest a few options for understanding the data you have. If you're using one of the Beats that we provide, try asking in those forums instead.

It is pretty common for Elasticsearch data to have a unique key and then repeated values. Here are some options you have:

  1. In Discover, you can't deduplicate, but you can filter on a single message status. Would it work if you only showed the status RECEIVE?
  2. If you need to deduplicate down to the latest status per message, then I suggest creating an aggregated table. Configure your bucket to be Terms of message-subject, and your metric to be Top Hits of event-id sorted by Time. This will give you the latest status per unique message.
  3. If you need to deduplicate often, then I think it would benefit you to change the way data is ingested. For example, you could install a tool like Logstash or Kafka in front of Elasticsearch to process messages.

Hi, thank you for your answer !

First, yes, I am using Filebeat to collect logs from 2 Exchange servers. I will ask in the appropriate forum if I run into another issue with it.

  1. In discover, you can't deduplicate, but you can filter only one message status. Would it work if you only showed the status RECEIVE?

I already tried this but it doesn't work. If I add a filter to keep only the RECEIVE status, the result is this:

As you can see, there are 2 RECEIVE events for the same e-mail. But I noticed that there is always (I think) a difference between the two: one is linked to the SMTP source and the other to STOREDRIVER. So I added a filter to keep only the events that have RECEIVE + SMTP, because sometimes the STOREDRIVER source is missing.
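
As a rough sketch, the RECEIVE + SMTP filter amounts to the logic below. The field names `event-id` and `source` and the sample events are assumptions made for illustration; only `sender-address` appears as a field name earlier in this thread, so check the names in your own index.

```python
from collections import Counter


def count_by_receive_smtp(events):
    """Count one e-mail per event that is RECEIVE and comes from the SMTP source."""
    counts = Counter()
    for event in events:
        if event.get("event-id") == "RECEIVE" and event.get("source") == "SMTP":
            counts[event["sender-address"]] += 1
    return counts


# Hypothetical tracking events produced by a single e-mail.
events = [
    {"event-id": "SUBMIT", "source": "STOREDRIVER", "sender-address": "d.xxxx@xxx.xx"},
    {"event-id": "RECEIVE", "source": "SMTP", "sender-address": "d.xxxx@xxx.xx"},
    {"event-id": "RECEIVE", "source": "STOREDRIVER", "sender-address": "d.xxxx@xxx.xx"},
    {"event-id": "DELIVER", "source": "STOREDRIVER", "sender-address": "d.xxxx@xxx.xx"},
]

counts = count_by_receive_smtp(events)
print(counts["d.xxxx@xxx.xx"])
```

This only gives the right count if every e-mail produces exactly one RECEIVE event from SMTP, which is the assumption being checked here.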

  2. If you need to deduplicate the latest status per message, then I suggest creating an aggregated table. Configure your bucket to be Terms of message-subject, and your metric to be Top Hits of event-id sorted by Time. This will give you the latest status per unique message

This is a very good idea, I hadn't thought of it. However, I didn't really understand how Top Hits works. I created a new visualization with the configuration you gave me, but the result is very different from the one with the RECEIVE + SMTP filter.

  3. If you need to deduplicate often, then I think it would benefit you to change the way data is ingested. For example, you could install a tool like Logstash or Kafka in front of Elasticsearch to process messages.

I am using Logstash too, but I didn't configure it. Maybe there is something wrong with it. I will take a look at it and create a new topic if I find something suspicious.

Thanks for your help ! :grin: For now the 1st option is working, but I am watching closely to see if there are any inconsistencies.

Great, if option 1 is working then you should continue using it.

Option 2 is using the top hits aggregation, which you could read the docs on to understand more. Like I said, it's a common pattern if you have a unique key, and I suggested a specific configuration assuming that your unique key is "message subject". Did you try what I suggested?
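
To make the idea concrete, here is a minimal sketch of what Terms + Top Hits does, emulated in plain Python, next to a request body of the same shape. The field names `message-subject.keyword`, `event-id` and `@timestamp`, the aggregation names, and the sample events are all assumptions for illustration.

```python
def latest_event_per_key(events, key_field, time_field="@timestamp"):
    """Emulate Terms(key_field) + Top Hits(size=1, newest first)."""
    latest = {}
    for event in events:
        key = event[key_field]
        # ISO 8601 timestamps in the same format and timezone sort correctly as strings.
        if key not in latest or event[time_field] > latest[key][time_field]:
            latest[key] = event
    return latest


# Request body of the same shape (hypothetical field and aggregation names).
top_hits_query = {
    "size": 0,
    "aggs": {
        "per_subject": {
            "terms": {"field": "message-subject.keyword", "size": 10},
            "aggs": {
                "latest_status": {
                    "top_hits": {
                        "size": 1,
                        "sort": [{"@timestamp": {"order": "desc"}}],
                        "_source": {"includes": ["event-id"]},
                    }
                }
            },
        }
    },
}

# Hypothetical events: three for one e-mail, one for another.
events = [
    {"message-subject": "First Subject", "event-id": "SUBMIT", "@timestamp": "2020-01-06T09:00:00Z"},
    {"message-subject": "First Subject", "event-id": "RECEIVE", "@timestamp": "2020-01-06T09:00:01Z"},
    {"message-subject": "First Subject", "event-id": "DELIVER", "@timestamp": "2020-01-06T09:00:02Z"},
    {"message-subject": "Other Subject", "event-id": "RECEIVE", "@timestamp": "2020-01-06T10:00:00Z"},
]

latest = latest_event_per_key(events, "message-subject")
for subject, event in latest.items():
    print(subject, event["event-id"])
```

Each bucket ends up with a single row, the newest event for that key, so the number of rows equals the number of distinct keys rather than the number of events.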

Option 3 is more of a vague suggestion. I didn't have a specific idea for how you could do it, I just wanted to make you aware that sometimes it's a good idea to change how your data gets into ES.


Sorry for the slow response, I had other work and ran into an issue on the Elastic Stack. As a result, I haven't been able to work on the Kibana interface lately.

Option 2 is using the top hits aggregation, which you could read the docs on to understand more. Like I said, it's a common pattern if you have a unique key, and I suggested a specific configuration assuming that your unique key is "message subject". Did you try what I suggested?

I tried this, but I discovered that something is wrong with a method that assumes "message subject" is a unique key. If I get a reply to an e-mail, the subject looks like this:

Message subject = RE: First Subject

So this method counts 1 e-mail by merging all events that have the same subject. That is good, because I had about 9 events for 1 e-mail in Kibana, but if I get 3 replies from 3 different people I will still get 1 e-mail, because the subject doesn't change.

Anyway, I found another solution and it's the perfect one !

I felt silly that I hadn't noticed this before ! :man_facepalming:

In Exchange logs, there is 1 message ID per e-mail.
So, I realized that it's the same Message ID for all events (Submit, Receive, Deliver, etc.) of an e-mail.
Well, that's my unique key !!!
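
A minimal sketch of a top 10 senders count based on this key, emulated in plain Python and alongside a request body of the same shape. The field names `message-id` and `message-id.keyword`, the aggregation names, and the sample events are assumptions for illustration. In a Kibana data table this would roughly correspond to a Terms bucket on the sender with a Unique Count metric on the message ID; note that the underlying cardinality aggregation returns an approximate count.

```python
from collections import defaultdict


def top_senders(events, n=10):
    """Rank senders by their number of distinct message IDs."""
    ids_per_sender = defaultdict(set)
    for event in events:
        ids_per_sender[event["sender-address"]].add(event["message-id"])
    counts = [(sender, len(ids)) for sender, ids in ids_per_sender.items()]
    return sorted(counts, key=lambda item: item[1], reverse=True)[:n]


# Request body of the same shape (hypothetical field and aggregation names).
top_senders_query = {
    "size": 0,
    "aggs": {
        "top_senders": {
            "terms": {
                "field": "sender-address.keyword",
                "size": 10,
                "order": {"unique_mails": "desc"},
            },
            "aggs": {
                "unique_mails": {"cardinality": {"field": "message-id.keyword"}}
            },
        }
    },
}

# Hypothetical events: alice sends 2 e-mails (5 events), bob sends 1 reply (2 events).
events = [
    {"sender-address": "alice@example.com", "message-id": "<1@example.com>", "event-id": "SUBMIT"},
    {"sender-address": "alice@example.com", "message-id": "<1@example.com>", "event-id": "RECEIVE"},
    {"sender-address": "alice@example.com", "message-id": "<1@example.com>", "event-id": "DELIVER"},
    {"sender-address": "alice@example.com", "message-id": "<2@example.com>", "event-id": "RECEIVE"},
    {"sender-address": "alice@example.com", "message-id": "<2@example.com>", "event-id": "DELIVER"},
    {"sender-address": "bob@example.com", "message-id": "<3@example.com>", "event-id": "RECEIVE"},
    {"sender-address": "bob@example.com", "message-id": "<3@example.com>", "event-id": "DELIVER"},
]

ranking = top_senders(events)
print(ranking)
```

Unlike the subject, the message ID differs for each reply, so replies with the same "RE: ..." subject are counted as separate e-mails.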

Thanks a lot for your help. I will post another reply with screenshots showing the message ID, and maybe it can help someone else in the future. My Kibana is still down at the moment, so I don't know when I will be able to post about this topic again.

Thanks again !