Elasticsearch terms aggregation and querying

Hi, I have two types of log messages:

Jul 23 09:24:16 rrr mrr-core[222]: Aweg3AOMTs_1563866656871111.mt processMTMessage() #12798 realtime: 5.684 ms

Jul 23 09:24:18 rrr mrr-core[2222]: Aweg3AOMTs_1563866656871111.0.dn processDN() #7750 realtime: 1.382 ms

The first is a sent message, and the second confirms that the message was delivered.

The difference between them is the suffix, which I have separated from the "id" into its own field so that I can query it.

These messages are parsed and stored in Elasticsearch in the following format:

messageId: Aweg3AOMTs_1563866656871111.0.dn
text: Aweg3AOMTs
num1: 1563866656871111
num2: 0
suffix: mt/dn

I would like to find out which messages were successfully delivered and which weren't. I am a complete beginner with Elasticsearch, so I'm really struggling.

I'm trying terms aggregations at the moment, but all I have achieved so far is this query:

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "messageId.keyword",
        "include": ".*\\.mt"
      }
    }
  }
}

This shows me the sent messages. I don't know how to add a filter or clause that would show only the messages that have both the mt and dn suffixes.

If anyone has an idea, I'd be really thankful :))

Assuming there's a large number of unique message IDs, this is one of those problems that is tricky for any distributed data store.
You'll likely need to maintain an entity-centric index keyed on the message ID rather than attempting this analysis on a purely log-centric index.
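
To make the entity-centric idea concrete, here is a minimal Python sketch that fuses log events into one record per message. It is only an in-memory illustration, not an Elasticsearch API: the helper names (build_entity_index, split_by_delivery), the choice of text plus num1 as the entity key, and the second sample message ID are assumptions made for the example.

```python
from collections import defaultdict


def build_entity_index(log_docs):
    """Fuse log-centric docs into one record per message.

    A plain dict stands in for the entity-centric index here.
    Assumption: text + num1 identifies a message across its mt and dn events.
    """
    entities = defaultdict(lambda: {"sent": False, "delivered": False})
    for doc in log_docs:
        key = f'{doc["text"]}_{doc["num1"]}'
        if doc["suffix"] == "mt":
            entities[key]["sent"] = True
        elif doc["suffix"] == "dn":
            entities[key]["delivered"] = True
    return dict(entities)


def split_by_delivery(entities):
    """Return (delivered, undelivered) message keys, each sorted."""
    delivered = sorted(
        k for k, e in entities.items() if e["sent"] and e["delivered"]
    )
    undelivered = sorted(
        k for k, e in entities.items() if e["sent"] and not e["delivered"]
    )
    return delivered, undelivered


# Sample docs in the parsed format from the question; the third one is
# a made-up message that was sent but never confirmed.
logs = [
    {"text": "Aweg3AOMTs", "num1": 1563866656871111, "suffix": "mt"},
    {"text": "Aweg3AOMTs", "num1": 1563866656871111, "suffix": "dn"},
    {"text": "Bxyz9QKLMn", "num1": 1563866656872222, "suffix": "mt"},
]

delivered, undelivered = split_by_delivery(build_entity_index(logs))
print("delivered:", delivered)
print("undelivered:", undelivered)
```

Once each message is a single record with sent and delivered flags, "which messages were not delivered" becomes a simple filter on one document rather than a join across log lines.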

Here's a link to why entity-centric indexes are sometimes required. It includes some example scripts for building an entity-centric index, but we also now have the data frames feature in 7.2, which can fuse related data around an ID.