Elasticsearch terms aggregation and querying

Hi, I have two types of log messages:

Jul 23 09:24:16 rrr mrr-core[222]: Aweg3AOMTs_1563866656871111.mt processMTMessage() #12798 realtime: 5.684 ms

Jul 23 09:24:18 rrr mrr-core[2222]: Aweg3AOMTs_1563866656871111.0.dn processDN() #7750 realtime: 1.382 ms

The first is a "sent" message, and the second confirms that the message was delivered.

The difference between them is the suffix, which I have separated from the "id" so that I can query on it.

These messages are parsed and stored in Elasticsearch in the following format:

messageId: Aweg3AOMTs_1563866656871111.0.dn
text: Aweg3AOMTs
num1: 1563866656871111
num2: 0
suffix: mt/dn

I would like to find out which messages were successfully delivered and which weren't. I'm a complete beginner with Elasticsearch, so I'm really struggling.

I'm trying terms aggregations at the moment, but all I've managed to achieve is this:

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "messageId.keyword",
        "include": ".*\\.mt"
      }
    }
  } 
}

This shows me the sent messages. I don't know how to add a filter or clause that would return only the messages that have both the mt and dn suffixes.

If anyone has an idea, I'd be really thankful :))

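Assuming there's a large number of unique message IDs, this is one of those problems that is tricky for any distributed data store, because the mt and dn documents for one message can sit on different shards and have to be brought together before they can be compared.
You'll likely need to maintain an entity-centric index keyed on the message ID rather than attempting this analysis on a purely log-centric index.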

Here's a link to why entity-centric indexes are sometimes required. It includes some example scripts for building an entity-centric index, but we also now have the data frames feature in 7.2, which can fuse related data around an ID.
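
If the number of distinct messages is modest, a plain aggregation may be enough to get started. The request below is only a rough sketch under some assumptions that are not confirmed in this thread: that num1 alone identifies a message and is mapped as an aggregatable field (numeric or keyword; use num1.keyword if it is a text field), and that the suffix is available as suffix.keyword. The aggregation names by_message, suffix_count and both_suffixes are made up for illustration. It groups documents by num1, counts the distinct suffixes in each group, and keeps only the groups that contain two of them (mt and dn):

```
GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "by_message": {
      "terms": {
        "field": "num1",
        "size": 1000
      },
      "aggs": {
        "suffix_count": {
          "cardinality": { "field": "suffix.keyword" }
        },
        "both_suffixes": {
          "bucket_selector": {
            "buckets_path": { "suffixes": "suffix_count" },
            "script": "params.suffixes >= 2"
          }
        }
      }
    }
  }
}
```

Changing the script to "params.suffixes < 2" should list the messages that have only one of the two suffixes instead. Note that the bucket_selector only sees the buckets the terms aggregation returns (at most "size" of them), so with a very large number of unique IDs the result will be incomplete, which is where the entity-centric index comes in.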
