Query related logs that contain a keyword

I need to query logs in Elasticsearch through Kibana in a specific way. I'll try my best to explain what I'm looking for, and hopefully someone can tell me whether it is possible via Kibana or whether I should just query Elasticsearch directly.

I log incoming requests to my system and whatever follows up to the point where it has to return a response. All those logs are grouped by a correlation ID. So with a specific correlation ID I can see the incoming request, any potential logged errors, any further propagated logs and then the response message.

So let's say I have a DEBUG log (with a correlation ID) that contains the message "Condition met". This log is not written everywhere and every time, only when a certain condition was met during a request (this is purely hypothetical). Now I would like to look for the logs that contain the phrase "Condition met". That part is easy, as I can see all the entries whose message is "Condition met". But I also want to include all the other logs that have the same correlation ID as those entries.
That way I will also see the related request and response messages alongside the message "Condition met".

Is there a way to query or visualize this in Kibana, or is it only possible by querying Elasticsearch directly? If the latter, how could I ask Elasticsearch to give me those results?

What I've tried so far:

I've set up an Elasticsearch query:

POST my_index/_search
{
  "aggs": {
    "my_aggr": {
      "aggs": {
        "messagefield": {
          "max": {
            "script": "doc['Message.keyword'].size() > 0 && doc['Message.keyword'].value.contains('Condition met')"
          }
        },
        "hasMessage": {
          "bucket_selector": {
            "buckets_path": {
              "var1": "messagefield"
            },
            "script": "params.var1 == 1"
          }
        }
      },
      "terms": {
        "field": "CorrelationId.keyword",
        "size": 10
      }
    }
  },
  "size": 0
}

And it returns:

{
  "took" : 661,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "my_aggr" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 263158,
      "buckets" : [
        {
          "key" : "CID001F934C53564805B207F0068A0E0803",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        },
        {
          "key" : "CID0091D8BED2664C5F8A5730E249C626C0",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        },
        {
          "key" : "CID0231AA20382A4E69A5041C8E10FD6EF1",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        },
        {
          "key" : "CID057831D9F1504F21AAAF10955A35D55A",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        },
        {
          "key" : "CID05F8E51D7EF64277BF0157787047C731",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        },
        {
          "key" : "CID073B8FC7295B43BAB0CCC7021CC1AE73",
          "doc_count" : 11,
          "messagefield" : {
            "value" : 1.0
          }
        }
      ]
    }
  }
}

This is almost what I'm looking for. The only thing missing is to expand each bucket so that it includes the docs that were aggregated into it. I'm not sure if that is possible, and I don't think I'll be able to set up a query like this using Kibana itself.
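One option that might cover the "expand each bucket" part is a top_hits sub-aggregation placed next to the existing ones. The sketch below is Python that only builds the request body and prints it, so it can be pasted into Dev Tools; it sends nothing. The function name build_body, the aggregation name "docs" and the default sizes are my own assumptions, and the script is rewritten to return 1 or 0 explicitly. As far as I know, top_hits is capped by index.max_inner_result_window (100 by default), so this only fits while a correlation ID has at most that many entries.

```python
import json


def build_body(phrase="Condition met", bucket_count=10, docs_per_bucket=100):
    """Build the aggregation from the query above plus a top_hits sub-aggregation.

    Field names come from the query above; "docs" and the default
    sizes are illustrative assumptions.
    """
    # Painless script: 1 when the document's message contains the phrase, else 0.
    contains = (
        "(doc['Message.keyword'].size() > 0 && "
        "doc['Message.keyword'].value.contains(params.phrase)) ? 1 : 0"
    )
    return {
        "size": 0,  # no top-level hits, only aggregation results
        "aggs": {
            "my_aggr": {
                "terms": {"field": "CorrelationId.keyword", "size": bucket_count},
                "aggs": {
                    # Per bucket: 1 if any document in it contains the phrase.
                    "messagefield": {
                        "max": {
                            "script": {"source": contains, "params": {"phrase": phrase}}
                        }
                    },
                    # Drop buckets where no document matched.
                    "hasMessage": {
                        "bucket_selector": {
                            "buckets_path": {"var1": "messagefield"},
                            "script": "params.var1 == 1",
                        }
                    },
                    # Return the documents of each surviving bucket.
                    "docs": {
                        "top_hits": {
                            "size": docs_per_bucket,
                            "_source": ["CorrelationId", "Message"],
                        }
                    },
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_body(), indent=2))
```

One caveat with this shape: bucket_selector only prunes buckets that the terms aggregation has already picked, so with "size": 10 only ten correlation IDs are ever examined. That would explain why the response above contains fewer than ten buckets while sum_other_doc_count is still large.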

First, I assume you are querying Elasticsearch directly with Dev Tools.
Second, yes, it is not clear what you are after.
Last, why don't you use _search with a query?
Optionally, use a filter as well.

GET /my-index/_search
{
  "query": {
    "wildcard": {
      "Message.keyword": {
        "value": "*Condition met*"
      }
    }
  }
}

Hope this will give you a direction.

Also,
Why size 0?

Cheers!

Let me try and explain by using some sample entries:

Correlation ID| Message

ID123 | Request Message R1
ID456 | Request Message R2
ID123 | Error message E1
ID123 | Response Message R3
ID456 | Condition met
ID456 | Response Message R4

If I were to query by aggregating by CorrelationID I would get back something like:

"buckets": [
    {
       "key": "ID123",
       "doc_count": 3
    },
    {
       "key": "ID456",
       "doc_count": 3
    }
]

That is expected, but I only want to return the buckets that have a message containing "Condition met". With the query I presented above I managed to get that far:

"buckets": [
    {
       "key": "ID456",
       "doc_count": 3
    }
]

My question is: is there a way to return the results in the following (theoretical) representation, that is, the set of messages grouped by a correlation ID where one of the entries contains "Condition met"?

"results":[
    {
       "Message": "Request Message R2",
       "CorrelationID": "ID456"
    },
    {
       "Message": "Condition met",
       "CorrelationID": "ID456"
    },
    {
       "Message": "Response Message R4",
       "CorrelationID": "ID456"
    }
]

Does this make more sense?

I'm not sure what the purpose of your search is.
Is it to get the ID?
Or to get all messages with the same ID where one of the messages contains the string you are looking for?

Here are my steps:

I've created an index my-index-000001 with the sample data you provided.
GET /my-index-000001/_search

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "yiPXx3cBlAt-jkNjeejx",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID123",
          "message" : "Request Message R1"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2yPXx3cBlAt-jkNjxegE",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID123",
          "message" : "Error message E1"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "5CPXx3cBlAt-jkNj-Ogm",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID123",
          "message" : "Response Message R3"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "DSPYx3cBlAt-jkNjxemk",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Request Message R2"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "GyPZx3cBlAt-jkNjDukH",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Condition met"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "IyPZx3cBlAt-jkNjOOnO",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Rsponse Message R4"
        }
      }
    ]
  }
}

Then I ran this search:

GET /my-index-000001/_search
{
  "query": {
    "wildcard": {
      "message.keyword": {
        "value": "*Condition met*"
      }
    }
  }
}

The results:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "GyPZx3cBlAt-jkNjDukH",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Condition met"
        }
      }
    ]
  }
}

Hi AClerk.

Thanks for setting up a sample. Using that, I'm hoping to get the following results:

{
...
  "hits" : {
   ...
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "DSPYx3cBlAt-jkNjxemk",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Request Message R2"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "GyPZx3cBlAt-jkNjDukH",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Condition met"
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "IyPZx3cBlAt-jkNjOOnO",
        "_score" : 1.0,
        "_source" : {
          "Correlation_ID" : "ID456",
          "message" : "Rsponse Message R4"
        }
      }
    ]
  }
}

I don't want to query for a specific correlation ID. I want to take all the logs and group them by correlation ID.
But I don't want the entries from the "ID123" group, because that group doesn't contain "Condition met". Only the group "ID456" has the entry "Condition met", and therefore the query returns its 3 entries. So it needs to filter out any groups that don't contain "Condition met" as one of their entries, which gives the results in the example I supplied above.
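To pin down the semantics, here is a small in-memory Python sketch of the grouping and filtering I mean, using the sample entries from earlier in this thread. It does not touch Elasticsearch at all; the names LOGS and related_logs are made up for this illustration.

```python
from collections import defaultdict

# Sample entries from earlier in the thread: (correlation ID, message).
LOGS = [
    ("ID123", "Request Message R1"),
    ("ID456", "Request Message R2"),
    ("ID123", "Error message E1"),
    ("ID123", "Response Message R3"),
    ("ID456", "Condition met"),
    ("ID456", "Response Message R4"),
]


def related_logs(logs, phrase):
    """Return every entry whose correlation ID has at least one message containing phrase."""
    # Group the messages by correlation ID.
    groups = defaultdict(list)
    for correlation_id, message in logs:
        groups[correlation_id].append(message)
    # Keep only the IDs where some message in the group contains the phrase.
    matching_ids = {
        cid
        for cid, messages in groups.items()
        if any(phrase in message for message in messages)
    }
    # Return the surviving entries in their original order.
    return [(cid, message) for cid, message in logs if cid in matching_ids]


for correlation_id, message in related_logs(LOGS, "Condition met"):
    print(correlation_id, "|", message)
# ID456 | Request Message R2
# ID456 | Condition met
# ID456 | Response Message R4
```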

I hope it clears up what I am looking for.
Thanks for your replies.

What you are after is described in "Use query result as parameter for another query in Elasticsearch DSL".
See if that helps.
Cheers!

Sigh, I thought as much. That is what I ended up doing. It gets complicated when the first query returns more than one page of results, because you have to paginate it first and then feed the outcome into the next query.
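For anyone landing here later, this is roughly the shape of the two-step approach, as a Python sketch rather than my exact code. Step 1 pages through the matching correlation IDs with a composite aggregation and its after_key; step 2 fetches all documents for those IDs with a terms query, a chunk of IDs at a time. The search parameter is a hypothetical callable that takes a request body and returns the parsed response (for example a thin wrapper around whichever client you use). The field names follow AClerk's sample index, and I assume the default dynamic mapping (a text field with a .keyword subfield). The page and chunk sizes are arbitrary.

```python
from typing import Callable, Iterator, List

# Hypothetical: takes a _search request body, returns the parsed JSON response.
Search = Callable[[dict], dict]


def matching_ids(search: Search, phrase: str, page_size: int = 1000) -> Iterator[str]:
    """Step 1: page through the correlation IDs that have a matching message."""
    after = None
    while True:
        composite = {
            "size": page_size,
            "sources": [{"cid": {"terms": {"field": "Correlation_ID.keyword"}}}],
        }
        if after is not None:
            composite["after"] = after  # resume after the last bucket of the previous page
        body = {
            "size": 0,
            "query": {"match_phrase": {"message": phrase}},
            "aggs": {"ids": {"composite": composite}},
        }
        result = search(body)["aggregations"]["ids"]
        for bucket in result["buckets"]:
            yield bucket["key"]["cid"]
        after = result.get("after_key")
        if not result["buckets"] or after is None:
            return  # no more pages


def fetch_related_logs(search: Search, phrase: str, chunk_size: int = 100) -> List[dict]:
    """Step 2: fetch every document that shares a correlation ID with a match."""
    ids = list(matching_ids(search, phrase))
    docs: List[dict] = []
    for start in range(0, len(ids), chunk_size):
        body = {
            # Assumes a chunk's documents fit in one page of hits.
            "size": 10000,
            "query": {"terms": {"Correlation_ID.keyword": ids[start:start + chunk_size]}},
        }
        docs.extend(hit["_source"] for hit in search(body)["hits"]["hits"])
    return docs
```

The chunking in step 2 is there because a terms query has a limit on how many terms it accepts and a single search page is capped at 10,000 hits by default. If one chunk of IDs can produce more documents than that, step 2 needs its own pagination as well.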

Thanks for the answer and for your patience.