Query to retrieve documents when only one of two conditions is found under the same aggregation

There is a process that logs every step of the way. I am looking for processes that logged the start of the process but never logged its end.
Log example:

{
  "app": "myapp",
  "content": "End Process - abc",
  "correlationid": "abcdef5b-dd9e-40fc-8006-5c11a3567876",
  "rec_date": "2023-10-25T06:50:12.123Z"
}
{
  "app": "myapp",
  "content": "Begin Process - abc",
  "correlationid": "abcdef5b-dd9e-40fc-8006-5c11a3567876",
  "rec_date": "2023-10-25T06:50:11.123Z"
}

When both logs exist for the same correlationid and the same app, the query should return nothing. If either one is missing, it should return the correlationid.
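
To make the expected result concrete, here is a minimal Python sketch of the pairing logic, run on plain dictionaries rather than in Elasticsearch. The function name unpaired_correlation_ids and the second correlationid in the sample data are made up for illustration, and the sketch assumes the content field always starts with "Begin Process" or "End Process".

```python
from collections import defaultdict


def unpaired_correlation_ids(logs):
    """Return correlation IDs that have a Begin or an End log, but not both."""
    # (app, correlationid) -> set of markers seen: "begin" and/or "end"
    seen = defaultdict(set)
    for log in logs:
        key = (log["app"], log["correlationid"])
        if log["content"].startswith("Begin Process"):
            seen[key].add("begin")
        elif log["content"].startswith("End Process"):
            seen[key].add("end")
    # Exactly one marker means the other log is missing.
    return sorted(cid for (_, cid), kinds in seen.items() if len(kinds) == 1)


logs = [
    # The complete pair from the log example.
    {"app": "myapp", "content": "End Process - abc",
     "correlationid": "abcdef5b-dd9e-40fc-8006-5c11a3567876"},
    {"app": "myapp", "content": "Begin Process - abc",
     "correlationid": "abcdef5b-dd9e-40fc-8006-5c11a3567876"},
    # Hypothetical process that started but never logged its end.
    {"app": "myapp", "content": "Begin Process - xyz",
     "correlationid": "00000000-0000-0000-0000-000000000001"},
]

print(unpaired_correlation_ids(logs))
# prints ['00000000-0000-0000-0000-000000000001']
```

Only the hypothetical second correlationid is reported, because the first one has both a Begin and an End log. This is the behavior I would like the aggregation to reproduce.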

The challenge for me is to identify when only one of them was found. I wrote the query below using rare_terms, but it is not working as expected:

GET log/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "must": [
                    {
                      "match_phrase": {
                        "app": "myapp"
                      }
                    }
                  ]
                }
              },
              {
                "bool": {
                  "should": [
                    {
                      "bool": {
                        "should": [
                          {
                            "match_phrase": {
                              "content": "Begin Process -*"
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "match_phrase": {
                              "content": "End Process -*"
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ]
          }
        },
        {
          "range": {
            "rec_date": {
              "format": "strict_date_optional_time",
              "gte": "2023-10-25T06:50:00.000Z",
              "lte": "2023-10-25T06:55:00.000Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "enter_exit": {
      "rare_terms": {
        "field": "correlationid.keyword",
        "max_doc_count": 1
      }
    },
    "bucketcount": {
      "stats_bucket": {
        "buckets_path": "enter_exit._count"
      }
    }
  }
}

(bucketcount is required for the watcher to know that there was a hit)

The response returns the correlationid for which both logs exist:

{
  "took" : 111,
  "timed_out" : false,
  "_shards" : {
    "total" : 1270,
    "successful" : 1270,
    "skipped" : 1245,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "enter_exit" : {
      "buckets" : [
        {
          "key" : "abcdef5b-dd9e-40fc-8006-5c11a3567876",
          "doc_count" : 1
        }
      ]
    },
    "bucketcount" : {
      "count" : 1,
      "min" : 1.0,
      "max" : 1.0,
      "avg" : 1.0,
      "sum" : 1.0
    }
  }
}

I appreciate any help.

Thanks,
Rob
