Possible to access sibling aggregation value within sibling child aggregation?

jlrivera81 · October 9, 2024, 1:54pm

Hello,
I have a test pipeline that is triggered for every code commit. The code commit has a unique "revision" and you can determine commit-order based on the revision's associated "revision_order". Additionally, each subsequent commit in the codebase contains every previous commit's code.

When the pipeline is triggered, a series of tests is run against a gate named "Integration A". The tests in this gate are split into suites, one of which is named "green".
None of the other suites in gate "Integration A" will run until all tests in suite "green" (~3 tests) pass.

There is no guarantee that commits trigger the pipeline in the order they were committed, since that depends on various build environment factors.
For example:
commit-1 may build quickly and trigger the pipeline first
commit-2 may take longer
commit-3 may build faster than commit-2 and trigger the pipeline earlier

Additionally, the time it takes for the "green" suite tests to complete can vary (30 minutes to 1 hour). So even if
commit-1 triggers the test pipeline first
commit-3 triggers it 2nd
commit-2 triggers it 3rd
there is no guarantee that the tests will finish running in that order.

For monitoring the pipeline, it's critical that we are alerted if the green suite is continuously stuck/failing. More importantly, because each new commit contains the code from all previous commits, we need to capture green failures in commit order so we can work out which commit may have triggered all subsequent failures.
Additionally, for any new commit that comes in, if it passes "green", we can assume that any previous commits that failed "green" can be ignored for purposes of alerts.

For example, let's say the following commits trigger the green suite with the following results:
commit-1 @ 12pm - fail
commit-2 @ 12:05pm - fail
commit-3 @ 12:30pm - pass
commit-4 @ 12:31pm - pass
commit-5 @ 1:00pm - fail
commit-6 @ 1:45pm - pass
commit-7 through commit-90 - pass between 2 and 6pm
commit-91 through commit-100 - fail between 6pm and 8pm

Remember that each commit contains the code of all earlier commits. So even though commits 1 and 2 both fail, we can assume that because commits 3 and 4 have passed, we don't have to worry about commits 1 and 2. In other words, we don't care about failed commits that are followed by a passing commit. We only care about failures such that, for a given set of commits ordered by "revision-str", there are 3 or more failures with no passed commits in between or later; in that case we must be alerted.
Such is the case with commit-91 through commit-100.
In this case, we want to make sure that when the alert looks at recent docs and sees commits 1, 2, 5, and 91-100, it discards 1, 2, and 5 because of the passes that occurred after them and before commit 91.
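To make the rule concrete, here is a rough Python sketch of the logic I'm after, run outside Elasticsearch on the example above. The function name, the (revision_order, passed) tuples, and the default threshold of 3 are only my illustration of the rule, not something that exists in the index or the query.

```python
from typing import List, Tuple


def trailing_green_failures(
    results: List[Tuple[int, bool]], threshold: int = 3
) -> List[int]:
    """Return the revision orders that should trigger an alert.

    `results` holds hypothetical (revision_order, passed_green) pairs in
    any arrival order. Failures followed by a later pass are discarded;
    only the run of failures after the most recent pass is considered.
    """
    ordered = sorted(results)  # commit order, not arrival order
    trailing: List[int] = []
    for revision_order, passed in reversed(ordered):
        if passed:
            break  # everything at or before this pass can be ignored
        trailing.append(revision_order)
    trailing.reverse()
    # Alert only when enough consecutive failures have piled up.
    return trailing if len(trailing) >= threshold else []


# The example from this post: commits 1, 2, 5 and 91-100 fail, the rest pass.
failed = {1, 2, 5} | set(range(91, 101))
example = [(n, n not in failed) for n in range(1, 101)]
print(trailing_green_failures(example))  # revision orders 91 through 100
```

Commits 1, 2, and 5 drop out because a later commit passed, which is the behavior I am trying to reproduce in the aggregation.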

I made the following attempt using a DSL query and tested it in the console:

GET /resultsforwarder-int/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "gate_name.keyword": "Integration A" } },
        { "range": { "revision_order": { "gt": 0 } } },
        { "range": { "@timestamp": { "gte": "now-1w", "lte": "now" } } }
      ]
    }
  },
  "aggs": {
    "passed_green": {
      "terms": {
        "field": "revision_order",
        "size": 100,
        "order": { "_key": "desc" }
      },
      "aggs": {
        "count_all_docs": {
          "value_count": { "field": "revision.keyword" }
        },
        "count_green_suite": {
          "filter": { "term": { "suite_name.keyword": "green" } },
          "aggs": {
            "count_docs": {
              "value_count": { "field": "revision.keyword" }
            }
          }
        },
        "green_passed_filter": {
          "bucket_selector": {
            "buckets_path": {
              "all_docs": "count_all_docs",
              "green_docs": "count_green_suite>count_docs"
            },
            "script": "params.all_docs > params.green_docs"
          }
        }
      }
    },
    "max_passed_green_revision_order": {
      "max_bucket": {
        "buckets_path": "passed_green>_key"
      }
    },
    "group_by_revision_order": {
      "terms": {
        "field": "revision_order",
        "size": 100,
        "order": { "_key": "desc" }
      },
      "aggs": {
        "count_all_docs": {
          "value_count": { "field": "revision.keyword" }
        },
        "count_green_suite": {
          "filter": { "term": { "suite_name.keyword": "green" } },
          "aggs": {
            "count_docs": {
              "value_count": { "field": "revision.keyword" }
            }
          }
        },
        "count_failed_green_suite": {
          "filter": {
            "bool": {
              "must": [
                { "term": { "status.keyword": "failed" } },
                { "term": { "suite_name.keyword": "green" } }
              ]
            }
          },
          "aggs": {
            "count_docs": {
              "value_count": { "field": "revision.keyword" }
            }
          }
        },
        "revision_filter": {
          "bucket_selector": {
            "buckets_path": {
              "all_docs": "count_all_docs",
              "green_docs": "count_green_suite>count_docs",
              "failed_green_docs": "count_failed_green_suite>count_docs"
            },
            "script": "params.all_docs == params.green_docs && params.green_docs >= 34 && params.failed_green_docs >= 1"
          }
        },
        "remove_lower_than_max_passed_green": {
          "bucket_selector": {
            "buckets_path": {
              "current_revision_order": "_key",
              "max_passed_green_revision_order": "max_passed_green_revision_order.value"
            },
            "script": "params.current_revision_order >= params.max_passed_green_revision_order"
          }
        }
      }
    }
  }
}

When testing the above query, if I remove the "remove_lower_than_max_passed_green" aggregation, I get the expected results. However, when I add the "remove_lower_than_max_passed_green" aggregation, I start running into issues.
I'm not sure if this is the right approach. I'm trying to get the max revision_order that passed the green suite and then filter out any revision_order that is less than that value.
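In case it helps to see the intent, this is roughly what the two steps would look like if done client-side in Python on the per-revision counts. The dict keys and function names are hypothetical stand-ins for the bucket values in my query (count_all_docs, count_green_suite, count_failed_green_suite), and the conditions are copied from the bucket_selector scripts above. It is only a sketch of what I want the aggregation to do, not a claim about what Elasticsearch supports.

```python
from typing import Dict, List, Optional

Bucket = Dict[str, int]  # hypothetical flattened form of one terms bucket


def max_passed_revision_order(buckets: List[Bucket]) -> Optional[int]:
    """Step 1: highest revision_order whose green suite passed.

    Same condition as "green_passed_filter": docs exist beyond the green
    suite, which only happens once green has passed.
    """
    passed = [b["revision_order"] for b in buckets
              if b["all_docs"] > b["green_docs"]]
    return max(passed) if passed else None


def failing_after_last_pass(buckets: List[Bucket],
                            min_green_docs: int = 34) -> List[int]:
    """Step 2: green failures at or above the cutoff from step 1."""
    cutoff = max_passed_revision_order(buckets)
    kept = []
    for b in buckets:
        # Same condition as "revision_filter" in the query.
        is_green_failure = (
            b["all_docs"] == b["green_docs"]
            and b["green_docs"] >= min_green_docs
            and b["failed_green_docs"] >= 1
        )
        # Same comparison as "remove_lower_than_max_passed_green".
        if is_green_failure and (cutoff is None or b["revision_order"] >= cutoff):
            kept.append(b["revision_order"])
    return sorted(kept)
```

If there is no passing revision at all in the window, this sketch keeps every failing revision, which is an assumption on my part about the desired behavior.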
From the documentation on sibling aggregations (Pipeline aggregations | Elasticsearch Guide [8.15] | Elastic), I thought I could use the "max_bucket" aggregation to get the max revision_order that passed the green suite, and then use that value to filter out any revision_order below it. However, I get the following error:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: No aggregation found for path [max_passed_green_revision_order.value];"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: No aggregation found for path [max_passed_green_revision_order.value];"
  },
  "status": 400
}

Is what I want to achieve possible?