Using doc_count to calculate percentage after aggregation

Dear all,
I am currently attempting to use the doc_count field (result of aggregation operation: aggregations.aggregation_name.buckets -> doc_count for each bucket) to calculate some percentages, namely pass and fail for test cases run. After running terms aggregation I am getting the necessary buckets, e.g.
"key" : "PASS",
"doc_count" : 100,
and
"key" : "FAIL",
"doc_count" : 10,
Is there a way to extract the doc_count in the above results so that then I can use a bucket_script to calculate the percent value please ?
Thanking you.

try this:

DELETE test

PUT test/_bulk
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "FAIL" }

GET test/_search
{
  "size": 0,
  "aggs": {
    "filters_agg": {
      "filters": {
        "filters": {
          "first": {
            "match_all": {}
          }
        }
      },
      "aggs": {
        "by_result": {
          "terms": {
            "field": "result.keyword",
            "size": 10
          }
        },
        "pass_fail_relation": {
          "bucket_script": {
            "buckets_path": {
              "pass": "by_result['PASS']>_count",
              "fail": "by_result['FAIL']>_count"
            },
            "script": "params.fail/(params.pass + params.fail)"
          }
        }
      }
    }
  }
}
1 Like

Thank you so much @spinscale for your advice. Your code snippet helps me understand how to extract the aggregation counts and use them in a script. The issue I am facing is when I try to use this method for my use-case:
I am getting the following exception related to the level of aggregation allowed:
"Invalid pipeline aggregation named [pass_fail_relation] of type [bucket_script]. Only sibling pipeline aggregations are allowed at the top level"

The way I am trying to use your method is the following:

GET /ls-qat-*/_search
{
  "query": { 
    "bool": {
      "must": [
        {
          "match": { "fields.product.application.keyword": "some_application" }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-7d/d",  
              "lte": "now/d"
            }
          }
        }
      ]
    }
  },
  "aggs" : {
      "testcase_status_aggregation" : {
          "terms" : { "field" : "testcase.status.keyword", "size": 3, "show_term_doc_count_error": true }
    },
    
    "pass_fail_relation": {
          "bucket_script": {
            "buckets_path": {
              "pass": "testcase_status_aggregation['PASS']>_count",
              "fail": "testcase_status_aggregation['FAIL']>_count"
            },
            "script": "params.fail / (params.pass + params.fail) * 100"
          }
        }
    
  },
  "size" : 0
}

test.status.keyword contains either "PASS" or "FAIL" for a testcase.
If you can please advise how to alleviate this issue.
Thanking you.

Please create a fully reproducible example including index creation with mapping, document indexing and query, otherwise this is really hard to reproduce.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.