Using doc_count to calculate percentage after aggregation

surjray · August 3, 2020, 6:18pm

Dear all,
I am currently attempting to use the doc_count field (result of aggregation operation: aggregations.aggregation_name.buckets -> doc_count for each bucket) to calculate some percentages, namely pass and fail for test cases run. After running terms aggregation I am getting the necessary buckets, e.g.
"key" : "PASS",
"doc_count" : 100,
and
"key" : "FAIL",
"doc_count" : 10,
Is there a way to extract the doc_count in the above results so that then I can use a bucket_script to calculate the percent value please ?
Thanking you.

spinscale · August 4, 2020, 9:45am

try this:

DELETE test

PUT test/_bulk
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "PASS" }
{ "index" : {}}
{ "result": "FAIL" }

GET test/_search
{
  "size": 0,
  "aggs": {
    "filters_agg": {
      "filters": {
        "filters": {
          "first": {
            "match_all": {}
          }
        }
      },
      "aggs": {
        "by_result": {
          "terms": {
            "field": "result.keyword",
            "size": 10
          }
        },
        "pass_fail_relation": {
          "bucket_script": {
            "buckets_path": {
              "pass": "by_result['PASS']>_count",
              "fail": "by_result['FAIL']>_count"
            },
            "script": "params.fail/(params.pass + params.fail)"
          }
        }
      }
    }
  }
}

surjray · August 4, 2020, 4:38pm

Thank you so much @spinscale for your advice. Your code snippet helps me understand how to extract the aggregation counts and use them in a script. The issue I am facing is when I try to use this method for my use-case:
I am getting the following exception related to the level of aggregation allowed:
"Invalid pipeline aggregation named [pass_fail_relation] of type [bucket_script]. Only sibling pipeline aggregations are allowed at the top level"

The way I am trying to use your method is the following:

GET /ls-qat-*/_search
{
  "query": { 
    "bool": {
      "must": [
        {
          "match": { "fields.product.application.keyword": "some_application" }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-7d/d",  
              "lte": "now/d"
            }
          }
        }
      ]
    }
  },
  "aggs" : {
      "testcase_status_aggregation" : {
          "terms" : { "field" : "testcase.status.keyword", "size": 3, "show_term_doc_count_error": true }
    },
    
    "pass_fail_relation": {
          "bucket_script": {
            "buckets_path": {
              "pass": "testcase_status_aggregation['PASS']>_count",
              "fail": "testcase_status_aggregation['FAIL']>_count"
            },
            "script": "params.fail / (params.pass + params.fail) * 100"
          }
        }
    
  },
  "size" : 0
}

test.status.keyword contains either "PASS" or "FAIL" for a testcase.
If you can please advise how to alleviate this issue.
Thanking you.

spinscale · August 12, 2020, 8:14am

Please create a fully reproducible example including index creation with mapping, document indexing and query, otherwise this is really hard to reproduce.

system · September 9, 2020, 8:15am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Doc_count calculations Elasticsearch	2	617	July 5, 2017
How to find percentage using doc_count in elastic query Elasticsearch	1	349	March 27, 2020
Calculating percent Elasticsearch	2	1403	February 4, 2022
Getting doc_count in percentiles aggregation Elasticsearch	4	1599	January 30, 2018
Combining two aggregations to get term percentage Elasticsearch	5	14239	July 6, 2017

Using doc_count to calculate percentage after aggregation

Related topics