Sum bucket aggregation (high cardinality fields) - optimize query

Hi!

I need to compute a specific metric per index. The query groups documents by a field and, within each group, takes the cardinality of an IP field along with the cardinality of the same IP field under a filter. For each group I subtract the filtered cardinality from the total cardinality, and finally I sum those differences across all groups to get the value I'm after.
I understand that, given the high cardinality and the sheer number of documents, this will always take some time to finish, but I'm looking for ways to optimize it while keeping it to a single query if possible.
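
To make the arithmetic concrete, with purely illustrative numbers for one app_node bucket:

    main_cardinality (all unique req_ipaddress)            = 100
    sub_cardinality  (unique req_ipaddress where is_test)  =  30
    not_test_cards   = main_cardinality - sub_cardinality  =  70

sum_unique is then the sum of not_test_cards over all app_node buckets.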

The aggregation part of the query looks like this:

 "aggs": {
   "app_cardinality": {
     "terms": {
       "field": "app_node.keyword",
       "size": 1000
     },
     "aggs": {
       "main_cardinality": {
         "cardinality": {
             "field": "req_ipaddress"
         }
       },
       "testfilter": {
         "filters": {
           "filters": {
             "is_test": {
               "exists": {
                   "field":"is_test"
               }
             }
           }
         },
         "aggs": {
           "sub_cardinality": {
             "cardinality": {
                 "field": "req_ipaddress"
             }
           }
         } 
       },
 	   "helper_bucket": {
 	  	"max_bucket": {
 		"buckets_path": "testfilter>sub_cardinality"
 		}
 	   },
    "not_test_cards": {
         "bucket_script": {
           "buckets_path": {
             "tests": "helper_bucket",
             "all": "main_cardinality"
           },
           "script": "params.all - params.tests"
         }
       }
     }
   },
   "sum_unique": {
       "sum_bucket": {
           "buckets_path": "app_cardinality>not_test_cards" 
       }
   }
 }
}
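
For context on the query's shape: the helper_bucket (max_bucket) is there only to flatten the single named bucket of the multi-bucket filters aggregation into a plain value, since bucket_script's buckets_path cannot point directly into a multi-bucket aggregation. One simplification I've been wondering about (an untested sketch; it assumes buckets_path can traverse a single-bucket filter aggregation, which the pipeline aggregation docs suggest it can) is to replace filters with a single filter, so the script can reference the sub-cardinality directly and the helper goes away:

    "testfilter": {
      "filter": {
        "exists": {
          "field": "is_test"
        }
      },
      "aggs": {
        "sub_cardinality": {
          "cardinality": {
            "field": "req_ipaddress"
          }
        }
      }
    },
    "not_test_cards": {
      "bucket_script": {
        "buckets_path": {
          "tests": "testfilter>sub_cardinality",
          "all": "main_cardinality"
        },
        "script": "params.all - params.tests"
      }
    }

That only trims a pipeline step, though; the two cardinality aggregations are presumably still where most of the time goes.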

Any suggestions on how to optimize this query?

Thank you!
