Terms aggregation: finding the min value of bucket_script results

Hi, I have a use case where I need to group by "user", run a calculation for each group, and then find the min value of the results.
This is what I am doing:

GET test/_search
{
  "size": 0,
  "aggs": {
    "throughput": {
      "terms": {
        "field": "user.keyword"
      },
      "aggs": {
        "sum_len": {
          "sum": {
            "field": "processed_length"
          }
        },
        "sum_pt": {
          "sum": {
            "field": "processed_duration"
          }
        },
        "throu": {
          "bucket_script": {
            "buckets_path": {
              "len": "sum_len",
              "pt": "sum_pt"
            },
            "script": "params.len / params.pt"
          }
        }
      }
    },
    "min_throu": {
      "min_bucket": {
        "buckets_path": "throughput>throu"
      }
    }
  }
}

It works, but I have a few questions:
I have about 100,000 users, so the number of buckets will be large, and I would need to set size to a large number in the terms aggregation. Is that safe, given that I really do need that many buckets? Is there an alternative way to do this job?
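
To make the concern about size concrete: a terms aggregation returns only its top buckets (10 by default, ordered by document count), and min_bucket sees only the buckets that were returned. The following is a minimal Python sketch that emulates the request's logic on a few made-up documents. The function name, the sample data, the tie-breaking, and the skipping of zero durations are all assumptions of this sketch, not Elasticsearch behavior.

```python
from collections import defaultdict


def min_throughput(docs, max_buckets=None):
    """Emulate terms -> sum/sum -> bucket_script -> min_bucket in plain Python."""
    # Per-user running sums, like the two "sum" sub-aggregations.
    sums = defaultdict(lambda: {"len": 0.0, "pt": 0.0, "count": 0})
    for doc in docs:
        bucket = sums[doc["user"]]
        bucket["len"] += doc["processed_length"]
        bucket["pt"] += doc["processed_duration"]
        bucket["count"] += 1

    # Order buckets by document count, largest first (ties broken by user
    # name here only to keep the sketch deterministic), then truncate the
    # list the way the "size" parameter would.
    buckets = sorted(sums.items(), key=lambda kv: (-kv[1]["count"], kv[0]))
    if max_buckets is not None:
        buckets = buckets[:max_buckets]

    # Equivalent of the bucket_script "params.len / params.pt".
    # Skipping zero durations is an assumption of this sketch.
    throughput = {user: b["len"] / b["pt"] for user, b in buckets if b["pt"] != 0}

    # Equivalent of min_bucket: smallest value among the surviving buckets.
    return min(throughput.items(), key=lambda kv: kv[1])


# Hypothetical sample documents.
docs = (
    [{"user": "alice", "processed_length": 300, "processed_duration": 10}] * 3
    + [{"user": "bob", "processed_length": 100, "processed_duration": 10}] * 2
    + [{"user": "carol", "processed_length": 50, "processed_duration": 10}]
)

print(min_throughput(docs))                 # all three users considered
print(min_throughput(docs, max_buckets=2))  # carol's bucket is cut off
```

With every bucket included, the minimum comes from carol; with only two buckets kept, carol is dropped and the reported minimum is bob's, which is why the size has to cover all users for the result to be correct.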

My other question is about the Java REST client. Over REST I can use filter_path to filter the response, but I have not found a way to do the same with the Java client. Since I only care about the min value, I don't want the response to carry redundant data, which would make it slow. Is there any way to reduce the size of the response with the Java client?
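
For reference, filter_path is an ordinary query-string parameter on the search URL, so any client that lets you set request parameters should be able to send it. The sketch below builds such a request in Python with only the standard library; the function name and the base URL are hypothetical. In the Java low-level REST client, Request.addParameter appears to play the same role, but please check that against the client version you are using.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_min_throughput_request(base_url, index, search_body):
    """Build a search request that asks only for the min_throu aggregation."""
    # filter_path travels in the query string, not in the JSON body.
    query = urlencode({"filter_path": "aggregations.min_throu"})
    url = f"{base_url}/{index}/_search?{query}"
    return Request(
        url,
        data=json.dumps(search_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts POST as well as GET
    )


# Hypothetical host; the body would be the aggregation request shown earlier.
request = build_min_throughput_request("http://localhost:9200", "test", {"size": 0})
print(request.full_url)
```

The request object is only constructed here, not sent. Passing it to urllib.request.urlopen would send it to a running cluster.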

Any help would be appreciated. I hope someone can share some ideas or hints.

I might try a scripted metric under the terms aggregation and sort the terms on that. I think that works. Scripted metrics are always a bit fiddly and slow, but they can push the math to the shards and save you from having to pull everything back to the coordinating node.

Thanks for the reply. Are you suggesting I use a scripted metric aggregation to do the calculation directly for each term (here, for each user), and then apply a min aggregation or sort based on that?
Could you give more details or some sample code?
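
As a rough illustration of the idea of pushing the math to the shards, here is a plain-Python emulation of the phases a scripted metric goes through: each shard accumulates partial sums, and the coordinating node merges them and divides once at the end. This is not Painless and not a real Elasticsearch request; the function names and the two-shard sample data are hypothetical. The point it shows is that shards must return sums rather than ratios, because per-shard ratios cannot be merged correctly.

```python
def map_combine(shard_docs):
    """Per-shard phase: accumulate (length, duration) partial sums per user."""
    state = {}
    for doc in shard_docs:
        length, duration = state.get(doc["user"], (0.0, 0.0))
        state[doc["user"]] = (
            length + doc["processed_length"],
            duration + doc["processed_duration"],
        )
    return state


def reduce_states(shard_states):
    """Coordinating-node phase: merge partial sums, then divide once per user."""
    totals = {}
    for state in shard_states:
        for user, (length, duration) in state.items():
            total_len, total_dur = totals.get(user, (0.0, 0.0))
            totals[user] = (total_len + length, total_dur + duration)
    # Users with a zero total duration are skipped (an assumption of this sketch).
    return {user: l / d for user, (l, d) in totals.items() if d != 0}


# Hypothetical documents spread over two shards; alice appears on both.
shard_1 = [
    {"user": "alice", "processed_length": 100, "processed_duration": 10},
    {"user": "bob", "processed_length": 40, "processed_duration": 4},
]
shard_2 = [
    {"user": "alice", "processed_length": 200, "processed_duration": 5},
]

throughput = reduce_states([map_combine(shard_1), map_combine(shard_2)])
slowest_user = min(throughput, key=throughput.get)
print(slowest_user, throughput[slowest_user])
```

Averaging alice's per-shard ratios (10 and 40) would give 25, whereas merging the sums first gives 300 / 15 = 20, which matches what the sum and bucket_script combination computes.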
