Hi, I have a use case where I need to first group by "user", do some calculation per group, and then find the minimum value of the results.
What I am doing is:
GET test/_search
{
  "size": 0,
  "aggs": {
    "throughput": {
      "terms": {
        "field": "user.keyword"
      },
      "aggs": {
        "sum_len": {
          "sum": {
            "field": "processed_length"
          }
        },
        "sum_pt": {
          "sum": {
            "field": "processed_duration"
          }
        },
        "throu": {
          "bucket_script": {
            "buckets_path": {
              "len": "sum_len",
              "pt": "sum_pt"
            },
            "script": "params.len / params.pt"
          }
        }
      }
    },
    "min_throu": {
      "min_bucket": {
        "buckets_path": "throughput>throu"
      }
    }
  }
}
It is working but I have a few questions:
I have about 100,000 users, so the number of buckets will be large, and I need to set size to a large number in the terms aggregation. Is it safe to do this, given that I really do need that many buckets? Is there an alternative way to do this job?
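To be concrete, this is the change I mean in the terms aggregation. The value 100000 is only an illustrative number based on my rough user count, not something I have tested at that scale:

    "terms": {
      "field": "user.keyword",
      "size": 100000
    }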
My other question is about the Java REST client. With a REST request I can use filter_path to filter the response, but I have not found a way to do this with the Java client. Since I only care about the min value, I don't want the response to carry redundant data, which would make it slow. Is there any way to reduce the size of the response with the Java client?
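For reference, this is the REST form I mean, sent with the same request body as above. The path aggregations.min_throu is my assumption of the right filter for this query, since min_throu is the only aggregation result I need:

    GET test/_search?filter_path=aggregations.min_throu

With this, the response should contain only the min_throu result instead of every per-user bucket.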