@Matt, thank you for your prompt response. It appears that achieving my goal isn't straightforward, but I've devised an effective solution tailored to my requirements – obtaining histogram data for values in buckets.
Here's the approach I took:
- Implemented the
filter_path
parameter to exclude buckets exceeding 100,000 from the response.
- Utilized the
percentiles_bucket
aggregation to retrieve values at specified percentile points.
- Opted for using only 20 percentile points, considering that this might suffice when bucket values are evenly distributed.
- In my case, where bucket values exhibited significant skewness beyond 90%, I extended the percentiles above 90 (e.g., 90.1, 90.2, etc.).
- Noted that Elastic's performance wasn't impacted even when passing an array of 1000 percentile points.
- Upon receiving the Elasticsearch response in my backend, extracted:
- The value at each specified percentile point.
- The total count of buckets (obtained from statistical aggregations).
- Utilized this data to:
- Calculate the number of targets falling within each percentile point.
- Normalize percentile points into 20 buckets, summarizing the target counts.
This process has provided me with a precise histogram data array, meeting my specific needs.
Query used:
{
"size": 0,
"aggregations": {
"by_user": {
"terms": {
"field": "user_id",
"size": 250000,
"order": {
"eng_per_target": "desc"
}
},
"aggregations": {
"time_per_user": {
"sum": {
"field": "event_time"
}
},
"cleared": {
"bucket_script": {
"buckets_path": {
"time_per_user": "time_per_user"
},
"script": "Math.min(params.time_per_user , 3600)"
}
}
}
},
"percentage": {
"percentiles_bucket": {
"buckets_path": "by_user>cleared",
"percents": [ 0, 2.5, 5, 7.5 ... ]
]
}
},
"stats": {
"stats_bucket": {
"buckets_path": "by_user>cleared"
}
}
}
}
Actual percents array:
"percents": [0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5, 25, 27.5, 30, 32.5, 35, 37.5, 40, 42.5, 45, 47.5, 50, 52.5, 55, 57.5, 60, 62.5, 65, 67.5, 70, 72.5, 75, 77.5, 80, 82.5, 85, 87.5, 90, 90.1, 90.2, 90.3, 90.4, 90.5, 90.6, 90.7, 90.8, 90.9, 91, 91.1, 91.2, 91.3, 91.4, 91.5, 91.6, 91.7, 91.8, 91.9, 92, 92.1, 92.2, 92.3, 92.4, 92.5, 92.6, 92.7, 92.8, 92.9, 93, 93.1, 93.2, 93.3, 93.4, 93.5, 93.6, 93.7, 93.8, 93.9, 94, 94.1, 94.2, 94.3, 94.4, 94.5, 94.6, 94.7, 94.8, 94.9, 95, 95.02, 95.04, 95.06, 95.08, 95.1, 95.12, 95.14, 95.16, 95.18, 95.2, 95.22, 95.24, 95.26, 95.28, 95.3, 95.32, 95.34, 95.36, 95.38, 95.4, 95.42, 95.44, 95.46, 95.48, 95.5, 95.52, 95.54, 95.56, 95.58, 95.6, 95.62, 95.64, 95.66, 95.68, 95.7, 95.72, 95.74, 95.76, 95.78, 95.8, 95.82, 95.84, 95.86, 95.88, 95.9, 95.92, 95.94, 95.96, 95.98, 96, 96.02, 96.04, 96.06, 96.08, 96.1, 96.12, 96.14, 96.16, 96.18, 96.2, 96.22, 96.24, 96.26, 96.28, 96.3, 96.32, 96.34, 96.36, 96.38, 96.4, 96.42, 96.44, 96.46, 96.48, 96.5, 96.52, 96.54, 96.56, 96.58, 96.6, 96.62, 96.64, 96.66, 96.68, 96.7, 96.72, 96.74, 96.76, 96.78, 96.8, 96.82, 96.84, 96.86, 96.88, 96.9, 96.92, 96.94, 96.96, 96.98, 97, 97.02, 97.04, 97.06, 97.08, 97.1, 97.12, 97.14, 97.16, 97.18, 97.2, 97.22, 97.24, 97.26, 97.28, 97.3, 97.32, 97.34, 97.36, 97.38, 97.4, 97.42, 97.44, 97.46, 97.48, 97.5, 97.52, 97.54, 97.56, 97.58, 97.6, 97.62, 97.64, 97.66, 97.68, 97.7, 97.72, 97.74, 97.76, 97.78, 97.8, 97.82, 97.84, 97.86, 97.88, 97.9, 97.92, 97.94, 97.96, 97.98, 98, 98.02, 98.04, 98.06, 98.08, 98.1, 98.12, 98.14, 98.16, 98.18, 98.2, 98.22, 98.24, 98.26, 98.28, 98.3, 98.32, 98.34, 98.36, 98.38, 98.4, 98.42, 98.44, 98.46, 98.48, 98.5, 98.52, 98.54, 98.56, 98.58, 98.6, 98.62, 98.64, 98.66, 98.68, 98.7, 98.72, 98.74, 98.76, 98.78, 98.8, 98.82, 98.84, 98.86, 98.88, 98.9, 98.92, 98.94, 98.96, 98.98, 99, 99.02, 99.04, 99.06, 99.08, 99.1, 99.12, 99.14, 99.16, 99.18, 99.2, 99.22, 99.24, 99.26, 99.28, 99.3, 99.32, 99.34, 99.36, 99.38, 99.4, 99.42, 99.44, 99.46, 99.48, 99.5, 99.52, 99.54, 99.56, 99.58, 99.6, 99.62, 99.64, 99.66, 99.68, 99.7, 99.72, 99.74, 99.76, 99.78, 99.8, 99.82, 99.84, 99.86, 99.88, 99.9, 99.92, 99.94, 99.96, 99.98]