Hi,
I'm getting unexpected results for the percentile_ranks aggregation, and am looking for some help/clarification.
I've got 9 documents, all of which have a position field.
The position values for the 9 documents are:
[1, 1, 1, 1, 2, 2, 3, 4, 8]
Now when I run the following percentile_ranks aggregation:
{
  "size": 0,
  "aggs": {
    "percentile_test": {
      "percentile_ranks": {
        "field": "position",
        "values": [
          1,
          3
        ]
      }
    }
  }
}
The documentation (Percentile ranks aggregation | Elasticsearch Guide [8.11] | Elastic) says:
"Percentile rank shows the percentage of observed values which are below certain value"
This implies that for my value of 1, I should actually get 0, since I have no documents with a position of less than 1.
Instead, I get:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 9,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "percentile_test": {
      "values": {
        "1.0": 33.33333333333333,
        "3.0": 72.22222222222221
      }
    }
  }
}
This says that the percentile rank for documents having a position of less than 1 is 33.33%, i.e. a third of my documents.
Similarly, with 6 out of 9 documents having a value of less than 3, I'd expect the percentile rank for 3 to be 66.67%. Instead, it's coming back as 72.22%.
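To make the expectation concrete, here is a minimal Python sketch of an exact calculation under the strict "below" definition quoted from the docs. The helper name `percentile_rank_below` is hypothetical and not part of any Elasticsearch API; it only counts values by hand.

```python
from bisect import bisect_left

def percentile_rank_below(values, x):
    """Percentage of values strictly less than x (exact, no approximation)."""
    ordered = sorted(values)
    # bisect_left returns the number of elements strictly smaller than x.
    return 100.0 * bisect_left(ordered, x) / len(ordered)

positions = [1, 1, 1, 1, 2, 2, 3, 4, 8]
for x in (1, 3):
    print(x, round(percentile_rank_below(positions, x), 2))
# 1 0.0
# 3 66.67
```

These are the numbers I'd expect if the aggregation followed that definition exactly.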
I'm aware of the caveat in Percentiles aggregation | Elasticsearch Guide [8.11] | Elastic that percentiles are approximate, but this still seems odd. In this case, I'm asking for the percentage of values below the smallest value:
- there's nothing smaller, so why does it ever come back with a value > 0?
- the same page says that for small data sets the results are highly accurate, potentially even 100% accurate, and 9 documents is a very small data set.
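For comparison, the sketch below computes the rank under three common conventions (strictly below, at or below, and a midpoint that counts half of the ties). This is only an assumption-laden sanity check: the function `rank_conventions` is hypothetical, and nothing here claims to reproduce what Elasticsearch does internally.

```python
from bisect import bisect_left, bisect_right

def rank_conventions(values, x):
    """Exact percentile rank of x under three common conventions."""
    ordered = sorted(values)
    n = len(ordered)
    below = bisect_left(ordered, x)         # count of values < x
    at_or_below = bisect_right(ordered, x)  # count of values <= x
    equal = at_or_below - below             # count of values == x
    return {
        "strict": 100.0 * below / n,
        "inclusive": 100.0 * at_or_below / n,
        "midpoint": 100.0 * (below + 0.5 * equal) / n,
    }

positions = [1, 1, 1, 1, 2, 2, 3, 4, 8]
for x in (1, 3):
    ranks = rank_conventions(positions, x)
    print(x, {name: round(value, 2) for name, value in ranks.items()})
# 1 {'strict': 0.0, 'inclusive': 44.44, 'midpoint': 22.22}
# 3 {'strict': 66.67, 'inclusive': 77.78, 'midpoint': 72.22}
```

For 3, the midpoint convention happens to give 72.22, the same as the returned value. For 1, none of the three gives 33.33, so this comparison alone does not explain the result.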
Can anybody provide some help/answers/opinions/perspective please?
I also note that similar questions about percentile ranks have been asked in the past without any answers, so clearly this is something that others have found puzzling.
Thanks in advance!