Percentile Rank query giving unexpected results

Hi,

I'm getting unexpected results from the percentile_ranks aggregation and am looking for some help or clarification.

I've got 9 documents, all of which have a position field.
The position values for the 9 documents are:

[1, 1, 1, 1, 2, 2, 3, 4, 8]

When I run the following percentile_ranks aggregation:

{
  "size": 0,
  "aggs": {
    "percentile_test": {
      "percentile_ranks": {
        "field": "position",
        "values": [
          1,
          3
        ]
      }
    }
  }
}

The documentation (Percentile ranks aggregation, Elasticsearch Guide [8.11]) says that:

Percentile rank shows the percentage of observed values which are below certain value

implying that for my value of 1, I should actually get a value of 0, since I have no documents with a position of less than 1.

Instead, I get

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 9,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "percentile_test": {
      "values": {
        "1.0": 33.33333333333333,
        "3.0": 72.22222222222221
      }
    }
  }
}

which says that the percentile rank for a position of 1 is 33.33%, i.e. a third of my documents are supposedly below 1.
Similarly, with 6 out of 9 documents having a value of less than 3, I'd expect the percentile rank for 3 to be 66.67%. Instead, it's coming back as 72.22%.
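
To make the expectation concrete, here is a minimal Python sketch of the exact calculation I have in mind, reading "below" in the documentation as strictly less than. The function name exact_percentile_rank is my own and has nothing to do with Elasticsearch internals.

```python
# The nine position values from my documents.
positions = [1, 1, 1, 1, 2, 2, 3, 4, 8]


def exact_percentile_rank(values, threshold):
    """Percentage of values strictly below threshold (my reading of the docs)."""
    below = sum(1 for v in values if v < threshold)
    return 100.0 * below / len(values)


for threshold in (1, 3):
    print(threshold, round(exact_percentile_rank(positions, threshold), 2))
```

Under that reading, the rank for 1 is 0 and the rank for 3 is 6/9, which is what I compared the aggregation output against.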

I'm aware of the caveat in the Percentiles aggregation page (Elasticsearch Guide [8.11]) about percentiles being approximate, but this still seems odd. In this case, I'm asking for the percentage of values below the smallest value:

  1. There's nothing smaller, so why does it ever come up with a value > 0?
  2. The same page says that for small data sets it is highly accurate, potentially even 100% accurate. And I think 9 documents is a very, very small data set...
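
In case it helps anyone reason about this, below is a rough Python sketch comparing three common ways of counting ties when computing a percentile rank: strictly below, below or equal, and a midpoint rule that counts half of the tied values. The helper name rank_conventions is mine, and I am only guessing at candidate definitions here. I don't know which, if any, the aggregation actually uses.

```python
positions = [1, 1, 1, 1, 2, 2, 3, 4, 8]


def rank_conventions(values, x):
    """Percentile rank of x under three tie-handling conventions (assumed, not from the docs)."""
    n = len(values)
    below = sum(1 for v in values if v < x)
    equal = sum(1 for v in values if v == x)
    return {
        "strict": 100.0 * below / n,                    # values < x
        "inclusive": 100.0 * (below + equal) / n,       # values <= x
        "midpoint": 100.0 * (below + 0.5 * equal) / n,  # half of the ties counted
    }


for x in (1, 3):
    print(x, {k: round(v, 2) for k, v in rank_conventions(positions, x).items()})
```

For 3, the midpoint rule gives 6.5/9 = 72.22%, which matches what I got back. For 1, though, the three conventions give 0%, 44.44%, and 22.22%, none of which is the 33.33% in the response, so tie handling alone doesn't seem to explain the result.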

Can anybody provide some help/answers/opinions/perspective please?

I also note that similar questions about percentile rank have been asked in the past without any answers, so clearly this is something that others have found puzzling too.

Thanks in advance!

Bump?
Nobody else able to respond to this with some experience?
