"Other" bucket: why do you send a special request for that?


#1

Hello Kibana dev team!

I'm wondering, why don't you use the sum_other_doc_count value from the very first request instead of sending a special must_not query for that?
I'm asking because I don't get the "other" bucket (with the aggregation's "size=1") when the data looks like this:

PUT test/doc/1
{
  "filterNames": [
    "filter1"
  ]
}

PUT test/doc/2
{
  "filterNames": [
    "filter1",
    "filter2"
  ]
}

I expect the "other" bucket to contain one doc (the one with "filter2", which is correctly shown by sum_other_doc_count), but because of the second "other-filter" query the "filter2" document gets filtered out and no "other" bucket gets displayed :frowning:
Do you think this is the expected behaviour?
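
The mismatch can be reproduced without a cluster. Below is a minimal Python sketch that only simulates the two steps on the two documents above. The helper names (`terms_agg`, `other_bucket_docs`) are made up for illustration, and the exclusion step is an assumption about what the must_not query effectively does, not Kibana's actual code.

```python
from collections import Counter

# The two documents indexed above.
docs = {
    1: {"filterNames": ["filter1"]},
    2: {"filterNames": ["filter1", "filter2"]},
}

def terms_agg(docs, field, size):
    """Simulate a terms aggregation: each document counts once per distinct value."""
    counts = Counter()
    for doc in docs.values():
        for value in set(doc[field]):
            counts[value] += 1
    # Sort by doc count (descending), then by term, and keep the top `size` buckets.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ordered[:size]
    sum_other_doc_count = sum(count for _, count in ordered[size:])
    return top, sum_other_doc_count

def other_bucket_docs(docs, field, top_terms):
    """Simulate the must_not step: drop every document holding ANY of the top terms."""
    return {
        doc_id: doc
        for doc_id, doc in docs.items()
        if not any(value in top_terms for value in doc[field])
    }

top, sum_other = terms_agg(docs, "filterNames", size=1)
others = other_bucket_docs(docs, "filterNames", {term for term, _ in top})

print(top)        # [('filter1', 2)]
print(sum_other)  # 1
print(others)     # {}
```

So sum_other_doc_count reports 1 (the "filter2" value of doc 2), while the exclusion step leaves no documents at all, because doc 2 also carries "filter1".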

Thanks!

P.S. I found an explanation of how the "other" bucket works, but it is still not clear why it has been implemented like that.


(Nick Peihl) #2

Hi jetnet,

There is a lot more discussion on why this method was chosen in the GitHub pull request where this feature was introduced.


#3

yes, I thought it'd been done on purpose.
so, today I discovered a limitation of that implementation.
will try to work around it...


(Tim Roes) #4

Hi,

Let me briefly answer your questions in a bit more detail.

Why can't we use sum_other_doc_count?

This would only work for "Count", but not for any other metric aggregation the user might want to use. Since our Other Bucket should also work with all the other metric aggregations, we need this two-query method to actually calculate the same metrics for all "other" documents.
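
As a rough illustration of why a count alone is not enough, here is a small Python sketch. The `category` and `price` fields and the sample documents are hypothetical and not taken from the example in this thread. A count can say how many documents fell outside the top buckets, but a metric such as an average has to be computed over those documents themselves.

```python
# Hypothetical documents: `category` is the bucketed field, `price` the metric field.
docs = [
    {"category": "a", "price": 10.0},
    {"category": "a", "price": 20.0},
    {"category": "b", "price": 5.0},
    {"category": "c", "price": 45.0},
]

top_terms = {"a"}  # pretend the terms aggregation with size=1 returned only "a"

# What a count-only shortcut would provide: how many documents were left out.
other_count = sum(1 for d in docs if d["category"] not in top_terms)

# What an "Average" metric needs: the actual documents outside the top buckets.
other_docs = [d for d in docs if d["category"] not in top_terms]
other_avg = sum(d["price"] for d in other_docs) / len(other_docs)

print(other_count)  # 2
print(other_avg)    # 25.0
```

Knowing only that two documents were left out says nothing about their average price, which is why a second pass over the "other" documents is needed.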

What documents will be found

To your question of whether we would expect your doc 2 to be filtered out of the Other Bucket: currently yes, that's expected behavior. But to be honest, the way array values work in Elasticsearch might not always be what you would expect, depending on your use case. Sometimes it actually might be the behavior you want. So we follow the "default" ES behavior of filtering out those values, also because I don't currently see any other proper solution (one that would work with every metric).

Cheers,
Tim


#5

hello Tim,

thank you so much, that makes sense.

