"Other" bucket: why do you send a special request for that?

jetnet · November 2, 2018, 6:13pm

Hello Kibana dev team!

I'm wondering: why don't you use the sum_other_doc_count value from the very first request instead of sending a special must_not query for that?
I'm asking because I don't get the "other" bucket (with aggs "size=1") when the data looks like this:

PUT test/doc/1
{
  "filterNames": [
    "filter1"
  ]
}

PUT test/doc/2
{
  "filterNames": [
    "filter1",
    "filter2"
  ]
}

I expected the "other" bucket to contain one doc (the one with "filter2", which is correctly shown by sum_other_doc_count), but because of the second "other-filter" query the "filter2" document gets filtered out and no "other" bucket is displayed.
Do you think this is the expected behaviour?

Thanks!

P.S. I found an explanation of how the "other" bucket works, but it is still not clear why it has been implemented like that.

nickpeihl · November 2, 2018, 8:12pm

Hi jetnet,

There is a lot more discussion on why this method was chosen in the GitHub pull request where this feature was introduced.

jetnet · November 2, 2018, 9:36pm

Yes, I thought it had been done on purpose.
So, today I discovered a limitation of that implementation.
I will try to work around it...

timroes · November 5, 2018, 8:06am

Hi,

Let me briefly answer your questions in a bit more detail.

Why can't we use sum_other_doc_count?

That would only work for "Count", not for any other metric aggregation the user might want to use. Since our Other Bucket should work with all metric aggregations, we need this two-query method to actually calculate the same metrics for all "other" documents.
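To make the two-query idea concrete, here is a minimal sketch in Python that only builds the two request bodies as plain dicts. It is an illustration, not the exact body Kibana sends. The helper names, the `avg` metric, the `bytes` field, and the use of `filterNames.keyword` (which assumes default dynamic mapping) are all assumptions made for this example.

```python
import json


def build_top_terms_request(field, size, metric_field):
    """First request: top-N terms, each bucket carrying a metric sub-aggregation."""
    return {
        "size": 0,
        "aggs": {
            "top_terms": {
                "terms": {"field": field, "size": size},
                "aggs": {"metric": {"avg": {"field": metric_field}}},
            }
        },
    }


def build_other_request(field, top_keys, metric_field):
    """Second request: same metric, but only over documents that match
    none of the top terms (hypothetical shape, not Kibana's actual body)."""
    return {
        "size": 0,
        "query": {"bool": {"must_not": [{"terms": {field: list(top_keys)}}]}},
        "aggs": {"metric": {"avg": {"field": metric_field}}},
    }


# "filterNames.keyword" and "bytes" are assumed field names for illustration.
first = build_top_terms_request("filterNames.keyword", 1, "bytes")
# Pretend the first response returned "filter1" as the only top bucket.
other = build_other_request("filterNames.keyword", ["filter1"], "bytes")

print(json.dumps(other, indent=2))
```

The first response only reports a document count for the remaining buckets (sum_other_doc_count), so an average, sum, or any other metric over the "other" documents has to come from a second request like this one.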

What documents will be found

As to whether we would expect your doc 2 to be filtered out of the Other Bucket: currently yes, that's expected behavior. To be honest, the way array values work in Elasticsearch might not always be what you expect, depending on your use case; sometimes it might be exactly what you want. So we follow the "default" ES behavior of filtering out those values, also because I don't currently see any other proper solution (one that would work with every metric).
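The mismatch described in this thread can be reproduced without a cluster. The following is a small in-memory simulation in Python, not Elasticsearch itself: `terms_agg` and `must_not_terms` are hypothetical helpers that mimic how a terms aggregation counts array values per document and how a must_not filter drops every document containing an excluded value.

```python
from collections import Counter

# The two documents from the original question.
docs = {
    "1": {"filterNames": ["filter1"]},
    "2": {"filterNames": ["filter1", "filter2"]},
}


def terms_agg(docs, field, size):
    """Count each distinct value once per document, return the top `size`
    buckets and the summed doc counts of the buckets left out."""
    counts = Counter()
    for doc in docs.values():
        for value in set(doc.get(field, [])):
            counts[value] += 1
    # Order by doc count descending, then by key ascending.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    sum_other_doc_count = sum(count for _, count in ranked[size:])
    return ranked[:size], sum_other_doc_count


def must_not_terms(docs, field, excluded):
    """Keep only documents that contain none of the excluded values."""
    excluded = set(excluded)
    return {
        doc_id: doc
        for doc_id, doc in docs.items()
        if not excluded & set(doc.get(field, []))
    }


top, sum_other_doc_count = terms_agg(docs, "filterNames", size=1)
other_docs = must_not_terms(docs, "filterNames", [key for key, _ in top])

print(top)                  # [('filter1', 2)]
print(sum_other_doc_count)  # 1
print(len(other_docs))      # 0
```

Doc 2 contributes to the "filter2" bucket, so sum_other_doc_count is 1, but it also contains "filter1", so the must_not filter removes it and the "other" set is empty.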

Cheers,
Tim

jetnet · November 6, 2018, 8:19am

hello Tim,

thank you so much, that makes sense.

system · December 4, 2018, 8:19am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
