Restrict buckets on sub-aggregate counts?

photonic_world_2 · June 15, 2016, 12:19am

Hello,

Here is a query which I am trying:

I have a bunch of users as a string array in each of my document.

{
"users":["user1", "user2"]
"access":"permitted"
}

document 2

{
"users":["user1", "user3"]
"access":"blocked"
}

document 3

{
"users":["user5", "user6"]
"access":"blocked"
}

I want to see if its possible to aggregate by users while getting counts for say 'permitted' users only
so the response would be something like:

{
"aggregations":{
"buckets": [
{
"key": "user1",
"doc_count": 2,
"access": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "permitted",
"doc_count": 1
},
{
"key": "blocked",
"doc_count": 1
}
]
}
}
}
}

Query like below could help but global scope can not be set on the sub aggregates

{
"size": 0,
"query": {
"constant_score": {
"filter": {
"term": {
"access": "permitted"
}
}
}
},
"aggregations": {
"by_users": {
"terms": {
"field": "users",
"size": 10,
"shard_size": 0
},
"aggs": {
"access": {
"global": {},
"aggs":{
"by_access":{
"terms": {
"field": "access"
}
}
}
}
}
}
}
}

I can get this with two queries one to fetch all the unique users and then fetch aggregate by access over those users, the second query would be huge as number of users there would be huge (say in thousands and millions)

Is there a better way to do this?

Mark_Harwood · June 15, 2016, 11:17am

Your scenario ticks all the boxes for creating a "bucket explosion" problem [1].
Consider creating a user-centric index for this type of analysis.

[1] https://speakerdeck.com/elastic/building-entity-centric-indexes?slide=5

Topic		Replies	Views
Restricting Bucket Aggregation to Certain Values Elasticsearch	3	748	July 6, 2017
Take top n Documents per buckets and do further sub-aggregation Elasticsearch	1	372	July 6, 2017
Get sum of record count for inner bucket key in two level term aggregation Elasticsearch	1	183	March 27, 2023
Constrain Terms Aggregation by Doc Count Range Elasticsearch	2	1574	April 13, 2017
Questions about terms aggregations and too many buckets exception Elasticsearch	1	322	January 25, 2023

Restrict buckets on sub-aggregate counts?

Related topics