Filter by histogram bucket key

tomisu · September 21, 2017, 1:39pm

Hello, everyone!

Setup

Elasticsearch 5.4

I have two document types, Contact and Message, both with a created_on field.
Message's parent is Contact.

Goal

I'm trying to build a date histogram with 2 values nested in each bucket: the total number of active users, and how many of them are returning users.

An active user would be a user with a message in the interval.
A returning user would be a user with a message in the interval, and a message before the interval.
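
To make the two definitions concrete, here is a minimal sketch in plain Python, independent of Elasticsearch. It assumes daily intervals and a list of (contact_id, day) pairs, one per message; the function name and the sample data are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def active_and_returning(messages):
    """Count active and returning contacts per day.

    messages: iterable of (contact_id, day) pairs, one pair per message.
    Returns {day: (active_count, returning_count)}.
    """
    first_seen = {}             # earliest message day per contact
    by_day = defaultdict(set)   # contacts with at least one message on a day
    for contact_id, day in messages:
        by_day[day].add(contact_id)
        if contact_id not in first_seen or day < first_seen[contact_id]:
            first_seen[contact_id] = day

    result = {}
    for day in sorted(by_day):
        active = by_day[day]
        # Returning: active on this day AND has a message on an earlier day.
        returning = {c for c in active if first_seen[c] < day}
        result[day] = (len(active), len(returning))
    return result


# Hypothetical sample data: "a" writes on both days, "b" and "c" on one day each.
sample = [
    ("a", date(2017, 9, 20)),
    ("b", date(2017, 9, 20)),
    ("a", date(2017, 9, 21)),
    ("c", date(2017, 9, 21)),
]
counts = active_and_returning(sample)
```

With this sample, both days have two active contacts, and only the second day has a returning one ("a").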

Problem

I'd like to add a filter bucket to each of the histogram buckets, but that filter would need to access the interval's key (the date).

I can't seem to get this right. I've tried using scripts, but I can't access the key.

I've also tried building the intervals manually, but that doesn't seem to be a very elegant or efficient solution.
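
For reference, the manual approach looks roughly like the sketch below: one filter aggregation per day, generated client-side, so the interval start is known when the query is built. It assumes the query runs against Contact, that created_on on Message is a date field accepting ISO dates, and that the child type is named message; the helper name manual_interval_aggs is hypothetical.

```python
from datetime import date, timedelta


def manual_interval_aggs(start, end, child_type="message", field="created_on"):
    """Build a search body with one filter aggregation per day in [start, end)."""
    aggs = {}
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        # Contacts with at least one child message inside the interval.
        in_interval = {
            "has_child": {
                "type": child_type,
                "query": {"range": {field: {
                    "gte": day.isoformat(),
                    "lt": next_day.isoformat(),
                }}},
            }
        }
        # Contacts with at least one child message before the interval start.
        before_interval = {
            "has_child": {
                "type": child_type,
                "query": {"range": {field: {"lt": day.isoformat()}}},
            }
        }
        aggs[day.isoformat()] = {
            "filter": in_interval,  # doc_count = active contacts
            "aggs": {"returning": {"filter": before_interval}},
        }
        day = next_day
    return {"size": 0, "aggs": aggs}


body = manual_interval_aggs(date(2017, 9, 20), date(2017, 9, 22))
```

Each top-level bucket's doc_count would be the active contacts for that day, and the nested returning bucket the ones that also have an earlier message. The body grows by two has_child queries per interval, which is why this does not scale well to long ranges.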

_______________________________

Any suggestions?

Thank you in advance!

thiago · September 23, 2017, 3:50am

You should be able to solve this problem with a Filter Aggregation.

jminuscula · September 25, 2017, 10:28am

Thanks for the suggestion @thiago, but how would you reference the bucket key inside the nested filter aggregation? That's the key piece we're missing.

Cheers!

thiago · September 25, 2017, 6:29pm

You don't apply the filter to the bucket keys. Instead, you apply a range filter aggregation (nested inside the date histogram) to the timestamp field itself, and the resulting date histogram will include only the relevant dates.

tomisu · September 26, 2017, 9:43am

Thanks for your response!

However, I still don't understand how to do this. I can nest a range filter aggregation inside the date histogram, yes, but how can I get the start time of each histogram bucket?

Maybe an example will help. This is a query on Contact. Inside returning, I'd like to keep only the contacts in each interval that have a message before the interval.

How could I fill in lte so that it references the interval start?

'aggs': {
    'by_interval': {
        'date_histogram': {
            'time_zone': 'UTC',
            'field': 'created_on',
            'interval': 'day',
            'min_doc_count': 0,
            'extended_bounds': {
                'min': 1505858400,
                'max': 1506410687
            }
        },
        'aggs': {
            'returning': {
                'filter': {
                    'has_child': {
                        'query': {
                            'range': {
                                'created_on': {
                                    'lte': ???
                                }
                            }
                        },
                        'type': 'message'
                    }
                }
            },
        }
    }
}

Thank you very much for your time!

system · October 24, 2017, 9:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
