Filter by histogram bucket key


(Tomás Tarragón) #1

Hello, everyone!

Setup

Elasticsearch 5.4

I have two document types, Contact and Message, both with a created_on field.
Contact is the parent of Message.

Goal

I'm trying to build a date histogram with 2 values nested in each bucket: the total number of active users, and how many of them are returning users.

  • An active user would be a user with a message in the interval.
  • A returning user would be a user with a message in the interval, and a message before the interval.
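
To make the two definitions concrete, here is a minimal sketch in plain Python, outside Elasticsearch. It assumes messages are available as (user_id, day) pairs and that the interval is one day; the function name and the sample data are hypothetical and only illustrate the counting rule.

```python
from collections import defaultdict
from datetime import date


def active_and_returning(messages):
    """Map each day to (active_users, returning_users).

    messages: iterable of (user_id, day) pairs, one per message.
    """
    # Collect the distinct days on which each user sent a message.
    days_by_user = defaultdict(set)
    for user_id, day in messages:
        days_by_user[user_id].add(day)

    counts = defaultdict(lambda: [0, 0])
    for days in days_by_user.values():
        first_day = min(days)
        for day in days:
            counts[day][0] += 1          # active: has a message in the interval
            if first_day < day:
                counts[day][1] += 1      # returning: also has an earlier message
    return {day: tuple(pair) for day, pair in sorted(counts.items())}


# Hypothetical sample data.
sample = [
    ("ana", date(2017, 9, 20)),
    ("ana", date(2017, 9, 22)),
    ("bob", date(2017, 9, 22)),
]
result = active_and_returning(sample)
```

With this sample, 2017-09-20 has one active user and no returning users, while 2017-09-22 has two active users, one of whom (ana) is returning.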

Problem

I'd like to add a filter bucket to each of the histogram buckets, but that filter would need to access the interval's key (the date).

I can't seem to get this right. I've tried using scripts, but I can't access the key.

I've also tried to build the intervals manually, but that doesn't seem to be a very elegant or efficient solution.
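
For illustration, a manual approach could look roughly like the sketch below. It is an assumption about what "manually" means here, not the exact query that was tried: the client generates one filter aggregation per interval on Contact, so the interval start is known when the request is built. The helper names and the aggregation names are hypothetical; the has_child and range clauses mirror the parent/child setup described above, and dates are sent as ISO strings.

```python
from datetime import datetime, timedelta


def has_message(range_clause):
    """has_child query matching contacts with a message in the given range."""
    return {
        "has_child": {
            "type": "message",
            "query": {"range": {"created_on": range_clause}},
        }
    }


def manual_interval_aggs(start, end, step=timedelta(days=1)):
    """Build one filter aggregation per interval in [start, end)."""
    aggs = {}
    cursor = start
    while cursor < end:
        upper = min(cursor + step, end)
        name = "interval_" + cursor.strftime("%Y%m%dT%H%M%S")
        aggs[name] = {
            # Active: contacts with a message inside the interval.
            "filter": has_message(
                {"gte": cursor.isoformat(), "lt": upper.isoformat()}
            ),
            "aggs": {
                # Returning: of those, contacts with a message before it.
                "returning": {
                    "filter": has_message({"lt": cursor.isoformat()})
                }
            },
        }
        cursor = upper
    return aggs


aggs = manual_interval_aggs(datetime(2017, 9, 20), datetime(2017, 9, 23))
```

The request body grows linearly with the number of intervals, which is the main reason this approach feels inelegant for long date ranges.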

_______________________________

Any suggestions?

Thank you in advance!


(Thiago Souza) #2

You should be able to solve this problem with a Filter Aggregation.


(Jacobo) #3

Thanks for the suggestion @thiago, but how would you reference the bucket key inside the nested filter aggregation? That's the key piece we're missing.

Cheers!


(Thiago Souza) #4

You don't apply the filter to the bucket keys. Instead, you nest a range filter aggregation inside the date histogram and apply it to the timestamp field itself; the resulting date histogram will then include only the relevant dates.


(Tomás Tarragón) #5

Thanks for your response!

However, I still don't understand how to do this. I can nest a range filter aggregation inside the date histogram, yes, but how can I get the start time of each histogram bucket?

Maybe an example can help. This is a query for Contact. Inside returning, I'd like to filter the contacts in each interval that have a message before the interval.

How could I fill in lte so that it references the interval start?

'aggs': {
    'by_interval': {
        'date_histogram': {
            'time_zone': 'UTC',
            'field': 'created_on',
            'interval': 'day',
            'min_doc_count': 0,
            'extended_bounds': {
                'min': 1505858400,
                'max': 1506410687
            }
        },
        'aggs': {
            'returning': {
                'filter': {
                    'has_child': {
                        'query': {
                            'range': {
                                'created_on': {
                                    'lte': ???
                                }
                            }
                        },
                        'type': 'message'
                    }
                }
            },
        }
    }
}

Thank you very much for your time!

