Aggregation with filters causing OutOfMemoryError

kufi · July 1, 2015, 2:57pm

Hello

We currently have a problem with one of our queries, but only in specific circumstances. The query runs without problems as long as the number of filters is not too large. This is fine for most of our queries, but one specific query generates around 3000 filter buckets, which causes an OutOfMemoryError. Lower bucket counts are fine and also perform fast enough on the number of documents we currently store.

Is there a way to prevent this?

The query looks like this:

{
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {
                        "nested": {
                            "path": "tags",
                            "filter": {
                                ...
                            }
                        }
                    },
                    {
                        "term": {
                            "field": value
                        }
                    }
                ]
            }
        }
    },
    "aggs": {
        "distribution": {
            "filters": {
                "filters": {
                    "filter_one": {
                        "nested": {
                            "path": "tags",
                            "filter": {
                                ...
                            }
                        }
                    },
                    ... more filters here, same structure as above; if too many filter buckets are present, the query crashes
                }
            },
            "aggs": {
                "timeline": {
                    "nested": {
                        "path": "nested"
                    },
                    "aggs": {
                        "timeline_filter": {
                            "filter": {
                                ...
                            },
                            "aggs": {
                                "timeline_histogram": {
                                    "date_histogram": {
                                        "field": "nested.created_at",
                                        "interval": "day"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

warkolm · July 2, 2015, 2:45am

Adding more heap or reducing your bucket count is about it, unfortunately.

kufi · July 2, 2015, 1:28pm

Ok. Good to know, because I ran out of ideas besides the "more power" route.

I'm thinking about splitting the buckets up into multiple requests and patching the results together afterwards, as the end results themselves are quite easy to combine.
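Roughly what I have in mind, as a minimal Python sketch rather than a finished implementation. The helper names (`chunk_filters`, `build_body`, `run_in_batches`), the `run_search` callable and the batch size of 200 are all assumptions for illustration; `run_search` stands in for whatever client call sends the request body and returns the parsed JSON response. It relies on the named `filters` aggregation returning its buckets as an object keyed by filter name, so the per-request results can simply be merged into one dict.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterator

Json = Dict[str, Any]


def chunk_filters(filters: Json, size: int) -> Iterator[Json]:
    """Yield successive dicts holding at most `size` named filters each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    items = iter(filters.items())
    while True:
        batch = dict(islice(items, size))
        if not batch:
            return
        yield batch


def build_body(query: Json, batch: Json, sub_aggs: Json) -> Json:
    """Build one request body that aggregates only the filters in `batch`."""
    return {
        "query": query,
        "aggs": {
            "distribution": {
                "filters": {"filters": batch},
                "aggs": sub_aggs,
            }
        },
    }


def run_in_batches(
    query: Json,
    filters: Json,
    sub_aggs: Json,
    run_search: Callable[[Json], Json],  # hypothetical: sends the body, returns parsed JSON
    batch_size: int = 200,               # assumed value, tune against available heap
) -> Json:
    """Run one request per batch and merge the named buckets into one dict."""
    merged: Json = {}
    for batch in chunk_filters(filters, batch_size):
        response = run_search(build_body(query, batch, sub_aggs))
        # Filter names are unique across batches, so a plain update is enough.
        merged.update(response["aggregations"]["distribution"]["buckets"])
    return merged
```

Each request then only has to hold the buckets for its own batch of filters in memory, at the cost of running the main query once per batch.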

warkolm · July 2, 2015, 9:32pm

That's definitely another viable option if you can.
