Possibility to not output intermediate aggregations


(Tobsucht) #1

Hi,

Is there any way to NOT output the result of an intermediate aggregation? Outputting it seems to take a lot of time, depending on the size of the result set.

What I want to achieve is the following: I have a mapping for items with a nested field for prices, since one item can have multiple prices (every price has a catalogid, which a given user may or may not be allowed to access). When a user searches for items, I want to aggregate the min/max price of the result set to provide a price filter.
So what I do is: aggregate by itemid -> filter out inaccessible prices -> get the lowest price per item -> get the lowest/highest of all aggregated min prices

AggregationBuilders.terms("groupById")
    .field("itemId")
    .size(100000)
    .subAggregation(AggregationBuilders.nested("nestedPrices", "prices")
        .subAggregation(AggregationBuilders.filter("filteredPrices", buildCatalogFilter())
            .subAggregation(AggregationBuilders.min("minPrice")
                .field("prices.price"))));

and

new MinBucketPipelineAggregationBuilder("minPriceOfAllItems",
    "groupById>nestedPrices>filteredPrices>minPrice")
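To make the intended result concrete, the aggregation chain above can be mimicked in plain Java over in-memory data. This is only a sketch of the logic, not of how Elasticsearch executes it: the `Price` class, the catalog ids and the sample values are hypothetical, and `minPriceRange` is an illustrative helper, not part of any Elasticsearch API.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.Set;

public class PriceRangeSketch {

    // Hypothetical stand-in for one entry of the nested "prices" field.
    static final class Price {
        final String catalogId;
        final double price;

        Price(String catalogId, double price) {
            this.catalogId = catalogId;
            this.price = price;
        }
    }

    // For each item: keep only accessible prices and take the minimum
    // (groupById > nestedPrices > filteredPrices > minPrice), then collect
    // the lowest and highest of those per-item minimums.
    static DoubleSummaryStatistics minPriceRange(Map<String, List<Price>> pricesByItemId,
                                                 Set<String> accessibleCatalogs) {
        return pricesByItemId.values().stream()
                .map(prices -> prices.stream()
                        .filter(p -> accessibleCatalogs.contains(p.catalogId))
                        .mapToDouble(p -> p.price)
                        .min())
                // Items without any accessible price contribute nothing.
                .filter(OptionalDouble::isPresent)
                .mapToDouble(OptionalDouble::getAsDouble)
                .summaryStatistics();
    }

    public static void main(String[] args) {
        Map<String, List<Price>> items = Map.of(
                "item-1", List.of(new Price("catA", 10.0), new Price("catB", 7.5)),
                "item-2", List.of(new Price("catA", 20.0), new Price("catC", 3.0)),
                "item-3", List.of(new Price("catC", 1.0)));

        DoubleSummaryStatistics range = minPriceRange(items, Set.of("catA", "catB"));
        System.out.println(range.getMin() + " .. " + range.getMax()); // prints "7.5 .. 20.0"
    }
}
```

Note that the lowest of the per-item minimums is, by construction, the same as the lowest accessible price overall; only the highest of the per-item minimums actually depends on the per-item grouping.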

As a search request may return more than just a few items, the "groupById" aggregation can get pretty big, and I don't need its intermediate result anyway. Is there a way to NOT output the result of such an intermediate aggregation? Or is there a better way of aggregating the needed information?

Update: as the aggregations are slow for small sizes ... is there a better way to aggregate the data?


(Zachary Tong) #2

You can use filter_path to hide the output you don't want to see: https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#common-options-response-filtering

It'll still be computed; it's just filtered out before being sent over the wire. So there may be a negligibly faster response due to less wire transfer, but it's mostly there for convenience.
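As a rough illustration, filter_path is passed as a query parameter on the search request. The sketch below only builds the request URI with the JDK; the host `http://localhost:9200` and the index name `items` are assumptions, and `searchUri` is a hypothetical helper. The path `aggregations.minPriceOfAllItems` assumes the pipeline aggregation from the question sits at the top level of the response's aggregations.

```java
import java.net.URI;

public class FilterPathSketch {

    // Builds a _search URI that asks Elasticsearch to return only the
    // listed response paths (comma-separated in filter_path).
    static URI searchUri(String baseUrl, String index, String... filterPaths) {
        return URI.create(baseUrl + "/" + index + "/_search?filter_path="
                + String.join(",", filterPaths));
    }

    public static void main(String[] args) {
        // Assumed host and index name; keeps only the pipeline aggregation result.
        URI uri = searchUri("http://localhost:9200", "items",
                "aggregations.minPriceOfAllItems");
        System.out.println(uri);
        // prints "http://localhost:9200/items/_search?filter_path=aggregations.minPriceOfAllItems"
    }
}
```

With this parameter the "groupById" buckets would still be computed on the server, but left out of the response body.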

Small sizes as in small matching sets, or a small index? Aggregation speed is generally determined by the number of documents that must be evaluated, regardless of how many actually end up in the result set. So if you have to evaluate the entire index, it'll be slower than if you can add some kind of filtering criteria.

The Profile API can help identify what part of your aggregation is being slow too: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-profile.html
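Profiling is switched on by setting `"profile": true` in the search request body. The following is a minimal sketch that wraps an aggregation definition into such a body as a plain JSON string; `profiledBody` is a hypothetical helper, the sample aggregation is trimmed down from the question, and `"size": 0` (skip returning hits) is an added assumption, not something the thread prescribes.

```java
public class ProfileRequestSketch {

    // Wraps an aggregations JSON object into a search body with profiling enabled.
    // "size": 0 is an assumption: it omits search hits so only aggregations return.
    static String profiledBody(String aggsJson) {
        return "{\"profile\": true, \"size\": 0, \"aggs\": " + aggsJson + "}";
    }

    public static void main(String[] args) {
        String aggs = "{\"groupById\": {\"terms\": {\"field\": \"itemId\"}}}";
        System.out.println(profiledBody(aggs));
    }
}
```

The response should then carry a profile section with per-shard timings, which can show which aggregation in the chain takes the most time.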

