Elasticsearch-5: Aggregation based on query without per-doc results?


(James Green) #1

We have an application that is essentially a reporting tool. It is interested in aggregate data, not individual documents.

The queries involved are along the lines of:

{body: { query: { bool (...) }, size: 1, aggregations: { name: { (...), size: 2000}}}}

Two questions:

  1. I've been unable to tell the server that I don't want any non-aggregate documents to come back so I have to accept one back (hence size:1). As I am only interested in the aggregation results, this feels like a hack - is there a better way of limiting the scope of my aggregation search and thus avoiding having matched documents returned?

  2. My application expects the aggregation results to be returned in full. In our case it's unlikely to result in more than 2,000 documents, so I've set this as the size; however, this too feels like a hack. Is there a better way?

Thanks,
James


(Christoph) #2

This is odd. What's wrong with omitting the "query" part and setting the size to 0 if you don't want any search hits? I might be misunderstanding the question, though.

What kind of aggregation result is this? As far as I know, the size parameters in aggregations are specific to the kind of aggregation you are using (e.g. terms).


(James Green) #3

The query limits the aggregation to the documents of interest (a where clause, if you will). For instance, those within a particular time range.

Using size: 0 results in an error that the value must be a positive integer.

I have a date_histogram and aggregations including terms, cardinality and sum. The terms ones are bounded with a size after we discovered that without this parameter only the first ten results would be returned.


(Christoph) #4

That's the odd part. It shouldn't do that. You are essentially scoping your aggregation as described here. Setting a size of 0 to omit the search results while still having a query to scope the aggregation is the normal thing to do here. Please check those examples; maybe you can spot the difference.
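For illustration, here is a minimal Python sketch of the kind of request body described above: a top-level size of 0 together with a query that scopes the aggregation. It only builds and prints the JSON body; it does not contact a cluster. The field name `timestamp`, the aggregation name `per_day`, the date values, and the helper `build_report_request` are all made up for the example, and the `interval` parameter assumes the 5.x date_histogram syntax.

```python
import json


def build_report_request(start, end, time_field="timestamp"):
    """Build a search body that asks for aggregations only (no hits).

    `time_field`, `per_day` and the date range are hypothetical names
    and values chosen for this sketch, not taken from the thread.
    """
    return {
        # Top-level size 0: return no search hits, only aggregations.
        "size": 0,
        # The query acts as the "where clause" that scopes the aggregation.
        "query": {
            "bool": {
                "filter": [
                    {"range": {time_field: {"gte": start, "lt": end}}}
                ]
            }
        },
        "aggregations": {
            "per_day": {
                # "interval" is assumed here as the 5.x parameter name.
                "date_histogram": {"field": time_field, "interval": "day"}
            }
        },
    }


body = build_report_request("2017-01-01", "2017-02-01")
print(json.dumps(body, indent=2))
```

Note that the size that controls search hits sits at the top level of the body, next to `query`, and is separate from any size inside an individual aggregation.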

Yes, you explicitly need to set the size for the terms aggregation, since it also affects the way the aggregation is computed, as explained here.
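As a rough sketch of the terms size point, the Python below builds a terms aggregation with an explicit size and checks a response for signs that the size was too small. It relies on the `sum_other_doc_count` field that a terms aggregation response carries, which counts documents that fell outside the returned buckets. The helper names `terms_agg` and `is_truncated`, the field name `customer`, and the sample response are invented for the example.

```python
def terms_agg(field, size=2000):
    """Return a terms aggregation clause with an explicit bucket limit.

    The default of 2000 mirrors the value used in the original post;
    it is not a recommended setting.
    """
    return {"terms": {"field": field, "size": size}}


def is_truncated(terms_result):
    """Report whether documents fell outside the returned buckets.

    A non-zero sum_other_doc_count means some documents belong to
    terms that did not make it into the bucket list.
    """
    return terms_result.get("sum_other_doc_count", 0) > 0


# Hand-written sample shaped like a terms aggregation result
# (the values are made up for this sketch).
sample_result = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 12,
    "buckets": [
        {"key": "a", "doc_count": 40},
        {"key": "b", "doc_count": 30},
    ],
}

clause = terms_agg("customer")
truncated = is_truncated(sample_result)
print(clause, truncated)
```

A check like this lets a reporting application notice when its chosen size is no longer large enough, instead of silently receiving a partial list.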

