Terms Agg

I set size and shard_size to 4200, but I was still not getting the intended results. So I lowered both size and shard_size to 500 to see whether that made any difference, but it did not.

Sorry for the confusion.

What did you get? Were there any failures reported in the response status or the JSON body?

I was not getting empty buckets for the terms that have no data. There was no error, though.

So you should have had exactly 4,200 term buckets under each date bucket. How many buckets did you get?
Was this perhaps because some of the date buckets had exactly zero hits, in which case it didn't descend into the child terms aggregation?
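
To answer both questions at once, a small client-side check can list the hit count and the number of term buckets for each date bucket. This is only a sketch: the aggregation names `by_date` and `by_term` and the sample response are made up for illustration, so substitute the names used in your own request.

```python
def term_bucket_counts(response, date_agg="by_date", term_agg="by_term"):
    """Map each date bucket to (doc_count, number of child term buckets).

    date_agg and term_agg are hypothetical aggregation names.
    """
    counts = {}
    for date_bucket in response["aggregations"][date_agg]["buckets"]:
        # Prefer the formatted date if the response carries one.
        key = date_bucket.get("key_as_string", date_bucket["key"])
        term_buckets = date_bucket.get(term_agg, {}).get("buckets", [])
        counts[key] = (date_bucket["doc_count"], len(term_buckets))
    return counts


# Made-up response fragment: the second date bucket has zero hits.
sample_response = {
    "aggregations": {
        "by_date": {
            "buckets": [
                {
                    "key_as_string": "2020-01-01",
                    "key": 1577836800000,
                    "doc_count": 12,
                    "by_term": {
                        "buckets": [
                            {"key": "a", "doc_count": 7},
                            {"key": "b", "doc_count": 5},
                        ]
                    },
                },
                {
                    "key_as_string": "2020-01-02",
                    "key": 1577923200000,
                    "doc_count": 0,
                    "by_term": {"buckets": []},
                },
            ]
        }
    }
}

counts = term_bucket_counts(sample_response)
for date_key, (hits, n_terms) in counts.items():
    print(date_key, "hits:", hits, "term buckets:", n_terms)
```

If the dates with fewer term buckets are also the dates with zero hits, that would point to the date bucket being empty rather than to size or shard_size.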

That's a good point; I will check from that perspective.

One more point to note: for the timeframe I am searching, only 6 terms would be returned. In my response, some dates have 4 term buckets, some have 5 and some have 6.

One possible factor is that this optimisation avoids even running the query on some shards if they lie outside of the time range for the query.
This would mean that terms that only exist on excluded shards would not appear in results.

If this is the reason behind the absence, I don't suggest you trawl indices outside of your date range just to come up with the missing values. That would be very inefficient compared to an approach where your client deduces the missing terms with some custom logic.

If terms outside the time range are not returned, that would be fine. But in my case, within a given time range, one timestamp has 5 buckets, another has 6 and another has 4. I would expect every timestamp in the time range to have the same number of buckets. Is my understanding right?

Not if the buckets are taken from different (time-based) indices. Each index could have a different set of terms.
