Hi,
So I have a very complicated query that involves:
- terms filters
- custom function scores
- minimum score
- parent-child setup with a has_parent query
I then ran a significant terms aggregation on this query, using the chi_square heuristic with a custom background filter and with negatives excluded (include_negatives set to false).
I ran this twice. The first time, I set the size request parameter to 0. Note that this is the size parameter at the same level as the query and aggs, NOT the size parameter on the aggregation itself. The second time, I did not set this size request parameter at all.
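In case it helps, here is a rough sketch of the two request bodies, written as Python dicts. The field names, values, parent type, and the aggregation name `sig` are placeholders rather than my real mapping, and the nesting of the pieces is only an approximation of the actual query. The point is that the two bodies differ only in the top-level size.

```python
def build_request(top_level_size=None):
    """Build a search body; top_level_size=None leaves the request-level size unset."""
    body = {
        "min_score": 0.5,  # placeholder threshold
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "filter": [
                            # placeholder terms filter
                            {"terms": {"category": ["a", "b"]}},
                            # placeholder parent-child clause
                            {
                                "has_parent": {
                                    "parent_type": "parent_doc",
                                    "query": {"match_all": {}},
                                }
                            },
                        ]
                    }
                },
                # placeholder scoring function
                "functions": [{"field_value_factor": {"field": "popularity"}}],
            }
        },
        "aggs": {
            "sig": {
                "significant_terms": {
                    "field": "tags",  # placeholder field
                    "chi_square": {"include_negatives": False},
                    "background_filter": {"term": {"lang": "en"}},
                }
            }
        },
    }
    if top_level_size is not None:
        # This is the request-level size, not the size inside the aggregation.
        body["size"] = top_level_size
    return body


run_with_size_zero = build_request(top_level_size=0)
run_without_size = build_request()
```

Apart from the `size` key, the two dicts are identical.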
Comparing the two runs, I got significantly different significant terms, even though I verified that the number of documents matching the query was the same both times.
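For anyone who wants to reproduce the comparison, a minimal sketch of how the two responses could be diffed is below. It assumes each response is the parsed JSON of a search call and that the aggregation is named `sig` (a hypothetical name); it relies only on the `key` and `score` fields of each bucket.

```python
def diff_significant_terms(resp_a, resp_b, agg_name="sig"):
    """Return terms unique to each response, plus score pairs for shared terms that differ."""

    def scores(resp):
        buckets = resp["aggregations"][agg_name]["buckets"]
        return {bucket["key"]: bucket["score"] for bucket in buckets}

    a, b = scores(resp_a), scores(resp_b)
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    changed = {term: (a[term], b[term]) for term in set(a) & set(b) if a[term] != b[term]}
    return only_in_a, only_in_b, changed
```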
Is there a reason why the significant terms aggregation results depend so heavily on this top-level size parameter? I was under the impression that it only affects the number of hits returned for matching docs and does not affect the results of the aggregation. I only notice this significant difference on the complicated query described above; I generally don't see a difference on a simpler query. Also, if the size parameter were somehow limiting the document set for the aggregations, I would have expected setting size to 0 to produce no significant terms at all, but that's not the case either. I'm running this on a 1 shard/1 replica index.