I want to understand more about how compound queries perform. To help frame the problem, my index has 100M records. We have nearly 1M users, who on average have access to about 100 records each. So only about 0.0001% of the data is accessible to any given user.
I'm using the filter portion of the bool compound query to filter out all the records a user doesn't have access to, so this filter reduces the set of candidate documents by several orders of magnitude. What I'm curious about is whether, in the must_not portions of the bool query, I can use queries that would normally be considered suboptimal (such as wildcard).
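For concreteness, here is a minimal sketch of the query shape I mean, built as a Python dict. The field names `allowed_users` and `title`, the user id, and the wildcard pattern are all made up for illustration and are not my real mapping.

```python
import json


def build_query(user_id, pattern):
    """Build a bool query body: a cheap access filter plus an expensive wildcard.

    `allowed_users` and `title` are hypothetical field names.
    """
    return {
        "query": {
            "bool": {
                # Narrows 100M records down to the ~100 this user can see.
                "filter": [{"term": {"allowed_users": user_id}}],
                # The normally expensive clause I am asking about.
                "must_not": [{"wildcard": {"title": {"value": pattern}}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query("user-42", "*draft*"), indent=2))
```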
I think it boils down to a simple question: do all parts of the compound query look at the entire indexed field(s) and then intersect the results? Or is there some optimization that comes from first filtering down the dataset and then applying the less optimal queries only to what remains?
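To make the two alternatives explicit, here is a toy sketch in plain Python. It is only a model of the question, not a claim about how Elasticsearch or Lucene actually executes queries; the data and function names are invented. Each function returns the matching ids together with the number of documents the wildcard had to be tested against.

```python
import fnmatch

# Toy data: document id -> (owner, title). Entirely made up for illustration.
DOCS = {
    1: ("alice", "quarterly report"),
    2: ("bob", "quarterly review"),
    3: ("alice", "meeting notes"),
    4: ("carol", "annual report"),
}


def intersect_everything(user, pattern):
    """Alternative A: run every clause over all documents, then intersect."""
    allowed = {i for i, (owner, _) in DOCS.items() if owner == user}
    matching = {
        i for i, (_, title) in DOCS.items() if fnmatch.fnmatch(title, pattern)
    }
    return allowed & matching, len(DOCS)


def filter_first(user, pattern):
    """Alternative B: apply the cheap filter, then test only the survivors."""
    allowed = [i for i, (owner, _) in DOCS.items() if owner == user]
    hits = {i for i in allowed if fnmatch.fnmatch(DOCS[i][1], pattern)}
    return hits, len(allowed)


if __name__ == "__main__":
    print(intersect_everything("alice", "*report*"))
    print(filter_first("alice", "*report*"))
```

Both functions return the same ids; the difference is only in how many documents the wildcard is checked against (all four in the first case, two in the second). That difference is what I am asking about at the scale of 100M records versus roughly 100.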