I have a large (tens of billions of docs) index spread across a bunch of machines. I'd like to constrain a query to a subset of that index. Example: each doc might be a message with an author, and I want to constrain a search to messages written by a subset of all authors.
The catch is that this subset may still be a million people, so a boolean query with one clause per author clearly isn't going to work.
I'm wondering if I can do this with a custom plugin. I could probably do something with a custom scorer, but that doesn't feel very efficient. I'd need to keep a cache of docId -> authorId in memory for this to work at a reasonable speed.
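To make the idea concrete, here is a rough sketch of the lookup I have in mind. It's plain Python rather than any real plugin API, and every name in it (doc_to_author, filter_hits, the sample ids) is made up for illustration. The assumption is that docIds are dense integers, so the cache can be a flat array indexed by docId, and that the allowed authors fit in an in-memory set.

```python
from array import array

# Hypothetical in-memory cache: index = docId, value = authorId.
# Assumes dense integer docIds; a real plugin would build this per shard.
doc_to_author = array("q", [7, 3, 7, 9, 3, 1])


def filter_hits(hits, allowed_authors):
    """Keep only the (doc_id, score) hits whose author is in the allowed set."""
    return [
        (doc_id, score)
        for doc_id, score in hits
        if doc_to_author[doc_id] in allowed_authors
    ]


# The allowed set could hold around a million ids in practice;
# membership checks on a set stay O(1) on average.
allowed = frozenset({3, 9})
hits = [(0, 1.2), (1, 0.9), (3, 0.4), (5, 2.0)]
print(filter_hits(hits, allowed))  # [(1, 0.9), (3, 0.4)]
```

The per-hit cost is one array read plus one set lookup, which is why the docId -> authorId cache matters: without it, each candidate doc would need its author field fetched before it could be checked.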
Am I missing something obvious?