Client side join over multiple fields


(Nathan Gass) #1

I'm trying to simulate something similar to a sql join with multiple fields on the client side. So I have a first query, which has about 1000 results. from this I generate 1000 bool filters which each only match 1-2 documents. I want to do some aggregations over the 1k-2k matches of this large filter in a second query.

This seems to be quite slow considering the small amount of matched documents. So I have the following questions:

I'm assuming that the term filters at the lowest level are calculated for all documents, even though the must combination on the next level means that most of them will not match. Is this correct? Is there any possibility to speedup matching a large amount of complex filters which only a few matching documents each?

A solution with reindexing would be to generate a hash over all the matching fields, so I can use single terms filter with a large amount of terms to achieve the same result. Is this the standard way to approach this problem? Will this help performance?

Another idea is to filter for a correct value in each field separately with a large terms filter, and then have a script filter test for the exact combination of values. Would this be feasible? Can I pass a map of correct combinations to the script as parameter?


(system) #2