We are building an application that uses Elasticsearch to search raw data, but then needs to pass the resulting set of ids back to SQL in order to run an aggregation query. This is only feasible with a small result set, as we wouldn't want a million-item array in our SQL query.
I thought it might be possible to derive a "reduced query" from the ES result set instead. Conceptually, here is an example:
Let's say we search "Terminator" in a financials database that has 10B records. It has the following matches:
- "Terminator" (1M results)
- "Terminator 2" (10M results)
- "Terminators" (18 results)
- "XJ4-227" (1 result; here "Terminator" appears in the synopsis of the title, not in its name)
Instead of passing back the 11M+ ids, we'd pass back the following 'reduced query':
...WHERE name IN ('Terminator', 'Terminator 2', 'Terminators', 'XJ4-227')
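To make the idea concrete, here is a minimal sketch of what such a reducer might look like, assuming it is built on an ES terms aggregation over the name field. The field names (name, synopsis, name.keyword), the aggregation name matched_names, and all three helper functions are hypothetical and only for illustration. The sample response is hand-written in the shape ES returns for a terms aggregation, so nothing here talks to a real cluster.

```python
from typing import Any


def build_reducer_request(search_text: str, max_names: int = 1000) -> dict[str, Any]:
    """Build an ES search body that returns distinct names instead of hits.

    Assumes (hypothetically) that `name` and `synopsis` are text fields and
    that `name.keyword` is a keyword sub-field suitable for aggregation.
    """
    return {
        "size": 0,  # skip the individual hits; only the aggregation is needed
        "query": {
            "multi_match": {"query": search_text, "fields": ["name", "synopsis"]}
        },
        "aggs": {
            "matched_names": {
                "terms": {"field": "name.keyword", "size": max_names}
            }
        },
    }


def reduce_response(response: dict[str, Any]) -> tuple[list[str], bool]:
    """Extract the distinct names and report whether the list was cut off."""
    agg = response["aggregations"]["matched_names"]
    names = [bucket["key"] for bucket in agg["buckets"]]
    # A non-zero sum_other_doc_count means some matching docs fell outside
    # the returned buckets, so the reduced query would be incomplete.
    truncated = agg.get("sum_other_doc_count", 0) > 0
    return names, truncated


def build_where_clause(names: list[str]) -> tuple[str, list[str]]:
    """Build a parameterized WHERE clause (placeholder style %s is an assumption)."""
    if not names:
        return "WHERE 1 = 0", []
    placeholders = ", ".join(["%s"] * len(names))
    return f"WHERE name IN ({placeholders})", list(names)


# Hand-written response mirroring the example above (counts are illustrative).
sample_response = {
    "aggregations": {
        "matched_names": {
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "Terminator 2", "doc_count": 10_000_000},
                {"key": "Terminator", "doc_count": 1_000_000},
                {"key": "Terminators", "doc_count": 18},
                {"key": "XJ4-227", "doc_count": 1},
            ],
        }
    }
}

names, truncated = reduce_response(sample_response)
where_sql, params = build_where_clause(names)
print(where_sql, params, truncated)
```

One caveat with this sketch: the reduced query is only equivalent to the id list if every record sharing a matched name should be included, and the number of distinct names has to stay small enough to fit under max_names.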
Do you think this would be a feasible option? If so, how would we go about implementing this type of result-set reducer? Does ES expose any sort of match metadata (for example, which field or value produced each match) that would help us here? Any help would be greatly appreciated. Thank you!