We are currently using ES as our search layer on top of our RDBMS.
The RDBMS conducts a sub-optimal many-to-many join based on matching
id's in two tables (say, Table 1 and Table 2), then the joined
Table1+Table2 results are put into ES for indexing. Table 1 has id,
and "result", and Table 2 has id and "result" - with the "result"
being a numeric value that we need to compare between the tables.
Users generally upload several hundred id's and "results" to Table 1
in bulk, and then need to compare those uploaded values to Table 2 in
NRT. Table 2 is only updated daily.
We were contemplating how we might use ES's (ridiculously) fast
searching to make the join rather than beat up our RDBMS because the
RDBMS join isn't really scaling that well. I've read on other posts
in the group that many-to-many is tricky, but can be done, e.g.:
https://groups.google.com/group/elasticsearch/browse_thread/thread/e61ffcb0421eb654/3793d2420e5f65e8?lnk=gst&q=many-to-many#3793d2420e5f65e8
So here's the question:
Has anybody tried something like the following?:
- Place Table 1 id's and results from the RDBMS into a Multi Get
request - Obtain, from the search results, matching Table 2 id's that are
already indexed in ES - Also return with each Table 2 id and "result" that is found, the
matching Table 1 id and "result" that was submitted in the Multi Get
request.
Item 3 is what we're struggling with. The idea would be to "pass
through" the query parameter (in this case, id & "result" from Table
- to the result set with the specific id and "result" from Table 2
that it matches to -- thus creating a joined result set that could be
indexed into ES.
Are we crazy?
Many thanks in advance for your thoughts.