I want to allow my users to filter their search results by various tags.
Imagine a WordPress-like website where each blog post has a set of tags.
When my users go to the advanced search page, they can select a number of tags to filter the results by. Pages that are not assigned those tags will not be returned in the results.
Right now I have two tables and two indexes:
Table A — the page content (blog post in plain text)
Table B — the tag name and a reference back to Table A (i.e. many to many)
When I do a search (I have preliminary code that already works with Elassandra), I first search Table B for the tags. This is really fast since I can use a term { ... } search. I get a set of references to pages in Table A (say, one UUID per page).
Next, I do a second search against Table A. This time I use two types of searches:
- A query string with the text the user entered.
- One term { ... } per result I found in the first search.
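
For illustration, here is a minimal sketch of how the body of that second search could be assembled. It is only one plausible shape, not necessarily what my code does today: the field name `page_id` is hypothetical, and combining the clauses through a `bool` query with `must`, `should` and `minimum_should_match` is an assumption about how the two kinds of searches fit together. It also assumes the first search returned at least one UUID.

```python
def build_page_query(user_text, page_ids, id_field="page_id"):
    """Build the request body for the second search (against Table A).

    user_text: the text typed by the user, passed to query_string.
    page_ids:  UUIDs returned by the tag search against Table B.
    id_field:  hypothetical name of the field holding the page UUID.
    """
    return {
        "query": {
            "bool": {
                # The free-text part of the search, scored as usual.
                "must": [{"query_string": {"query": user_text}}],
                # One term clause per page found in the first search.
                "should": [{"term": {id_field: pid}} for pid in page_ids],
                # Require at least one of the term clauses to match,
                # so pages without the selected tags are excluded.
                "minimum_should_match": 1,
            }
        }
    }
```

Calling `build_page_query("some text", ["uuid-1", "uuid-2"])` returns a plain dictionary that can be serialized to JSON and sent as the search request body.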
I think all of that works as expected as is. I could not see how to do a join otherwise (there are some problems with Elassandra; maybe it will become possible one day with my schema...)
Now, when I search Table B, I get results and each result has a { _score: ... }, which is ignored when I do the next search. Reading the documentation, I saw that we could use the { boost: ... } parameter to tweak scores "manually".
What I'm wondering is whether there is a standard way to handle this scenario. I would be interested in existing research documents in that realm if you know of any, or maybe just a good ol' blog post about filtering in a way similar to mine.
My current idea would be to change the term by adding the boost parameter in there:
"term": { "<field-name>": { "value": "<some-UUID>", "boost": "<Table B Search _score>" } }
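
A minimal sketch of that idea, turning the hits of the first search into boosted `term` clauses. The field name `page_id` is hypothetical, and I am assuming each hit has the usual `_score` and `_source` keys of a search response. Note that `boost` scales the score of the clause it is attached to rather than replacing it, so the final ranking would still depend on how the text query scores each page. A page that matched several tags would also appear in several hits, which this sketch does not merge.

```python
def boosted_terms(tag_hits, id_field="page_id"):
    """Turn hits from the Table B search into boosted term clauses.

    tag_hits: list of hit dicts, each with "_score" and "_source".
    id_field: hypothetical name of the field holding the page UUID,
              assumed to be the same in both tables.
    """
    clauses = []
    for hit in tag_hits:
        clauses.append({
            "term": {
                id_field: {
                    "value": hit["_source"][id_field],
                    # Pass the first-stage score as a number, not a string.
                    "boost": float(hit["_score"]),
                }
            }
        })
    return clauses
```

The resulting list could be used wherever the plain `term` clauses are used today.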
At the same time, I know of the filter from the bool query and if I were to add my term queries in there, I know that the score would be ignored. There may be a reason why it was done that way, i.e. to help with cases like mine?
Thank you.
Alexis