I'm currently using a query like this to prioritize hits that contain multiple terms from the query string over those that contain fewer:
"nested": {
    "path": "my_nested_field",
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": query_str,
                        "default_operator": "AND"
                    }
                },
                {
                    "query_string": {
                        "query": query_str,
                        "boost": 0.5
                    }
                }
            ]
        }
    }
}
I'd like the scoring to reflect the following order of priority, highest first:
- Hits that contain all of the query string terms in the same "my_nested_field"
- Hits that contain all of the query string terms across all nested "my_nested_field"s in the same document
- Hits that contain fewer than all of the query string terms
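To make the ordering above concrete, here is a toy sketch in plain Python. It is not Elasticsearch code and says nothing about how BM25 scores; the function name `desired_rank_key`, the whitespace tokenization, and the sample documents are all made up for illustration. It only shows which documents I'd want to come out on top.

```python
def desired_rank_key(query_terms, nested_values):
    """Return a sort key for one document; larger tuples should rank higher.

    query_terms:   list of terms from the query string
    nested_values: list of text values, one per nested "my_nested_field"
    """
    terms = {t.lower() for t in query_terms}
    # Query terms found in each nested value (naive whitespace tokenization,
    # an assumption that stands in for the real analyzer).
    per_field = [terms & set(value.lower().split()) for value in nested_values]
    best_single = max((len(found) for found in per_field), default=0)
    across_all = len(set().union(*per_field))

    if best_single == len(terms):
        tier = 2  # all terms in the same nested field
    elif across_all == len(terms):
        tier = 1  # all terms, but spread across nested fields
    else:
        tier = 0  # fewer than all terms
    # Within a tier, prefer documents that match more terms.
    return (tier, across_all, best_single)


# Hypothetical documents, each a list of nested field values.
docs = {
    "a": ["quick brown fox", "lazy dog"],
    "b": ["quick brown", "fox jumps"],
    "c": ["quick", "dog"],
}
query = ["quick", "brown", "fox"]
ranked = sorted(docs, key=lambda d: desired_rank_key(query, docs[d]), reverse=True)
print(ranked)
```

The third tuple element is only a tie-breaker I picked for the sketch; the three tiers are what matter.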
I know that BM25 doesn't take into account the number of individual query terms matched in a document, which makes this a tough problem. Are there any other heuristics for maximizing the scores of documents whose inner hits contain the highest number of search terms?