I have documents with multiple fields and want to do "term centric" search instead of "field centric" search. (I'm using "term centric" as it is defined for combined_fields.)
I have two needs:
First, I need to do PHRASE search across fields, but I want the scoring to be "term centric". That is, I don't want field-level IDF.
- "multi-match + phrase" supports phrases, but it does "field centric" scoring, which makes the scores bad.
- Another, hacky way I tried is to create a query that looks like:
BOOL
  MUST "multi-match", type=cross_fields, query="hello world", fields=[...]
  FILTER "multi-match", type=phrase, query="hello world", fields=[...]
This hack works, but it is really ugly, and the score is not actually calculated from the PHRASE match.
It seems that neither "combined_fields" nor "multi-match + cross_fields" supports PHRASE.
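For concreteness, here is a rough sketch of the MUST + FILTER workaround above as a request body, built in Python. The field names `title` and `body` are placeholders I made up for the elided field list, and the helper name is only for illustration.

```python
import json

# Placeholder field names (assumption): the real list is elided as [...] above.
FIELDS = ["title", "body"]


def cross_fields_with_phrase_filter(text, fields):
    """Build a bool query: cross_fields does the scoring, phrase only filters."""
    return {
        "query": {
            "bool": {
                # Scored clause: term-centric matching across the fields.
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "cross_fields",
                            "fields": fields,
                        }
                    }
                ],
                # Unscored clause: keeps only documents where the phrase
                # occurs in at least one field.
                "filter": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "phrase",
                            "fields": fields,
                        }
                    }
                ],
            }
        }
    }


phrase_filter_body = cross_fields_with_phrase_filter("hello world", FIELDS)
print(json.dumps(phrase_filter_body, indent=2))
```

Because the phrase clause sits in the filter context, it contributes nothing to the score, which is exactly why the final score does not reflect the phrase match.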
The other thing I want to do is a "bigram boost". Basically, when a user searches for "hello world", I want to find all documents containing both "hello" and "world", but give a higher score when "hello world" appears consecutively (boost the bigram). Again, the search is across multiple fields.
One way I can think of is to create a query like:
BOOL
  MUST "multi-match", type=cross_fields, query="hello world", fields=[...]
  SHOULD "multi-match", type=PHRASE, query="hello world", fields=[...]
But the "SHOULD" part calculates its score in a "field centric" way, using field-level IDF, so the score can be unexpectedly high (or low) depending on the query and the field, which makes the boost uncontrollable.
I would really appreciate any suggestions on this.
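As a sketch of the MUST + SHOULD query described above, this is roughly the request body I mean, again built in Python. The field names `title` and `body` are made-up placeholders for the elided field list, and the helper name is hypothetical.

```python
import json

# Placeholder field names (assumption): the real list is elided as [...] above.
FIELDS = ["title", "body"]


def cross_fields_with_phrase_should(text, fields):
    """Build a bool query: cross_fields is required, phrase adds extra score."""
    return {
        "query": {
            "bool": {
                # Required clause: term-centric matching across the fields.
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "cross_fields",
                            "fields": fields,
                        }
                    }
                ],
                # Optional clause: documents containing the exact phrase get
                # this clause's score added on top of the must score.
                "should": [
                    {
                        "multi_match": {
                            "query": text,
                            "type": "phrase",
                            "fields": fields,
                        }
                    }
                ],
            }
        }
    }


bigram_boost_body = cross_fields_with_phrase_should("hello world", FIELDS)
print(json.dumps(bigram_boost_body, indent=2))
```

Since a must clause is present, the should clause is not required for a document to match; it only adds to the score when the phrase occurs, and that added score is the field-centric part I cannot control.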