What's the best approach to balance text similarity with different field weights?

Morriaty · March 12, 2019, 9:19am

Say we have two docs:

{"_id": 1, "title": "James Harden wins the MVP", "content": "xxxxxxxxxxxxxxxxxxxxxx"}

{"_id": 2, "title": "The new 007 movie comes!", "content": "xxxx James Bond xxxxxxxxxxxxx"}

When a user searches for "James Bond", we might construct an ES query like this:

GET docs/_search
{
  "query": {
    "multi_match": {
      "query": "james bond",
      "fields": ["title^3", "content"]
    }
  }
}

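Because the title field is weighted so heavily, doc 1 (which matches only "james", but in the title) may score higher than doc 2 (which matches both terms, but only in the content).

So my question is: what is the best approach to make sure doc 2 scores higher than doc 1?

Thanks for your help!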

Mark_Harwood · March 12, 2019, 10:09am

Generally, if you blend strict and sloppier interpretations of a user query, the docs that match best (strict AND sloppy) will rank higher.

In declining order of strictness:

Phrase query (all terms must match and be next to each other in the text)
AND query (all terms must appear somewhere in the text)
OR query (at least one term must match)
Fuzzy query (at least one vaguely reminiscent term must match)

These can all be combined as clauses in the should array of a single bool query.
The more clauses a document matches, the higher its score. The downside is that the query will be more costly to run.
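
A rough sketch of the blend described above, using the fields from the original question. This is not taken from the thread: the helper name build_blended_query is made up for illustration, Python is used only to assemble the request body, and expressing each level of strictness as a multi_match clause (type phrase, operator and/or, fuzziness AUTO) is one possible reading of the list, not necessarily what was intended.

```python
import json


def build_blended_query(text, fields):
    """Build a bool query whose should clauses run from strict to sloppy.

    Hypothetical helper for illustration; it only builds a dict and does
    not talk to a cluster.
    """
    def multi_match(**extra):
        # Every clause searches the same text over the same (boosted) fields.
        clause = {"query": text, "fields": fields}
        clause.update(extra)
        return {"multi_match": clause}

    return {
        "query": {
            "bool": {
                "should": [
                    # Phrase: all terms, adjacent and in order.
                    multi_match(type="phrase"),
                    # AND: all terms must appear in the same field.
                    multi_match(operator="and"),
                    # OR: at least one term must match.
                    multi_match(operator="or"),
                    # Fuzzy: at least one term within a small edit distance.
                    multi_match(fuzziness="AUTO"),
                ]
            }
        }
    }


body = build_blended_query("james bond", ["title^3", "content"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET docs/_search. With the two docs from the question, doc 2 should match all four clauses (both terms appear together in content), while doc 1 should match only the OR and fuzzy clauses. That helps doc 2, but the final order still depends on the title boost and on the index statistics, so the boost may need tuning.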

dadoonet · March 12, 2019, 10:19am

I wrote an example of this in the following gist:

https://gist.github.com/dadoonet/5179ee72ecbf08f12f53d4bda1b76bab#file-search_kibana_console-txt-L362-L457

search_kibana_console.txt

### REINIT
DELETE user
PUT user
{
  "settings": {
    "number_of_shards": 1
  }, 
  "mappings": {
    "_doc": {
      "properties": {

This file has been truncated; the full file is in the gist.

Morriaty · March 22, 2019, 9:18am

Thanks for your help!

system · April 19, 2019, 9:18am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
