Boost relevance score based on word proximity in Elasticsearch

ashit_pupu · June 1, 2017, 10:40am

I am working on a project where I need to boost relevance based on word proximity in Elasticsearch. Say the index has a field called statement, and doc 1 has the following value for it:

 "statement":["this is a dog",
              "it is brown in colour",
              "it is very fluffy"]

and doc 2 has the following value for statement:

 "statement":["this is a brown fluffy dog",
              "it plays in the garden"]

Say I run a query for "beautiful fluffy dog". The result contains both doc 1 and doc 2, with the same relevance. What I need is for doc 2 to rank higher than doc 1, because in doc 2 "fluffy" and "dog" appear in the same sentence, whereas in doc 1 they are scattered across separate values of statement. I am using a phrase query with a slop of 10 and have "position_increment_gap" set to 100 in the mapping. I am on ES version 2.3.0.
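
For readers following along, a mapping along the lines described above might look roughly like the sketch below on ES 2.x. This is an assumption, not the poster's actual mapping: the type name doc is hypothetical, and the body would be sent when creating the index (PUT index). With a gap of 100 between array values and a slop of 10, a phrase query cannot match terms that sit in different values of statement.

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "statement": {
          "type": "string",
          "position_increment_gap": 100
        }
      }
    }
  }
}
```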

jpountz · June 5, 2017, 8:12am

We don't have queries that both boost on proximity and allow some query terms to be missing. If document 1 had beautiful among its terms, should it rank better than document 2 because it contains all query terms, or worse because document 2 has better proximity for two of the query terms?

You could try to use SHOULD clauses in order to boost the score of documents based on proximity of adjacent query terms:

GET index/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "statement": "beautiful fluffy dog"
          }
        }
      ],
      "should": [
        {
          "match_phrase": {
            "statement": {
              "query": "beautiful fluffy",
              "slop": 10
            }
          }
        },
        {
          "match_phrase": {
            "statement": {
              "query": "fluffy dog",
              "slop": 10
            }
          }
        }
      ]
    }
  }
}

Alternatively, if the extra phrase clauses slow down your queries too much, use rescoring so that they are only evaluated on the top hits.
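
A minimal sketch of what the rescoring variant could look like, assuming the same index and statement field as in the query above. The window_size of 100 and both weights are illustrative values, not recommendations from this thread. The body goes to the same index/_search endpoint:

```json
{
  "query": {
    "match": {
      "statement": "beautiful fluffy dog"
    }
  },
  "rescore": {
    "window_size": 100,
    "query": {
      "rescore_query": {
        "bool": {
          "should": [
            {
              "match_phrase": {
                "statement": {
                  "query": "beautiful fluffy",
                  "slop": 10
                }
              }
            },
            {
              "match_phrase": {
                "statement": {
                  "query": "fluffy dog",
                  "slop": 10
                }
              }
            }
          ]
        }
      },
      "query_weight": 1.0,
      "rescore_query_weight": 1.0
    }
  }
}
```

Here the cheap match query selects the candidates, and the phrase clauses run only on the top window_size hits per shard. With the default score_mode of total, each of those hits ends up with its original score times query_weight plus its rescore score times rescore_query_weight.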

system · July 3, 2017, 8:12am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
