Elasticsearch score from 0 to 1 when finding documents similar to an existing one

Is it possible to calculate a relative score from 0 to 1 when searching for documents similar to an existing one?
I need the existing document to count as score 1, with the scores of all other matching documents calculated relative to it. The existing document itself should be excluded from the search results. Can this be done on the Elasticsearch side, rather than calculating the score manually in a programming language, e.g. match_doc_score / search_doc_score?

Let's imagine we have an index named person with this mapping:

{
  "properties": {
    "person_id": {
      "type": "keyword"
    },
    "fullname": {
      "type": "text"
    },
    "email": {
      "type": "keyword"
    },
    "phone": {
      "type": "keyword"
    },
    "country_of_birth": {
      "type": "keyword"
    }
  }
}

And I have 3 persons inside the index:
Person 1:

{
  "person_id": 1,
  "fullname": "John Snow",
  "email": "john@gmail.com",
  "phone": "111-11-11",
  "country_of_birth": "Denmark"
}

Person 2:

{
  "person_id": 2,
  "fullname": "Snow John",
  "email": "john@gmail.com",
  "phone": "222-22-22",
  "country_of_birth": "Denmark"
}

Person 3:

{
  "person_id": 3,
  "fullname": "Peter Wislow",
  "email": "peter@gmail.com",
  "phone": "111-11-11",
  "country_of_birth": "Denmark"
}

We find persons similar to Person 1 with this query:

{
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "fullname": {
                            "query": "John Snow",
                            "boost": 6
                        }
                    }
                },
                {
                    "term": {
                        "email": {
                            "value": "john@gmail.com",
                            "boost": 5
                        }
                    }
                },
                {
                    "term": {
                        "phone": {
                            "value": "111-11-11",
                            "boost": 4
                        }
                    }
                },
                {
                    "term": {
                        "country_of_birth": {
                            "value": "Denmark",
                            "boost": 2
                        }
                    }
                }
            ],
            "must_not": [
                {
                    "term": {
                        "person_id": 1
                    }
                }
            ]
        }
    }
}

As you can see:

  • person 1 and person 2 match by: fullname, email, country of birth.
  • person 1 and person 3 match by: phone, country of birth.
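
The overlap listed above can be reproduced with a small Python sketch over the three sample documents. This is only an illustration of which should clauses would match, not of how Elasticsearch scores them: the match query on fullname is approximated as "shares at least one lowercase token", and the helper name matched_fields is made up for this example.

```python
# Sample documents copied from the question.
PERSONS = [
    {"person_id": 1, "fullname": "John Snow", "email": "john@gmail.com",
     "phone": "111-11-11", "country_of_birth": "Denmark"},
    {"person_id": 2, "fullname": "Snow John", "email": "john@gmail.com",
     "phone": "222-22-22", "country_of_birth": "Denmark"},
    {"person_id": 3, "fullname": "Peter Wislow", "email": "peter@gmail.com",
     "phone": "111-11-11", "country_of_birth": "Denmark"},
]


def matched_fields(reference, candidate):
    """List the fields on which `candidate` would satisfy a should clause
    built from `reference` (hypothetical helper, not an Elasticsearch API)."""
    matched = []
    # Assumption: a `match` query on a text field matches when the two
    # values share at least one lowercase, whitespace-separated token.
    ref_tokens = set(reference["fullname"].lower().split())
    cand_tokens = set(candidate["fullname"].lower().split())
    if ref_tokens & cand_tokens:
        matched.append("fullname")
    # `term` queries on keyword fields require an exact value match.
    for field in ("email", "phone", "country_of_birth"):
        if reference[field] == candidate[field]:
            matched.append(field)
    return matched


reference = PERSONS[0]
overlaps = {p["person_id"]: matched_fields(reference, p) for p in PERSONS[1:]}
print(overlaps)
```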

Is it possible to get 0..1 scoring when the index contains a document that is a full match (person 1)?

I know there is a more_like_this query, but real-life search queries can be complicated, so more_like_this is not a good option. Even the Elasticsearch documentation says to use boolean query combinations if you need more control over the query.

You might be able to do something with a custom scoring algorithm, but that's well outside the scope of what I can help with myself.

Otherwise, nope, you cannot.
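
For what it's worth, here is a rough Python sketch of how the custom-scoring idea might look. It still needs two requests, so it does not remove the client-side step entirely: the first request scores the reference document against its own clauses, and the second wraps the original bool query in a script_score query that divides by that score. The function names and the ref_score parameter are made up for this example, the Painless expression has not been run against a cluster here, and the Math.min clamp is an assumption to cover the case where some other document happens to outscore the reference. Since scores depend on index statistics, treat the result as approximate.

```python
# Reference document copied from the question.
PERSON_1 = {"person_id": 1, "fullname": "John Snow", "email": "john@gmail.com",
            "phone": "111-11-11", "country_of_birth": "Denmark"}


def should_clauses(person):
    """The same boosted clauses as in the question, built from one document."""
    return [
        {"match": {"fullname": {"query": person["fullname"], "boost": 6}}},
        {"term": {"email": {"value": person["email"], "boost": 5}}},
        {"term": {"phone": {"value": person["phone"], "boost": 4}}},
        {"term": {"country_of_birth": {"value": person["country_of_birth"],
                                       "boost": 2}}},
    ]


def reference_query(person):
    """Request 1: restrict the search to the reference document so that the
    _score of the single hit can serve as the denominator."""
    return {
        "size": 1,
        "query": {"bool": {
            "should": should_clauses(person),
            "filter": [{"term": {"person_id": person["person_id"]}}],
        }},
    }


def normalized_query(person, ref_score):
    """Request 2: exclude the reference document and rescale every score."""
    if ref_score <= 0:
        raise ValueError("ref_score must be positive")
    return {
        "query": {"script_score": {
            "query": {"bool": {
                "should": should_clauses(person),
                "must_not": [{"term": {"person_id": person["person_id"]}}],
            }},
            "script": {
                # Clamp at 1.0 in case a document outscores the reference.
                "source": "Math.min(1.0, _score / params.ref_score)",
                "params": {"ref_score": ref_score},
            },
        }},
    }
```

Both functions only build request bodies; you would send reference_query(PERSON_1) first, read the _score of its single hit, and pass that value to normalized_query.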
