Should clause gives different weight to each term

Hey,

I've been having a hard time figuring out why different terms in my should clause have different "weight" without me stating it explicitly, and how I can avoid it.

I have this should clause in my query:

:should=>
[{:term=>{:styles=>{:value=>"Raw", :boost=>2}}},
{:term=>{:styles=>{:value=>"Intimate", :boost=>2}}},
{:term=>{:styles=>{:value=>"Minimalist", :boost=>2}}},
{:term=>{:features=>{:value=>"Graffiti", :boost=>2}}}]}},

My expectation is that a document matching all 4 terms will score the highest,
then a document matching 3 terms will score higher than documents that match 2 terms, and so on.

What really happens is:

  1. Documents that match all 4 terms get the highest score. - GOOD
  2. The "features" term is "stronger" than the "styles" terms, because I get these scores for the following documents:

:features=>
["Exposed Brick", "High Ceiling", "Skylight", "Large Windows", "Library", "Dining Table", "Modern Bathroom", "Bathtub", "Concrete", "Wood Floors", "Art",
"Garden", "Plants", "Graffiti", "Sound-proof", "Deck/Patio", "Roof", "Bar",
"View", "Screening Room", "Open Kitchen", "Private Entrance"],
:styles=>["Industrial", "Intimate"],
:score=>7.3219934},

VS

:features=>
["Columns", "Deck/Patio", "Cyc", "Dining Table", "High Ceiling", "Open Kitchen", "Plants", "Private Entrance", "Roof", "White Space", "Breakout rooms", "Empty", "Large Windows", "Screening Room", "Sound-proof", "View", "Concrete"],
:styles=>
["Classic", "Intimate", "Luxurious", "Minimalist", "Modern", "Raw", "Whimsical",
"Industrial"],
:score=>6.400804},

Notice how the last document matches 3 terms, but has a lower score than the first document, which matches only 2 terms.

So my questions are:

  1. Can you please explain how I can achieve the following: the more terms matched, the higher the score, with every term carrying equal weight (that is, both across different fields and within the same field, all terms must have the same weight)?

  2. Another weird case: let's compare two documents:
    doc1: styles: ["Raw", "Intimate"], score: 3.51232
    doc2: styles: ["Raw", "Minimalist"], score: 3.21045

Why do those 2 docs get different scores even though they match the same number of should-term clauses, and what can I do to make them return the same score?

PS. The boost of 2 that I added was just for testing; the same thing happens with or without custom boosting.

Appreciate your help,

Thanks.

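So, I was able to figure it out myself. The reason for the different scoring is explained at https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html, which describes how the default relevance score depends on term frequency, inverse document frequency, and field length, so two matching terms rarely contribute the same amount.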
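One ingredient of relevance scoring covered on that page is inverse document frequency: a term that appears in fewer documents contributes more to the score. Here is a minimal Ruby sketch of that idea, using the IDF formula from Lucene's BM25 similarity (the default since Elasticsearch 5.0). The document counts are made up for illustration, and I have not confirmed this is the only factor behind the scores in the question.

```ruby
# Lucene-style BM25 inverse document frequency.
# doc_count: number of documents that have the field
# doc_freq:  number of those documents that contain the term
def bm25_idf(doc_count, doc_freq)
  Math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
end

TOTAL_DOCS = 1000 # hypothetical index size

rare_idf   = bm25_idf(TOTAL_DOCS, 20)  # hypothetical: a term few documents have
common_idf = bm25_idf(TOTAL_DOCS, 400) # hypothetical: a term many documents have

puts format("rare term idf:   %.3f", rare_idf)
puts format("common term idf: %.3f", common_idf)
```

The rarer term gets the larger value, which is one way a single term clause can outweigh others even when every clause has the same boost.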

The solution was to change the mapping of the fields that affect the score and set the similarity: "boolean" parameter on them. With that similarity, each matching term contributes a score equal to its query boost (1 by default), which means all scoring terms carry the same weight.
https://www.elastic.co/guide/en/elasticsearch/reference/6.3/similarity.html
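For anyone landing here later, this is roughly what such a mapping could look like, written as a Ruby hash. It is only a sketch: the index name "spaces", the "_doc" type name, and the "keyword" field type are my assumptions, since the original mapping is not shown in this thread.

```ruby
require "json"

# Hypothetical index name, not taken from the original post.
INDEX_NAME = "spaces"

# Builds a create-index body in which both scored fields use the
# "boolean" similarity. The "_doc" type and "keyword" field type
# are assumptions made for this sketch.
def boolean_similarity_mapping
  scored_field = { type: "keyword", similarity: "boolean" }
  {
    mappings: {
      _doc: {
        properties: {
          styles:   scored_field,
          features: scored_field
        }
      }
    }
  }
end

# Print the JSON body that would be sent when creating the index.
puts "PUT /#{INDEX_NAME}"
puts JSON.pretty_generate(boolean_similarity_mapping)
```

With a mapping like this, a should clause of 4 term queries would be expected to score a document by how many of them match, multiplied by whatever boost each term query carries.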

Hope that helps :)
