My Elasticsearch scores lose precision when the function weights differ by many orders of magnitude.
I'm using Elasticsearch 7.10.1.
If I have an index that looks something like this:
[{:index=>
   {:_index=>"candidates",
    :_id=>"a1786607-e095-4621-bdf9-de2706475614",
    :data=>
     {:name=>"Carli Stark",
      :is_verified=>true, :has_work_permit=>true}}},
 {:index=>
   {:_index=>"candidates",
    :_id=>"57f78d3f-392e-4cdf-a5ff-6d10e7c89d5b",
    :data=>
     {:name=>"Gayla Keeling",
      :is_verified=>false, :has_work_permit=>true}}}]
And then I perform a query like this:
GET candidates/_search
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "filter": {
            "term": {
              "is_verified": true
            }
          },
          "weight": 1000
        },
        {
          "filter": {
            "term": {
              "has_work_permit": true
            }
          },
          "weight": 100000000000
        }
      ],
      "score_mode": "sum",
      "boost_mode": "replace"
    }
  },
  "_source": ["is_verified", "has_work_permit"],
  "size": 50
}
For some reason, both documents get exactly the same score, even though only one of them is verified:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 9.9999998E10,
    "hits" : [
      {
        "_index" : "candidates_production_20240828114811152",
        "_type" : "_doc",
        "_id" : "cbd1b70b-f889-4136-a43e-f6782955f58e",
        "_score" : 9.9999998E10,
        "_source" : {
          "is_verified" : false,
          "has_work_permit" : true
        }
      },
      {
        "_index" : "candidates_production_20240828114811152",
        "_type" : "_doc",
        "_id" : "d644a5e5-09e0-496e-8830-c1a772c46611",
        "_score" : 9.9999998E10,
        "_source" : {
          "is_verified" : true,
          "has_work_permit" : true
        }
      }
    ]
  }
}
But if I bring the weights closer together, for example by using 10000 instead of 1000 for is_verified, the scores differ and the results are ranked as expected:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.00000006E11,
    "hits" : [
      {
        "_index" : "candidates_production_20240828114811152",
        "_type" : "_doc",
        "_id" : "d644a5e5-09e0-496e-8830-c1a772c46611",
        "_score" : 1.00000006E11,
        "_source" : {
          "is_verified" : true,
          "has_work_permit" : true
        }
      },
      {
        "_index" : "candidates_production_20240828114811152",
        "_type" : "_doc",
        "_id" : "cbd1b70b-f889-4136-a43e-f6782955f58e",
        "_score" : 9.9999998E10,
        "_source" : {
          "is_verified" : false,
          "has_work_permit" : true
        }
      }
    ]
  }
}
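While debugging, I tried to reproduce the numbers outside Elasticsearch. This is only a rough sketch, and it rests on an assumption I have not confirmed for my setup: that _score ends up stored as a 32-bit float. The helper names to_float32 and combined_score are my own and are not part of any Elasticsearch client.

```ruby
# Round a Ruby Float (64-bit) to the nearest 32-bit float and back.
# ASSUMPTION: _score is held as a 32-bit float on the Elasticsearch side.
def to_float32(value)
  [value.to_f].pack('f').unpack1('f')
end

# Weight of the has_work_permit function in the query above.
WORK_PERMIT_WEIGHT = 100_000_000_000

# Mimics score_mode "sum" for a candidate matching both filters
# (hypothetical helper, for illustration only).
def combined_score(verified_weight)
  to_float32(WORK_PERMIT_WEIGHT + verified_weight)
end

unverified = to_float32(WORK_PERMIT_WEIGHT)

# Is a verified candidate indistinguishable from an unverified one?
puts unverified == combined_score(1_000)   # true: the extra 1000 is rounded away
puts unverified == combined_score(10_000)  # false: the extra 10000 survives
```

If that assumption holds, 32-bit floats near 1e11 are spaced 8192 apart, which would explain why adding 1000 changes nothing while adding 10000 moves the score to the next representable value.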
Any idea why this happens? And is there a way to fix it?