I have two records with the same book title.
PUT books/book/2
{
"title" : "Nineteen Eighty-Four",
"reader_rating" : 4.0,
"author" : "John Steinbeck"
}
PUT books/book/3
{
"title" : "Nineteen Eighty-Four",
"reader_rating" : 4.5,
"author" : "Joseph Heller"
}
When I query by title, I get different scores for those two records. Can anyone explain why the scores differ? I only have one node.
GET /_search
{
"query": {
"match" : {
"title" : "Nineteen Eighty-Four"
}
}
}
"hits": [
{
"_index": "books",
"_type": "book",
"_id": "2",
"_score": 1.4820051,
"_source": {
"title": "Nineteen Eighty-Four",
"reader_rating": 4,
"author": "John Steinbeck"
}
},
{
"_index": "books",
"_type": "book",
"_id": "3",
"_score": 0.7594807,
"_source": {
"title": "Nineteen Eighty-Four",
"reader_rating": 4.5,
"author": "Joseph Heller"
}
}]
Thanks a lot.
Ivan (Ivan Brusic), November 1, 2017, 9:32pm
You have only one node, but how many shards? IDF values are computed per
shard, so the two documents may be scored with different IDF values for the
same terms if they live on different shards.
You can always enable explanations and see where the difference is:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-explain.html
Another option is to use the distributed scoring mode (dfs_query_then_fetch)
and see if the scores still differ. If they come out the same in that mode,
then it is the per-shard IDF value that is causing the discrepancy:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-request-search-type.html#dfs-query-then-fetch
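As a rough sketch of the two checks suggested above, the following Python assembles both search requests for the `books` index from the question: one with `explain` enabled and one using `search_type=dfs_query_then_fetch`. The helper name `build_search_request` and the `localhost:9200` address are assumptions made for illustration only, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed address of the single local node


def build_search_request(index, title, explain=False, dfs=False):
    """Return (url, body) for a match query on the title field.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    params = {}
    if dfs:
        # Collect term statistics from all shards before scoring.
        params["search_type"] = "dfs_query_then_fetch"
    url = f"{BASE_URL}/{index}/_search"
    if params:
        url += "?" + urlencode(params)
    body = {"query": {"match": {"title": title}}}
    if explain:
        # Ask for a per-hit breakdown of how the score was computed.
        body["explain"] = True
    return url, json.dumps(body)


explain_url, explain_body = build_search_request(
    "books", "Nineteen Eighty-Four", explain=True
)
dfs_url, dfs_body = build_search_request(
    "books", "Nineteen Eighty-Four", dfs=True
)
print(explain_url, explain_body)
print(dfs_url, dfs_body)
```

In the explain output, comparing the `idf` entries of the two hits should show whether the shards are using different document counts.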
Cheers,
Ivan
Thanks, Ivan.
True, I have five shards on my node. If I use only one shard, the scores are the same.
Bing
Ivan (Ivan Brusic), November 2, 2017, 7:15pm
This is a common cause of discrepancies while testing. You can enable
distributed term frequencies (only a slight performance hit), but in general,
IDF values even out across shards as you add more non-trivial content. If you
have only a handful of test documents, then IDF values will almost always be
different from shard to shard (with the default 5 shards).
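To illustrate why a handful of documents makes such a difference, here is a minimal sketch of the IDF term, assuming the default BM25 similarity of Elasticsearch 5.x, where IDF is ln(1 + (N - n + 0.5) / (n + 0.5)) for N documents in the shard and n documents containing the term. The document counts below are made up for illustration and are not meant to reproduce the exact scores from the question.

```python
import math


def bm25_idf(doc_count, doc_freq):
    """IDF term of Lucene's BM25: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical tiny index: each shard holds one matching document,
# but shard B also holds one document that does not contain the term.
idf_shard_a = bm25_idf(doc_count=1, doc_freq=1)
idf_shard_b = bm25_idf(doc_count=2, doc_freq=1)

# Hypothetical larger index: the same one-document imbalance between shards.
idf_big_a = bm25_idf(doc_count=10000, doc_freq=100)
idf_big_b = bm25_idf(doc_count=10001, doc_freq=100)

print("tiny shards: ", idf_shard_a, idf_shard_b)
print("large shards:", idf_big_a, idf_big_b)
```

With the tiny shards, one extra document more than doubles the IDF; with the larger shards, the same imbalance barely changes it.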