Issue with ID and scores

Akhilesh_Anb · December 2, 2016, 5:50pm

We are creating an index with documents that contain one string each. The process then runs a match query with a certain fuzziness factor against the index.
The scores end up different depending on the ID value of each document, which should be irrelevant.
For example, if we have 100 documents with IDs 1 - 100, we get different scores than if the IDs are 2 - 101.
So when new documents are added to the index, the scoring changes. We would like to know if this is expected behavior for ES 1.6.2.

nik9000 · December 2, 2016, 6:16pm

Have a look at search_type, particularly dfs_query_then_fetch. The _id controls which shard a document is on, and by default shard-local information is used to compute the score. This usually isn't a big problem if you have lots of documents, but it comes up if you have very few. It also matters if you search for very rare terms, but even then it usually isn't a big deal unless you have very few documents.
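
To make the shard-local effect concrete, here is a minimal sketch in plain Python, not Elasticsearch code. It assumes the classic TF/IDF similarity, whose IDF is 1 + ln(numDocs / (docFreq + 1)), and it ignores the extra rewriting that fuzzy queries do. The two shard layouts, the sample documents, and the index name my_index are made up for illustration.

```python
import math
from urllib.parse import urlencode


def classic_idf(num_docs, doc_freq):
    """Inverse document frequency as in Lucene's classic TF/IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def shard_local_idf(shards, term):
    """IDF of `term` computed separately from each shard's own statistics."""
    result = []
    for docs in shards:
        doc_freq = sum(1 for text in docs if term in text.split())
        result.append(classic_idf(len(docs), doc_freq))
    return result


def global_idf(shards, term):
    """IDF of `term` computed from the statistics of all shards combined."""
    all_docs = [text for docs in shards for text in docs]
    doc_freq = sum(1 for text in all_docs if term in text.split())
    return classic_idf(len(all_docs), doc_freq)


# The same six documents, spread over two shards in two hypothetical ways.
# Which layout you get depends on how the _id values hash to shards.
layout_a = [["apple", "apple", "pear"], ["apple", "pear", "pear"]]
layout_b = [["apple", "apple", "apple"], ["pear", "pear", "pear"]]

local_a = shard_local_idf(layout_a, "apple")
local_b = shard_local_idf(layout_b, "apple")

print("shard-local IDF, layout A:", local_a)
print("shard-local IDF, layout B:", local_b)
print("global IDF, layout A:", global_idf(layout_a, "apple"))
print("global IDF, layout B:", global_idf(layout_b, "apple"))

# Hypothetical index name; search_type is the parameter named in the reply.
search_path = "/my_index/_search?" + urlencode(
    {"search_type": "dfs_query_then_fetch"}
)
print(search_path)
```

The shard-local values change with the layout while the global value does not. With dfs_query_then_fetch, term statistics are first gathered from all shards, so scoring behaves like the global case at the cost of an extra round trip.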

It can also come up if you use routing (which overrides using the _id to pick the shard) and end up with shards of very different sizes.
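
As a rough illustration of how routing can skew shard sizes: the shard is chosen as hash(routing) modulo the number of primary shards, where the routing value defaults to the _id. The sketch below uses CRC32 as a stand-in hash, which is an assumption and not the function Elasticsearch uses internally. The tenant names and the shard count of 5 are also made up.

```python
import zlib
from collections import Counter


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number.

    CRC32 is only a stand-in here; Elasticsearch uses its own hash function.
    """
    return zlib.crc32(str(routing).encode("utf-8")) % num_primary_shards


def shard_sizes(routing_values, num_primary_shards=5):
    """Count how many documents land on each shard."""
    return Counter(pick_shard(r, num_primary_shards) for r in routing_values)


# Default behavior: each document is routed by its own _id.
by_id = shard_sizes(range(1, 101))

# Custom routing: 90 documents share one routing value and 10 share another,
# so at most two shards receive any documents at all.
by_tenant = shard_sizes(["tenant-a"] * 90 + ["tenant-b"] * 10)

print("routed by _id:   ", dict(by_id))
print("routed by tenant:", dict(by_tenant))
```

With skewed shards like the second case, term statistics on the large shard and the small shard can differ a lot, which is where shard-local scoring becomes noticeable.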

That is a fairly old version of Elasticsearch. It is getting "historical" for those of us that work on Elasticsearch every day.
