Elasticsearch Scoring Inconsistency

powerte · December 11, 2021, 12:30am

Our API uses Elasticsearch to return results ordered by score (relevance). In our use case, it's important that we consistently return documents in the same sort order. We currently see that queries with larger result sets (more than roughly 500 documents) return inconsistent scores on successive runs, even though no changes are being made to the Elasticsearch indices.

The Elasticsearch documentation says that scores are not reproducible and suggests the following: "The recommended way to work around this issue is to use a string that identifies the user that is logged in (a user id or session id for instance) as a preference. This ensures that all queries of a given user are always going to hit the same shards, so scores remain more consistent across queries."

However, despite using something like preference: foo and search_type: dfs_query_then_fetch in the query, we're still receiving inconsistent scoring, and because of this our API results are not ordered deterministically from request to request.
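For reference, here is a minimal sketch of the kind of request described above, written with only the Python standard library so that it does not depend on a particular client version. `preference` and `search_type` are passed as URL query parameters on `_search`. The index name `products`, the `match` query, and the helper `build_search_request` are hypothetical placeholders, not our actual setup.

```python
import json
from urllib.parse import urlencode


def build_search_request(index, query, preference, search_type="dfs_query_then_fetch"):
    """Return (path, body) for a _search call that pins a custom preference string.

    The index name and query passed in below are illustrative assumptions.
    """
    # preference and search_type travel as URL query parameters.
    params = urlencode({"preference": preference, "search_type": search_type})
    path = f"/{index}/_search?{params}"
    # The query itself goes in the JSON request body.
    body = json.dumps({"query": query}, sort_keys=True)
    return path, body


# Hypothetical index and query, with the arbitrary preference string "foo".
path, body = build_search_request("products", {"match": {"title": "laptop"}}, "foo")
print(path)
print(body)
```

Sending the same path and body repeatedly is what we would expect to route to the same shard copies each time.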

The cluster we're working with is relatively simple. It has two nodes: the primary shard for the index in question lives on node A and its replica lives on node B. When we specify _prefer_nodes or the now-deprecated _primary_first as the preference value, we do seem to receive the consistent scoring and sort order we're looking for.
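As a rough illustration of the node-pinning variant, the sketch below builds a `_prefer_nodes:<id>,<id>` preference value and the matching query string. The node ID `nodeA` is a made-up placeholder (real IDs would have to be looked up from the cluster, for example via the nodes info API), and `prefer_nodes` is a hypothetical helper, not part of any client library.

```python
from urllib.parse import urlencode


def prefer_nodes(node_ids):
    """Build a _prefer_nodes preference value from a list of node IDs."""
    if not node_ids:
        raise ValueError("at least one node ID is required")
    return "_prefer_nodes:" + ",".join(node_ids)


# "nodeA" is a placeholder; substitute a real node ID from the cluster.
preference = prefer_nodes(["nodeA"])

# Keep ':' and ',' unescaped so the value stays readable in the URL.
query_string = urlencode({"preference": preference}, safe=":,")
print(query_string)
```

This is exactly the kind of node bookkeeping we would rather not carry in the application.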

We would expect the documentation-prescribed approach of preference: <arbitrary_string> to resolve the scoring inconsistency. We'd prefer not to layer on application-level logic that tracks which node has serviced queries with particular parameters and then pins later requests to that node with _prefer_nodes, just to get a consistent sort order.

Can someone help us better understand why preference isn't working the way we expect, and whether there's a more generally accepted way of achieving a consistent sort order via query definition or cluster configuration?

system · January 8, 2022, 12:31am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
