Need help regarding relevancy score sorting in Elasticsearch


### Settings for Indexing ###

import json

import requests

# Index settings: a single shard, with BM25 as the default similarity.
settings = {
    'settings': {
        'index': {
            'number_of_shards': 1,
            'number_of_replicas': 1,
            'similarity': {
                'default': {
                    'type': 'BM25',
                    'b': 0.3,
                    'k1': 0,
                }
            },
        }
    },
    'mappings': {
        'properties': {
            'title': {
                'type': 'text',
            }
        }
    },
}
headers = {'Content-Type': 'application/json'}
# Create the index with the settings above.
response = requests.put('http://localhost:9200/alldocs', data=json.dumps(settings), headers=headers)
print(response.json())

I am using the Elasticsearch index settings above for my search, with BM25 as the default similarity (b = 0.3, k1 = 0).

When I search for the top 20 results, the returned scores do not appear to be sorted. Furthermore, by random sampling I found documents outside my top 20 that have a higher BM25 score, as computed with a different BM25 library.
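For comparing scores against another library, here is a minimal sketch of the per-term BM25 formula as I understand Lucene's `BM25Similarity` to compute it. The function names and the sample numbers are hypothetical. It assumes a boost of 1 and exact document lengths, whereas Lucene stores lengths in a lossy compressed form, and depending on the Elasticsearch version the numerator may carry an extra `(k1 + 1)` factor (which equals 1 when k1 = 0). Other libraries may also use a different IDF variant, so absolute scores are not necessarily comparable.

```python
import math


def lucene_idf(doc_count: int, doc_freq: int) -> float:
    """IDF variant used by Lucene's BM25Similarity (always positive)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq, k1=0.0, b=0.3):
    """Score of one query term in one document, assuming a boost of 1."""
    if tf == 0:
        return 0.0
    # Length normalisation only matters when k1 > 0.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return lucene_idf(doc_count, doc_freq) * tf / (tf + length_norm)


# Hypothetical numbers: with k1 = 0, term frequency and document length
# drop out, so both documents get the same score for this term.
short_doc = bm25_term_score(tf=1, doc_len=5, avg_doc_len=10, doc_count=1000, doc_freq=10)
long_doc = bm25_term_score(tf=7, doc_len=50, avg_doc_len=10, doc_count=1000, doc_freq=10)
```

Under these assumptions, k1 = 0 reduces each matching term's contribution to its IDF alone, so many documents can end up with identical scores. It may be worth checking whether the other library behaves the same way for k1 = 0.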

Can anyone help me figure out why this happens and how I can resolve it? (The Elasticsearch documentation says results are sorted by score by default.)
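To verify the ordering, a small helper like the following could be run on the parsed JSON of a `_search` response. This is only a sketch: the helper name and the query text are hypothetical, and it assumes the standard response layout with the hits under `hits.hits` and a `_score` on each hit.

```python
def scores_descending(search_response: dict) -> bool:
    """Return True if the hits are in non-increasing _score order."""
    scores = [hit["_score"] for hit in search_response["hits"]["hits"]]
    return all(a >= b for a, b in zip(scores, scores[1:]))


# Hypothetical request body for the top 20 matches on the `title` field.
top20_query = {"size": 20, "query": {"match": {"title": "example query"}}}

# Hand-made sample responses, only to show what the helper expects.
sorted_sample = {"hits": {"hits": [{"_score": 2.5}, {"_score": 2.5}, {"_score": 1.0}]}}
unsorted_sample = {"hits": {"hits": [{"_score": 1.0}, {"_score": 2.5}]}}
```

Equal scores count as sorted here, since the order among ties is not determined by the score.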

Could it be because of sharding? I have explicitly configured the index to use a single shard, though.
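One way to confirm that the single-shard setting actually took effect is to read it back from `GET http://localhost:9200/alldocs/_settings`. The sketch below parses such a response. The helper name is hypothetical, and it assumes the usual layout in which the settings sit under the index name and the values come back as strings. If the index already existed when the `PUT` above ran, the request would have failed and the older settings would still be in place.

```python
def shard_count(settings_response: dict, index: str = "alldocs") -> int:
    """Read number_of_shards from a parsed GET /<index>/_settings response."""
    return int(settings_response[index]["settings"]["index"]["number_of_shards"])


# Hand-made sample mirroring the assumed response layout.
sample_settings = {
    "alldocs": {
        "settings": {
            "index": {"number_of_shards": "1", "number_of_replicas": "1"}
        }
    }
}
```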
