Turn off TF/IDF for searching human names

I have a query that works fairly well for scoring matches in human name and address fields. However, some documents get an undesirably high score. For example, when I search for the string "CHR",
DONALD MCCLURE at the address #41 IROQUOIS/CHICKASAW
gets a higher score than CHRIS LE at the address 2021 FARMINGTON LAKES DR APT.
I believe the issue is TF/IDF: Elasticsearch may be assigning the less relevant document a higher score because the ngram CH is uncommon in the address field.
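
To make the suspicion concrete, here is a rough sketch of how an IDF term can favor a rare ngram. It uses a simplified form of classic Lucene TF/IDF (square root of term frequency times a log-based IDF) and ignores norms and boosts. The document counts are made up for illustration and are not taken from my index. Recent Elasticsearch versions default to BM25 rather than classic TF/IDF, but BM25 also has an IDF component, so the same effect applies.

```python
import math


def idf(doc_count, doc_freq):
    """Simplified classic-TF/IDF inverse document frequency."""
    return 1.0 + math.log((doc_count + 1) / (doc_freq + 1))


def tfidf_score(freq, doc_count, doc_freq):
    """Term score without norms or boosts: sqrt(tf) * idf."""
    return math.sqrt(freq) * idf(doc_count, doc_freq)


def constant_score(freq):
    """Score that ignores how often or how rarely the term occurs."""
    return 1.0 if freq > 0 else 0.0


# Hypothetical numbers: the ngram "CH" is common in names, rare in addresses.
DOC_COUNT = 1000
DF_CH_IN_NAME = 200
DF_CH_IN_ADDRESS = 5

name_score = tfidf_score(1, DOC_COUNT, DF_CH_IN_NAME)
address_score = tfidf_score(1, DOC_COUNT, DF_CH_IN_ADDRESS)

if __name__ == "__main__":
    print("name field match:   ", round(name_score, 3))
    print("address field match:", round(address_score, 3))
    print("constant scoring:   ", constant_score(1), constant_score(1))
```

With these assumed counts the address match outscores the name match purely because of IDF, while a constant score treats both matches the same.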

I am using both ngram and edge_ngram analyzers. Reproducible code and more details are in this Stack Overflow post: https://stackoverflow.com/questions/62032693/weird-relevance-ranking-in-elasticsearch

Cheers!

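I was able to turn off TF/IDF by overriding the default similarity module in my index settings, as shown here. The script returns a constant score of 1 for every matching term, so term frequency and document frequency no longer affect the ranking.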

{
    "index": {
        "similarity": {
            "default": {
                "type": "scripted",
                "script": {
                    "source": "return doc.freq > 0 ? 1 : 0;"
                }
            }
        },....
    }
}
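
For anyone scripting their index creation, here is a minimal Python sketch that builds the same similarity override as a dictionary and prints it as JSON. The function name `constant_similarity_settings` is just something I made up for illustration, and the other index settings elided above are left out. The result is meant to go under the `settings` key of a create-index request body; how you send it depends on your client.

```python
import json


def constant_similarity_settings():
    """Build index settings that replace the default similarity with a
    scripted one returning 1 for any matching term (hypothetical helper)."""
    return {
        "index": {
            "similarity": {
                "default": {
                    "type": "scripted",
                    "script": {
                        # Same script as in the settings above.
                        "source": "return doc.freq > 0 ? 1 : 0;"
                    },
                }
            }
        }
    }


if __name__ == "__main__":
    # Wrap under "settings" as it would appear in a create-index body.
    body = {"settings": constant_similarity_settings()}
    print(json.dumps(body, indent=2))
```

Because this overrides the `default` similarity, it applies to every text field in the index that does not set its own similarity.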
