Ignore term frequency but maintain relevancy

ryans · November 8, 2024, 2:53pm

Here is my problem. If someone searches for "power rangers" (without quotes) on my site, ES is scoring documents with 100 instances of "power rangers" much higher than documents that only contain 1 instance of "power rangers." I understand this to be because of the term frequency piece of relevancy ranking. But in my situation, I do NOT want to give more weight to a document just because it repeats the same content over and over. I want these documents to be scored the same.
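To make the term frequency effect concrete, here is a rough sketch of the term frequency part of a BM25 score. It assumes the default BM25 similarity (k1 = 1.2, b = 0.75) and, for simplicity, a document of average length. The function name and the sample numbers are illustrative only and do not come from a real index.

```python
def bm25_tf(freq, k1=1.2, b=0.75, doc_len=100.0, avg_doc_len=100.0):
    """Term frequency component of BM25 for one term in one document.

    Assumed formula: freq / (freq + k1 * (1 - b + b * doc_len / avg_doc_len)).
    """
    if freq <= 0:
        return 0.0  # the term does not occur, so it contributes nothing
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return freq / (freq + k1 * length_norm)


# One mention versus one hundred mentions, with default parameters.
once = bm25_tf(1)
many = bm25_tf(100)
print(f"default k1: 1 mention -> {once:.3f}, 100 mentions -> {many:.3f}")

# With k1 = 0 the component no longer depends on how often the term repeats.
flat_once = bm25_tf(1, k1=0.0)
flat_many = bm25_tf(100, k1=0.0)
print(f"k1 = 0:     1 mention -> {flat_once:.3f}, 100 mentions -> {flat_many:.3f}")
```

The component saturates as the count grows, but a document with many repeats still ends up ahead of one with a single mention unless k1 is 0.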

I have tried using constant_score, which does score these documents the same, but it has the side effect of scoring EVERY document in the results the same. So documents that match only through stemming, like "power ranger" (singular, without quotes), are scored the same as documents with the exact match "power rangers." This is not what I want. I still want documents that match less exactly to appear towards the bottom of the results. I just don't want extra weight given to documents with repeated tokens.
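One way to keep constant_score but still separate exact matches from stemmed ones might be to wrap each field in its own constant_score clause with a different boost, and combine the clauses in a bool query. The sketch below only builds the request body as a Python dict. The field names `body` (stemmed) and `body.exact` (unstemmed subfield) and the boost of 3.0 are assumptions for illustration, not the actual mapping from this post.

```python
import json


def build_query(text, stemmed_field="body", exact_field="body.exact", exact_boost=3.0):
    """Build a search body where each field contributes a fixed score.

    Field names and boost values are hypothetical placeholders.
    """

    def flat_clause(field, boost):
        # constant_score ignores term frequency: any match earns exactly `boost`.
        return {
            "constant_score": {
                "filter": {"match": {field: {"query": text, "operator": "and"}}},
                "boost": boost,
            }
        }

    return {
        "query": {
            "bool": {
                # Scores of matching should clauses are added together.
                "should": [
                    flat_clause(stemmed_field, 1.0),
                    flat_clause(exact_field, exact_boost),
                ],
                "minimum_should_match": 1,
            }
        }
    }


print(json.dumps(build_query("power rangers"), indent=2))
```

Under these assumptions, a document that matches only through stemming would score 1.0, while a document that also matches the unstemmed subfield would score 1.0 + 3.0, regardless of how many times the terms repeat.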

I also tried changing my index so that all the text fields and subfields I search use an index_options value of "docs", but that didn't seem to work either. I saw no difference in scoring between this setting and "freqs" or "positions."
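For reference, a minimal sketch of what such an index definition could look like, written as a Python dict. It shows two options side by side: `index_options: "docs"` on the stemmed field, and, as a swapped-in alternative not mentioned above, a custom BM25 similarity with k1 set to 0 on the unstemmed subfield. The field names, the analyzers, and the similarity name `bm25_no_tf` are assumptions. Changes like these generally only take effect on a newly created index, so existing data would likely need to be reindexed.

```python
import json


def build_index_body(field="body", exact_subfield="exact"):
    """Index settings and mappings that try to neutralise term frequency.

    All names here are hypothetical; adapt them to the real mapping.
    """
    return {
        "settings": {
            "index": {
                "similarity": {
                    # Custom BM25 variant: k1 = 0 removes the frequency effect.
                    "bm25_no_tf": {"type": "BM25", "k1": 0.0}
                }
            }
        },
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "english",  # stemmed
                    # Store only document ids: no frequencies, no positions.
                    "index_options": "docs",
                    "fields": {
                        exact_subfield: {
                            "type": "text",
                            "analyzer": "standard",  # unstemmed
                            # Positions are kept here, so phrase queries still work.
                            "similarity": "bm25_no_tf",
                        }
                    },
                }
            }
        },
    }


print(json.dumps(build_index_body(), indent=2))
```

A field indexed with "docs" has no positions, so phrase queries against it would not be expected to work. The similarity route avoids that trade-off.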

Any ideas how to solve this?

Thanks in advance.
