Custom document scoring with a secondary 'penalization' query

ChapterSevenSeeds · March 16, 2024, 5:00pm

I recently came across a rather unusual method that a company is using to rank the documents returned by a query. First, the company's queries control scoring precisely by replacing the score that Elasticsearch would normally assign with the output of a function_score query. Each function in the function_score query targets a single field, and each field has a predefined weight that is used in place of Elasticsearch's own score for that field. The resulting scores from the individual functions are then summed.
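To make the setup concrete, here is a minimal sketch of what such a request body might look like, built as a Python dict. The field names, values, and weights are hypothetical, and the use of `term` filters assumes keyword-style fields; the company's actual queries may well differ.

```python
from typing import Any


def build_weighted_query(desired: dict[str, Any], weights: dict[str, float]) -> dict[str, Any]:
    """Build a function_score body with one filter/weight function per field."""
    functions = [
        {
            # A function contributes its weight only when its filter matches.
            "filter": {"term": {field: value}},
            "weight": weights[field],
        }
        for field, value in desired.items()
    ]
    return {
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "functions": functions,
                "score_mode": "sum",      # add up the weights of the matching functions
                "boost_mode": "replace",  # discard the base query score
            }
        }
    }


# Hypothetical fields and weights, for illustration only.
body = build_weighted_query(
    desired={"color": "red", "size": "large"},
    weights={"color": 3.0, "size": 1.5},
)
```

With `score_mode` set to `sum` and `boost_mode` set to `replace`, a document's score is determined by the weights of the functions whose filters match it rather than by the usual relevance score.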

That's all fine and dandy. However, the piece that I don't understand is a second query that is performed after the initial query. This second query acts as a 'penalization' query: it checks the documents returned by the initial query for fields that don't match the desired data. The more fields that fail to match, the higher the score. Then, outside Elasticsearch, the company subtracts the second query's score from the first query's score for each document and filters out any documents that fall below a certain minimum score.
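The client-side step described above can be sketched roughly as follows. The function name, the document IDs, the scores, and the choice of an inclusive threshold are all assumptions made for illustration; I don't know how the company actually implements it.

```python
def combine_scores(
    primary: dict[str, float],
    penalty: dict[str, float],
    min_score: float,
) -> dict[str, float]:
    """Subtract penalty scores from primary scores and drop low scorers."""
    combined = {}
    for doc_id, score in primary.items():
        # A document missing from the penalty results is treated as unpenalized.
        final = score - penalty.get(doc_id, 0.0)
        if final >= min_score:  # assumption: the threshold is inclusive
            combined[doc_id] = final
    # Highest final score first.
    return dict(sorted(combined.items(), key=lambda item: item[1], reverse=True))


# Hypothetical scores keyed by document ID.
primary_scores = {"a": 4.5, "b": 3.0, "c": 1.5}
penalty_scores = {"a": 0.0, "b": 2.0, "c": 1.0}
ranked = combine_scores(primary_scores, penalty_scores, min_score=1.0)
```

Here document "c" ends up at 0.5 and is filtered out, while "a" and "b" survive with 4.5 and 1.0.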

So, given what I know about Elasticsearch, I am completely convinced that this second 'penalization' query is redundant, and that any scoring difference produced by subtracting one score from the other could be folded into a single query by tweaking the function_score weights and whatnot. However, I have no way to formally prove this. Am I right in assuming that the second 'penalization' query is redundant?
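As a sanity check on that intuition, here is a small sketch of the arithmetic. It rests on a simplifying assumption that may not hold for the real queries: each field has a fixed weight in the first query and a fixed penalty in the second, and both queries use the same match test per field. Under that assumption, reward minus penalty equals a single weighted sum with adjusted weights (weight + penalty) minus a constant offset (the sum of all penalties). The field names and numbers are hypothetical.

```python
from itertools import product

# Hypothetical per-field weights (first query) and penalties (second query).
weights = {"color": 3.0, "size": 1.5, "brand": 2.0}
penalties = {"color": 1.0, "size": 0.5, "brand": 2.5}


def two_pass_score(matched: set[str]) -> float:
    """First-query score minus second-query score, as done client-side."""
    reward = sum(weights[f] for f in matched)
    penalty = sum(penalties[f] for f in weights if f not in matched)
    return reward - penalty


def single_pass_score(matched: set[str]) -> float:
    """One weighted sum with adjusted weights, minus a constant offset."""
    offset = sum(penalties.values())
    return sum(weights[f] + penalties[f] for f in matched) - offset


# Compare the two formulas for every possible match/no-match combination.
fields = list(weights)
all_equal = all(
    abs(two_pass_score(matched) - single_pass_score(matched)) < 1e-9
    for flags in product([False, True], repeat=len(fields))
    for matched in [{f for f, on in zip(fields, flags) if on}]
)
```

Because the offset is the same for every document, it does not change the ranking, and it can be folded into the minimum-score threshold instead of being subtracted inside the query. If the penalty query uses different match conditions than the first query, this simple equivalence no longer applies as written.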

In addition, if summing function_score query functions and using that to replace the score for each document is not the best approach, what would you all recommend?

carly.richmond · March 18, 2024, 10:38am

Hi @ChapterSevenSeeds,

Welcome to the community! Did you come across this approach in a blog or resource that you can share?

ChapterSevenSeeds · March 18, 2024, 4:08pm

Hi @carly.richmond!

I did not come across this approach anywhere in a publicly available resource. I just stumbled across it when I did some work for the company I mentioned above.

system · April 15, 2024, 4:08pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
