Rescoring documents based on Author occurrence

Charles_Lariviere · November 10, 2020, 3:59pm

Hey!

We have an index with documents that look like so:

{
    "id": "1",
    "author_id": "8",
    "popularity": 2.5,
    "tags": ["illustration", "book", "image"]
}

We search this index for documents matching given tags and sort them using a script_score (i.e. using popularity and other variables). In some cases, though, almost all top-ranking documents are from the same author, which is undesirable.

Therefore, we're looking for a way to decrease a document's score according to its author-occurrence-count, i.e. the number of times its author has appeared so far in the ranked results. For example, given the following results:

[
    {"id": "1", "author_id": "8", ...}, # author-occurrence-count: 1
    {"id": "5", "author_id": "8", ...}, # author-occurrence-count: 2
    {"id": "7", "author_id": "7", ...}, # author-occurrence-count: 1
    {"id": "3", "author_id": "8", ...}, # author-occurrence-count: 3
    ...
]

We could re-score the result set by including the author-occurrence-count in the script_score and applying a monotonically decreasing function of it. The real unknown is how to get the Elasticsearch query to expose this kind of variable so that it is available in a rescoring query, and whether this is even possible!
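
To make the idea concrete, here is a minimal client-side sketch in Python of the rescoring we have in mind. It runs outside Elasticsearch on hits that have already been returned, so it only illustrates the desired behaviour and does not answer the question above. The `score` values, the `decay` parameter, and the function names are all made up for illustration, and the exponential decay is just one possible monotonically decreasing function.

```python
from collections import defaultdict


def add_author_occurrence(hits):
    """Annotate each hit with how many times its author has appeared so far (1-based)."""
    seen = defaultdict(int)
    annotated = []
    for hit in hits:
        seen[hit["author_id"]] += 1
        annotated.append({**hit, "author_occurrence": seen[hit["author_id"]]})
    return annotated


def rescore(hits, decay=0.5):
    """Scale each score by decay ** (occurrence - 1), then re-sort by the new score."""
    rescored = [
        {**h, "score": h["score"] * decay ** (h["author_occurrence"] - 1)}
        for h in add_author_occurrence(hits)
    ]
    return sorted(rescored, key=lambda h: h["score"], reverse=True)


# Hypothetical hits, already sorted by their original script_score.
hits = [
    {"id": "1", "author_id": "8", "score": 10.0},
    {"id": "5", "author_id": "8", "score": 9.0},
    {"id": "7", "author_id": "7", "score": 8.0},
    {"id": "3", "author_id": "8", "score": 7.0},
]

reranked = rescore(hits)
print([h["id"] for h in reranked])  # ['1', '7', '5', '3']
```

Note that the occurrence count here is taken from the original ranking in a single pass and is not recomputed after the re-sort, and that this only works on the page of hits the client has actually fetched.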

Thanks!
Charles

system · December 8, 2020, 4:00pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.