My documents are made up of categories. There are 40 different categories; these are added to each document manually in the database and then indexed. This is what a document looks like:
{
  "name": "..",
  "categoryA": "..",
  "categoryB": "..",..
  "categoryDecayScore": 0.0 - 1.0
}
A document is considered well covered if it is part of all 40 categories. To push documents that are in all categories to the top, I wanted to use the decay function to reduce the score of those that are part of fewer categories.
For this I use the categoryDecayScore property, which is set at index time. If a document is part of all 40 categories, its categoryDecayScore will be 0.0; if it is missing half but has more than 1/3, it will get a score of 0.2; and if it has less than 1/3, it will get a score of 0.3.
I then also increase categoryDecayScore by 0.02 for less relevant scores.
What I want to do:
I would like documents that have categoryDecayScore > 0.0 to have their score decayed the farther they are from 0.0.
This is my filter function:
"filter": {
  "exp": {
    "categoryDecayScore" : {
      "origin" : 0.0,
      "scale" : 1.0,
      "offset" : 0.0,
      "decay" : 0.5
    }
  }
}
The way I understand the documentation here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
is that origin is my point of reference, all documents that have categoryDecayScore > 0.0 will be decayed, and any with categoryDecayScore >= 1.0 will be decayed by 0.5.
However, looking at my results, it seems this does not take effect. The top 4 documents all have the same score, even though their categoryDecayScore values differ:
{
  _score: 51.970146,
  categoryDecayScore: 0.04
},
{
  _score: 51.970146,
  categoryDecayScore: 0.2
},
{
  _score: 51.970146,
  categoryDecayScore: 0.02
},
{
  _score: 51.970146,
  categoryDecayScore: 0.3
}
Is this normal behaviour, or am I understanding the decay function incorrectly? My assumptions, based on the docs, are:
- origin: point of reference from which distance is calculated
- scale: upper point after which all documents are decayed by value of decay param
- offset: point after which documents are decayed
- decay: decay amount for all documents scored above or at scale value
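To make my reading concrete, here is a minimal Python sketch of the exp decay formula as I understand it from the docs: multiplier = exp(ln(decay) / scale * max(0, |value - origin| - offset)). The helper name exp_decay is my own and not part of Elasticsearch, and this only models the decay multiplier, not how it gets combined with the query score. If this reading is right, documents with different categoryDecayScore values should get different multipliers rather than identical scores.

```python
import math


def exp_decay(value, origin=0.0, scale=1.0, offset=0.0, decay=0.5):
    """Hypothetical helper: exp decay multiplier as I read it from the docs.

    Not an Elasticsearch API; it only mirrors the documented formula.
    """
    # lambda is chosen so that the multiplier equals `decay`
    # at a distance of `scale` (beyond `offset`) from `origin`.
    lam = math.log(decay) / scale
    # Values within `offset` of `origin` are not decayed at all.
    distance = max(0.0, abs(value - origin) - offset)
    return math.exp(lam * distance)


# Same parameters as in my function above.
for v in (0.02, 0.04, 0.2, 0.3, 1.0):
    print(f"categoryDecayScore={v}: multiplier={exp_decay(v):.4f}")
```

With these parameters, a value exactly at scale (1.0) gets a multiplier of 0.5, and values between 0.0 and 1.0 fall strictly between 1.0 and 0.5, so I would expect the four results above to be ranked differently.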