Hello Everyone,
I would like to implement a popularity-based boost in my elasticsearch
engine. I calculate custom popularity boost factors for documents
periodically, but I store these float numbers in a child document, because
I want to avoid the full reindex of the main article documents.
The mapping of the child document is the following:
{
  "document_boost": {
    "_parent": {
      "type": "document"
    },
    "properties": {
      "popular_boost_total": {
        "type": "float"
      },
      "popular_boost_recent": {
        "type": "float"
      },
      "last_updated": {
        "type": "date"
      }
    }
  }
}
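To illustrate the periodic update, here is a minimal Python sketch that
builds a bulk request body touching only the child documents. The index
name "index", the helper name build_boost_bulk_body, the reuse of the parent
id as the child id, and the sample values are all assumptions made for
illustration. The "_parent" key in the action line is how I understand the
bulk API to route children in my version, so please treat that as an
assumption as well.

```python
import json


def build_boost_bulk_body(boosts, last_updated, index="index"):
    """Build a newline-delimited bulk body with one document_boost child per parent.

    boosts: dict mapping parent document id -> (popular_boost_total, popular_boost_recent)
    last_updated: date string stored in the child's last_updated field
    """
    lines = []
    for parent_id, (total, recent) in sorted(boosts.items()):
        # Action line: index the child and route it to its parent (assumed metadata keys).
        action = {
            "index": {
                "_index": index,
                "_type": "document_boost",
                "_id": parent_id,
                "_parent": parent_id,
            }
        }
        # Source line: only the boost fields, the parent article is not touched.
        source = {
            "popular_boost_total": total,
            "popular_boost_recent": recent,
            "last_updated": last_updated,
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical sample data: two parent ids with made-up boost factors.
body = build_boost_bulk_body({"1": (1.2, 1.05), "2": (0.9, 1.5)}, "2014-01-01")
```

The resulting string would be sent as the body of a _bulk request, so only
the small child documents are rewritten on each recalculation.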
I would like to create a query that:
- executes the main query provided by the end users
- attaches the child document (1-1 relation to the parent)
- boosts the score of the main query by multiplying it with the custom boost
factors that are read from the child document (popular_boost_total,
popular_boost_recent)
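In other words, the scoring I am after could be sketched in plain Python as
below. This only models the arithmetic outside of Elasticsearch; the
function name boosted_scores, the neutral default of 1.0 for parents without
a child, and the sample numbers are my assumptions.

```python
def boosted_scores(query_scores, child_boosts):
    """Multiply each parent's query score by the boost factors of its single child.

    query_scores: dict mapping parent id -> score of the main query
    child_boosts: dict mapping parent id -> dict with the child's boost fields
    """
    result = {}
    for doc_id, score in query_scores.items():
        boost = child_boosts.get(doc_id, {})
        # A missing child or field leaves the score unchanged (assumed behaviour).
        total = boost.get("popular_boost_total", 1.0)
        recent = boost.get("popular_boost_recent", 1.0)
        result[doc_id] = score * total * recent
    return result


# Hypothetical sample: document "a" has a child with boosts, document "b" has none.
query_scores = {"a": 2.0, "b": 0.5}
child_boosts = {"a": {"popular_boost_total": 1.5, "popular_boost_recent": 2.0}}
scores = boosted_scores(query_scores, child_boosts)
```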
I have been struggling with this for a while and could not find a really
clean solution. The best one I could find is the following
(simplified):
GET index/document/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "basketball"
          }
        }
      ],
      "should": [
        {
          "has_child": {
            "type": "document_boost",
            "query": {
              "function_score": {
                "script_score": {
                  "script": "doc['document_boost.popular_boost_total'].value"
                }
              }
            }
          }
        }
      ]
    }
  }
}
However, this is not a real boost, because the should clause contributes an
additional score instead of multiplying the score of the primary query! The
amount of boost therefore cannot be expressed as a clean percentage; it is a
noisy additive term, and the effective boosting factor depends on the
absolute score of the particular query. So I think it is wrong.
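A small Python sketch of the arithmetic shows what I mean. It ignores any
coordination or normalization factors that the bool query may apply, and the
function name relative_change and the sample scores are assumptions chosen
only for illustration.

```python
def relative_change(base_score, boost, mode):
    """Return the relative change of the score after applying the boost."""
    if mode == "add":
        boosted = base_score + boost      # what the bool/should variant does
    elif mode == "multiply":
        boosted = base_score * boost      # what a real boost should do
    else:
        raise ValueError("unknown mode: %r" % mode)
    return (boosted - base_score) / base_score


# Same boost value of 1.5 applied to three hypothetical query scores.
bases = (0.5, 2.0, 8.0)
additive = [relative_change(b, 1.5, "add") for b in bases]
multiplicative = [relative_change(b, 1.5, "multiply") for b in bases]

print(additive)        # [3.0, 0.75, 0.1875]
print(multiplicative)  # [0.5, 0.5, 0.5]
```

With addition, the same child value changes the score by 300%, 75% or about
19% depending on the query score, while multiplication always gives 50%.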
I would be able to solve it if the custom boost factors were stored not in
child documents, but in fields of the parent document:
GET index/document/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "title": "basketball"
        }
      },
      "script_score": {
        "script": "doc['popular_boost_recent'].value"
      }
    }
  }
}
Obviously, in the above case we do not need the has_child query.
I also tried without the bool query:
GET index/document/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "title": "basketball"
        }
      },
      "functions": [
        {
          "filter": {
            "has_child": {
              "type": "document_boost",
              "query": {"match_all": {}}
            }
          },
          "script_score": {
            "script": "doc['document_boost.popular_boost_recent'].value"
          }
        }
      ]
    }
  }
}
In the above case, the script reads the value from the parent document, not
from the child! This looks like a bug to me, since I explicitly specify the
fully qualified field name.
Considering what the query API syntax allows, I think the last query above
would be the right way to get a real multiplicative boost, but it simply
does not work.
Another solution would be to define a score mode for the bool query, i.e. to
tell Elasticsearch to multiply the scores of the clauses instead of adding
them.
Are there others who are facing the same issue? I think it is a common
requirement nowadays to have popularity-based and other kinds of custom
boosts.
Can somebody give me a hint? I hope I just misunderstood something...
Thanks!
Regards,
Csaba