I have the sample data below:
{
"user" : {
"uid" : "user1",
"city" : "ca",
"gender" : "M",
"favorites" : ["sports", "news", "online_game"]
}
}
The problem is that the values of the "favorites" field each have their own
weight.
For example: W("sports")=2.0, W("news")=1.3, W("online_game")=0.6
means user1 has three favorites and likes sports most.
How can this be implemented in ES when indexing the user data, so that the
weights affect the result score? P.S. The favorites values come from a small,
fixed set.
I'm considering making every favorite a boolean field.
Just to understand better: when do you want these weights to come into
play? When the user explicitly searches for "sports", or when they search
for anything and you want to boost the results of any query based on
the favorites they have? In any case, here are some thoughts:
If you want to control it at indexing time, and the favorites control the
"relevancy" of this doc for all types of queries, then you can use the _boost
field feature: http://www.elasticsearch.com/docs/elasticsearch/mapping/boost_field/.
Basically, you compute the boost that you want to give the doc (based on
the favorites you are going to index), and set it as the _boost field in the
indexed document. This solution gives the fastest execution possible,
though the boost is static: it is fixed at indexing time.
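As a rough illustration of the index-time option, here is a minimal Python sketch that collapses a user's favorite weights into a single number and stores it in the document. The helper names (`doc_boost`, `build_user_doc`) are hypothetical, summing the weights is only one possible way to aggregate them, and placing `_boost` inside the "user" object assumes the field works as the linked mapping page describes.

```python
import json


def doc_boost(weights):
    """Collapse per-favorite weights into one boost value.

    Assumption: a plain sum. Any other aggregation (max, average, ...)
    could be substituted here.
    """
    return round(sum(weights.values()), 6)


def build_user_doc(uid, weights):
    """Build the document to index, carrying the precomputed _boost."""
    return {
        "user": {
            "uid": uid,
            "favorites": list(weights),     # favorite names only
            "_boost": doc_boost(weights),   # static, computed at index time
        }
    }


user1 = build_user_doc(
    "user1", {"sports": 2.0, "news": 1.3, "online_game": 0.6}
)
print(json.dumps(user1, indent=2))
```

Because the boost is baked into the document, changing a user's weights means reindexing that user.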
Alternatively, you can try the custom_score query ( http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/custom_score_query/):
pass a key (string, the favorite) / value (the weight) pair in the params, and
within the script simply multiply the sub-query score by the weight.
Note that this might be too slow for you, since the script has to be evaluated
for each hit that matches the query in order to compute the score; you need
to test it. Also, I haven't tested it, so I might have a syntax problem.
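A possible shape for the query-time option, sketched in Python so the weights can be supplied per request. This is an untested sketch under several assumptions: the helper name `favorites_query` is hypothetical, wrapping one custom_score clause per favorite in a bool/should is just one way to combine them, and the script string `_score * weight` follows the style of the linked custom_score page rather than anything verified against a running cluster.

```python
import json


def favorites_query(weights, field="favorites"):
    """Build a search body with one custom_score clause per favorite.

    Each clause matches documents containing that favorite and multiplies
    the sub-query score by the weight passed in through params.
    """
    clauses = []
    for favorite, weight in weights.items():
        clauses.append({
            "custom_score": {
                "query": {"term": {field: favorite}},
                "params": {"weight": weight},
                "script": "_score * weight",
            }
        })
    # Assumption: "should" clauses, so matching any favorite contributes.
    return {"query": {"bool": {"should": clauses}}}


body = favorites_query({"sports": 2.0, "news": 1.3, "online_game": 0.6})
print(json.dumps(body, indent=2))
```

Since the weights travel in params, a different dictionary can be passed for each querying user without reindexing anything.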
Actually, the weights come from the user's data. Each user has a different set
of favorites and corresponding weights.
Like:
{
"uid" : "user1",
"favorites" : {
"sports" : 2,
"news" : 1.3,
"online_game" : 0.6
}
}
Both solutions I suggested support this. In the first case, where you use
the boost field, when you index a specific user you aggregate (in one way
or another) the weights of that user's favorites and create a _boost field to
index. In the second solution, with the custom_score query, the params part is
dynamic, so you can pass different values depending on the user (logged in?)
who queries the data.