Index using the boosted value

ylbaba · August 18, 2010, 6:31am

I have the sample data below:
{
"user" : {
"uid" : "user1",
"city" : "ca",
"gender" : "M",
"favorites" : ["sports", "news", "online_game"]
}
}

The problem is that the values of the "favorites" field already have their
own weights.
For example: W("sports")=2.0, W("news")=1.3, W("online_game")=0.6
means user1 prefers 3 things and likes sports most.

How can this be implemented in ES when indexing the user data so that it
affects the result score? P.S. The favorites values come from a small,
fixed set.

I'm considering making every favorite a boolean field.

thanks.

kimchy · August 18, 2010, 7:57am

Just want to understand better: when do you want these weights to come into
play? When the user explicitly searches for "sports", or when he searches
for "anything" and you want to boost the results of any query based on
the favorites they have? In any case, here are some thoughts:

If you want to control it at indexing time, and the favorites control the
"relevancy" of this doc for all types of queries, then you can use the _boost
field feature:
http://www.elasticsearch.com/docs/elasticsearch/mapping/boost_field/.
Basically, you compute the boost that you want to give the doc (based on
the favorites you are going to index), and set it as the _boost field in the
indexed document. This solution will result in the fastest execution
possible, though it's static, fixed at indexing time.
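
A minimal sketch of the index-time approach, in Python: it collapses a user's favorite weights into a single number and attaches it to the document as _boost. Summing the weights is an assumption (the thread leaves the aggregation open), build_user_doc is a hypothetical helper name, and the mapping still has to enable the _boost field as described in the linked docs.

```python
import json

def build_user_doc(uid, favorites):
    """Return a document dict carrying a precomputed _boost.

    favorites maps favorite name -> weight, e.g. {"sports": 2.0}.
    Summing the weights is only one possible aggregation (assumption).
    """
    boost = sum(favorites.values()) if favorites else 1.0
    return {
        "uid": uid,
        "favorites": sorted(favorites),  # names only; the weights are folded into _boost
        "_boost": boost,
    }

doc = build_user_doc("user1", {"sports": 2.0, "news": 1.3, "online_game": 0.6})
print(json.dumps(doc, indent=2))
```

Because the boost is baked in at indexing time, any change to a user's weights means reindexing that user's document.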
You can try the custom_score query (
http://www.elasticsearch.com/docs/elasticsearch/rest_api/query_dsl/custom_score_query/),
pass in the params a map of key (string, the favorite) to value (the weight),
and within the script, simply multiply the sub query score by the weight.
Note, this might be too slow for you, since the script needs to be evaluated
for each hit that matches the query in order to compute the score; you need
to test it. Also, I haven't tested it, so I might have a syntax problem:

"custom_score" : {
    "query" : {
        ....
    },
    "params" : {
        "weights" : {
            "sports" : 2,
            "news" : 1.3,
            "online_game" : 0.6
        }
    },
    "script" : "cscore = score; foreach(fav : doc['favorites'].values()) { cscore = cscore * weights[fav]; };"
}

Of course, in the above, make sure you set the index to not_analyzed for
the favorites mapping.
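
To make the intended scoring concrete, here is a small Python simulation of what the script above is meant to compute for one hit. It is a sketch, not Elasticsearch code: custom_score is a hypothetical function, and treating a favorite that has no entry in the weights map as a neutral 1.0 is an assumption (the script as written does not say what happens in that case).

```python
def custom_score(base_score, doc_favorites, weights, default=1.0):
    """Multiply the sub query score by the weight of each favorite on the doc.

    default is used for favorites with no entry in weights (assumption).
    """
    score = base_score
    for fav in doc_favorites:
        score *= weights.get(fav, default)
    return score

weights = {"sports": 2, "news": 1.3, "online_game": 0.6}
print(custom_score(1.0, ["sports", "news"], weights))
print(custom_score(2.0, ["tech"], weights))
```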

-shay.banon


ylbaba · August 18, 2010, 12:30pm

Thanks kimchy, this helps me a lot.

Actually the weights come from the user's data. Each user has a different
set of favorites and corresponding weights.
For example:
{
"uid" : "user1",
"favorites" : {
"sports" : 2,
"news" : 1.3,
"online_game" : 0.6
}
}

{
"uid" : "user2",
"favorites" : {
"online_game" : 3,
"tech" : 1
}
}

It seems the boost or custom score has to be bound to a field, so should I
make every favorite type an explicit field?


kimchy · August 18, 2010, 12:32pm

Both solutions I suggested support this. In the first case, where you use
the boost field, when you index a specific user, you aggregate (in one way
or another) the favorites it has, and create a _boost field to index. In the
second solution, with the custom_score query, the params part is dynamic, so
you can pass different values depending on the user (logged in?) that
queries the data.
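
A rough Python sketch of the second option: the request body is built per query, so the weights in params can differ for each user. The script text is the untested one from earlier in the thread; USER_WEIGHTS, build_custom_score_query and the match_all sub query are illustrative assumptions.

```python
import json

# Script text taken from the thread; its author had not tested it.
SCRIPT = ("cscore = score; foreach(fav : doc['favorites'].values()) "
          "{ cscore = cscore * weights[fav]; };")

# Hypothetical per-user weights; in practice these come from the user's data.
USER_WEIGHTS = {
    "user1": {"sports": 2, "news": 1.3, "online_game": 0.6},
    "user2": {"online_game": 3, "tech": 1},
}

def build_custom_score_query(sub_query, weights):
    """Wrap sub_query in a custom_score query carrying the given weights."""
    return {
        "custom_score": {
            "query": sub_query,
            "params": {"weights": weights},
            "script": SCRIPT,
        }
    }

body = {"query": build_custom_score_query({"match_all": {}}, USER_WEIGHTS["user2"])}
print(json.dumps(body, indent=2))
```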

-shay.banon


ylbaba · August 18, 2010, 12:58pm

I have both cases.
The result score should be calculated from the custom_score query and the
weights of the matching favorite values.

{
    "uid" : "user1",
    "favorites" : {"online_game" : 0.6, "tech" : 5, "sports" : 2, "news" : 1.3}
}

{
"uid" : "user2",
"favorites" : {"online_game" : 3, "tech" : 1}
}
e.g. search:
user.favorites:online_game^1 AND user.favorites:tech^3

result scores expected:
user1: 0.6 * 1 + 5 * 3 = 15.6
user2: 3 * 1 + 1 * 3 = 6
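
The expected scores above are a weighted sum: each queried favorite contributes its per-user weight times its query boost. A small Python check of that arithmetic follows; expected_score is a hypothetical helper, and returning None when a queried favorite is missing reflects the AND in the query (an assumption about the intended matching).

```python
def expected_score(doc_favorites, query_boosts):
    """Sum of (user weight * query boost) over the queried favorites.

    Returns None if the doc lacks any queried favorite, since the
    query joins its terms with AND.
    """
    if any(term not in doc_favorites for term in query_boosts):
        return None
    return sum(doc_favorites[term] * boost for term, boost in query_boosts.items())

user1 = {"online_game": 0.6, "tech": 5, "sports": 2, "news": 1.3}
user2 = {"online_game": 3, "tech": 1}
query = {"online_game": 1, "tech": 3}

print(expected_score(user1, query))
print(expected_score(user2, query))
```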


nabble · August 30, 2010, 10:13am

Hi pal --

I'm tackling a similar issue. Have you found a solution that worked well for you? If so, are you willing to share it with the list?

Thanks!
