Adjust score based on the number of matches in Boolean filter


(dragan) #1

Hi,

I have the following setup:
Every document in ES has a field of type string that contains a list of items

basket : ["yuca", "potato"]

and a field of type long called price

price: 626

I'm adjusting the score of my search results based on the difference between the price field and the price I search for.
I use CustomScoreQuery and I pass this MVEL script to it:

floor(doc.price.value/100)

I am using a Boolean filter to get a partial match on the basket field. I'm using pyes (0.16.0), so under the hood the filter looks like this:

'bool': {
    'minimum_number_should_match': 1,
    'should': [
        {
            'term': {'basket': 'yuca'}
        },
        {
            'term': {'basket': 'yam'}
        }
    ]
}

Here's what I'm trying to do:
I want to be able to adjust the score when I get a partial match on the basket field.

In this example I asked for 'yuca' and 'yam' and the result contains 'yuca' and 'potato' (so only 'yuca' is a match). I want to penalise the score by one point for each missed term.

In this case the price contribution to the score is 6 and the basket penalty should be 1, so the overall score for this should be 5.
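
The arithmetic in this example can be sketched in plain Python, outside Elasticsearch. The function name `expected_score` and its parameters are made up for illustration; it only restates the rule described above (the price contribution from the script, minus one point per requested term missing from the basket) and is not something ES or pyes provides.

```python
import math


def expected_score(price, basket, requested_terms, penalty_per_miss=1):
    """Restate the desired scoring rule; hypothetical helper, not an ES API."""
    # Same arithmetic as the script floor(doc.price.value/100)
    price_part = math.floor(price / 100)
    # Requested terms that the document's basket does not contain
    missed = set(requested_terms) - set(basket)
    return price_part - penalty_per_miss * len(missed)


print(expected_score(626, ["yuca", "potato"], ["yuca", "yam"]))  # 5
```

Here 626 gives a price contribution of 6, 'yam' is the one missed term, and the result is 5.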

I prefer to handle this logic in the script that I pass to CustomScoreQuery, but I'm open to other solutions as well.

Thank you


(Shay Banon) #2

When you use should clauses in a bool query, the more clauses a document matches, the higher its score will be, so it should work for you.
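
As a rough sketch of that suggestion, the same two term clauses can be placed in a bool query rather than a bool filter, since filters do not contribute to the score. The helper name `basket_bool_query` is hypothetical, and the body is a plain Python dict mirroring the filter from the first post rather than a pyes call.

```python
import json


def basket_bool_query(terms, field="basket", minimum_should_match=1):
    """Build a bool query body with one should clause per term (hypothetical helper)."""
    return {
        "bool": {
            # Key name copied from the filter shown in the first post
            "minimum_number_should_match": minimum_should_match,
            "should": [{"term": {field: term}} for term in terms],
        }
    }


print(json.dumps({"query": basket_bool_query(["yuca", "yam"])}, indent=2))
```

The resulting score still depends on the usual relevance factors, so it orders documents by how many clauses they match without being a literal count.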

On Wednesday, February 15, 2012 at 4:43 PM, Dragan wrote:



(dragan) #3

Thanks Shay, that works great when I want to see the results sorted by score.

I'm trying to use score as a percentage match metric. Is there a way to set
the score based on the number of matches in the Boolean query? I'd like to
be able to tune the penalty function for imperfect matches.
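
To picture what is being asked for, independent of how Elasticsearch would expose it: if the number of matched should clauses were available, a percentage and a tunable penalty could be derived from it. The sketch below is client-side Python with made-up names (`match_percentage`, `penalised_score`); it assumes the matched count is already known, which is exactly the piece the question is about.

```python
def match_percentage(matched, requested):
    """Share of requested terms that matched, as a percentage."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    return 100.0 * matched / requested


def penalised_score(base_score, matched, requested, penalty=lambda missed: missed):
    """Subtract a tunable penalty for the terms that did not match.

    `penalty` maps the number of missed terms to points; the default
    takes off one point per missed term.
    """
    return base_score - penalty(requested - matched)


print(match_percentage(1, 2))                             # 50.0
print(penalised_score(6, 1, 2))                           # 5
print(penalised_score(6, 1, 2, penalty=lambda m: 2 * m))  # 4
```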

On Wed, Feb 15, 2012 at 2:43 PM, Shay Banon kimchy@gmail.com wrote:



(Shay Banon) #4

No, there isn't such an option. The closest you can get is the custom_filters_score query (http://www.elasticsearch.org/guide/reference/query-dsl/custom-filters-score-query.html).
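
A rough sketch of what such a request body might look like for the basket example, written as a Python dict. The layout (a `query`, a list of `filters` each carrying a `boost`, and a `score_mode`) is an assumption based on the custom_filters_score reference linked above and has not been run against a cluster; the helper name is hypothetical. The idea is one filter per requested term with a boost of 1 each, and `score_mode` set to `total` so the boosts of the matching filters are added up.

```python
import json


def basket_custom_filters_score(terms, field="basket"):
    """Build a custom_filters_score body with one boosted filter per term.

    Hypothetical helper; the layout is an assumption based on the
    custom_filters_score reference, not a tested request.
    """
    term_filters = [{"term": {field: term}} for term in terms]
    return {
        "custom_filters_score": {
            # Base query: keep documents that contain at least one term.
            "query": {
                "constant_score": {
                    "filter": {"bool": {"should": term_filters}}
                }
            },
            # Each matching filter contributes its boost.
            "filters": [{"filter": f, "boost": 1.0} for f in term_filters],
            # Assumed mode: add up the boosts of all matching filters.
            "score_mode": "total",
        }
    }


print(json.dumps({"query": basket_custom_filters_score(["yuca", "yam"])}, indent=2))
```

If the base query scores every hit the same, the combined score should track the number of terms matched, and the number of missed terms could then be worked out on the client. That is still an approximation of a match count, not direct access to it.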

On Thursday, February 16, 2012 at 12:21 AM, Dragan Chupacabrovic wrote:



(alfarid23) #5

Greetings!

I have a very similar issue related to the number of matches. Is there any way to access the number of matches from custom_filters_score?

Thank you!

On Feb 16, 11:36 am, Shay Banon kim...@gmail.com wrote:


