You cannot set an index-time boost: norms are omitted


(Vinicius Carvalho) #1

Hi there! When I set an index-time boost on a float field, I get exceptions
at the console.

The field mapping is as follows:

"elevation" : {"type" : "float", "store":"yes" , "include_in_all" :
"false", "boost": 3.0},

My idea was to index a property named elevation (to roughly mimic the
elevation component in Solr, but much simpler). Every document would have
a default value of 1.0.

If a certain document needs a boost, we can raise its value (say, to 1.5)
and that doc would get a higher score.

I wonder if the problem here is the fact that initially all docs would have
the same elevation value.

I know of at least a couple of other fields that will need the same
treatment (like popularity: all docs will begin with a default popularity
of 1.0, and it will increase or decrease as metrics are updated).

Regards


(Clinton Gormley) #2

On Mon, 2012-07-16 at 11:57 -0700, Vinicius Carvalho wrote:

Hi there! when combining the index boost in a float field I get
exceptions at the console.

Please gist a curl recreation of what you are doing

thanks

clint


(Michael McCandless) #3

Hi,

This exception ("You cannot set an index-time boost: norms are
omitted") was recently added to Lucene, to close a previous nasty trap
whereby you set boost but omitNorms on a given Field, and think the
boost is working, when in fact it was (previously) silently dropped.
Now you get an exception letting you know the boost won't work.

Here's the Lucene CHANGES entry (in Lucene 3.6.0):

  • LUCENE-3796, SOLR-3241: Throw an exception if you try to set an index-time
    boost on a field that omits norms. Because the index-time boost
    is multiplied into the norm, previously your boost would be
    silently discarded. (Tomás Fernández Löbbe, Hoss Man, Robert Muir)

I think ElasticSearch must turn off norms for a type: float field? I
think instead you should set the boost on your text field(s)?
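
To make the trap concrete, here is a toy Python model of it. This is only a sketch, not Lucene code: the function name, the `strict` flag, and the use of `ValueError` are made up for illustration, the length norm is assumed to be the classic 1/sqrt(number of terms), and the lossy one-byte encoding of the real norm is ignored.

```python
import math


def field_norm(boost: float, num_terms: int, omit_norms: bool,
               strict: bool = True) -> float:
    """Toy model of the per-field factor that index time contributes to scoring.

    strict=True imitates the newer behaviour (fail loudly);
    strict=False imitates the older behaviour (boost silently dropped).
    """
    if omit_norms:
        if boost != 1.0 and strict:
            # Stand-in for the exception discussed in this thread.
            raise ValueError(
                "You cannot set an index-time boost: norms are omitted")
        # No norm is stored, so scoring sees a neutral factor of 1.0
        # regardless of the boost that was requested.
        return 1.0
    # With norms enabled, the boost is multiplied into the length norm.
    return boost * (1.0 / math.sqrt(num_terms))
```

With norms enabled, a boost of 3.0 on a four-term field changes the stored factor; with norms omitted, the same boost either vanishes (old behaviour) or raises (new behaviour).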

Mike

http://blog.mikemccandless.com


(Vinicius Carvalho) #4

Thanks Mike, I ended up finding that issue. For now I've disabled the boost,
but the motivation behind it was to support something like the boost
functions that Solr has. I'd like to boost some documents based on certain
numeric fields they have: popularity, rating, elevation, downloads, mentions,
following. I know that one can do that with boost functions in Solr, but I'm
not sure how to do it in ES.

Regards


(uboness) #5

For custom scoring, you might want to look at:

http://www.elasticsearch.org/guide/reference/query-dsl/custom-score-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/custom-boost-factor-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/boosting-query.html

The closest equivalent to Solr's function queries is the custom_score query, where you can use a script to define any scoring logic you'd like (of course, the more complex the script, the higher the performance cost).
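
As a rough sketch of what such a request body might look like, here is a small Python helper that builds a custom_score query multiplying the text relevance score by a numeric field. The helper name, the `popularity` field, and the sample search text are assumptions taken from the use case in this thread; the script string follows the `_score * doc['field'].value` form used by custom_score scripts.

```python
import json


def custom_score_query(inner_query, field="popularity"):
    """Wrap inner_query so that each hit's score is multiplied by `field`.

    `field` is a hypothetical numeric field; every document is assumed
    to carry a value for it (for example a default of 1.0).
    """
    return {
        "query": {
            "custom_score": {
                "query": inner_query,
                "script": "_score * doc['%s'].value" % field,
            }
        }
    }


# Example body, ready to be sent to the _search endpoint.
body = custom_score_query({"query_string": {"query": "hiking boots"}})
print(json.dumps(body, indent=2))
```

Because the script runs for every matching document, keeping it to a single multiplication like this keeps the cost low.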

Cheers,
Uri

--
Uri Boness | Founder | ElasticSearch | www.elasticsearch.com | +31 6 4260 8767


(Vinicius Carvalho) #6

Thanks Uri, that's exactly what I was looking for :)

I'm aware of the costs; we are going to do some experimentation first to see
whether it is really worth it. One last thing: I think I've seen this group
mention the ability to change the score using something fancier. I think it
was mentioned that one could use plugins.

I'm asking because we have a recommendation engine running, and we would
like it to influence the search results (so we could get more personalized
output). Since the recommendations are per user, we would not index them;
instead, we would like to adjust the scores of the results coming out of ES
based on results from the recommendation engine.

So, my question is: would it be possible to change Lucene's similarity
calculation in ES?

Regards


(uboness) #7

You can customize similarity by defining your own similarity provider (see https://github.com/tlrx/elasticsearch-custom-similarity-provider).

But it sounds like a plugin would suit your use case better. You can define your own native script which looks up values in the recommendation engine (or an in-memory key/value db) and natively computes the score. For this, you'll need to extend AbstractSearchScript (or one of its abstract sub-classes), define your own NativeScriptFactory, and write a plugin which registers this factory in the ScriptModule. (Alternatively, you can skip the plugin part and configure it in the node's yaml config file.)
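
The native script itself would be Java written against the classes named above. The scoring logic it would carry out is small, though, and can be sketched in Python as a lookup followed by a multiplication. Everything named here is hypothetical and not part of Elasticsearch: `RECOMMENDATION_WEIGHTS` stands in for the recommendation engine (or key/value db), and `personalized_score` and `rerank` are illustrative helpers.

```python
# Hypothetical stand-in for the recommendation engine:
# user id -> {document id -> weight}.
RECOMMENDATION_WEIGHTS = {
    "user-1": {"doc-a": 2.0, "doc-c": 1.5},
}


def personalized_score(user_id, doc_id, base_score, default_weight=1.0):
    """Multiply the base relevance score by the user's weight for the doc.

    Documents (or users) the engine knows nothing about keep their
    original score via default_weight.
    """
    weights = RECOMMENDATION_WEIGHTS.get(user_id, {})
    return base_score * weights.get(doc_id, default_weight)


def rerank(user_id, hits):
    """Rescore (doc_id, base_score) pairs and sort them best-first."""
    rescored = [
        (doc_id, personalized_score(user_id, doc_id, score))
        for doc_id, score in hits
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

In a native script the same lookup would run once per matching document, so the weight table should be cheap to read (in memory, or fetched once per request).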

Read more: http://www.elasticsearch.org/guide/reference/modules/scripting.html

Cheers,
Uri

--
Uri Boness | Founder | ElasticSearch | www.elasticsearch.com | +31 6 4260 8767


(Clinton Gormley) #8

On Wed, 2012-07-18 at 20:13 +0200, Uri Boness wrote:

You might want to look at:

http://www.elasticsearch.org/guide/reference/query-dsl/custom-score-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/custom-boost-factor-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/boosting-query.html

Depending on your requirements, a custom_filters_score query can be a
very efficient way of manipulating the score:
http://www.elasticsearch.org/guide/reference/query-dsl/custom-filters-score-query.html

It just requires you to be able to express your rules as filters.

E.g. for recency scoring, you could have a filter that boosts by 5 if
publish-date < 1 day, by 3 if < 1 week, etc.

As an added benefit, the filters are cached, making them run efficiently
when you reuse them later.
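
The recency example could be sketched like this in Python. The helper name and the `publish_date` field are assumptions; the cut-off timestamps are computed on the client so the sketch does not rely on any server-side date arithmetic.

```python
from datetime import datetime, timedelta


def recency_boost_query(inner_query, now, date_field="publish_date"):
    """Build a custom_filters_score body that boosts recent documents.

    Boost 5 for documents newer than one day, 3 for documents newer than
    one week. With score_mode "first", only the first matching filter
    applies, so a one-day-old document gets 5 and not 3.
    """
    day_ago = (now - timedelta(days=1)).isoformat()
    week_ago = (now - timedelta(weeks=1)).isoformat()
    return {
        "query": {
            "custom_filters_score": {
                "query": inner_query,
                "filters": [
                    {"filter": {"range": {date_field: {"gte": day_ago}}},
                     "boost": 5},
                    {"filter": {"range": {date_field: {"gte": week_ago}}},
                     "boost": 3},
                ],
                "score_mode": "first",
            }
        }
    }
```

Rounding `now` to the hour or day before calling the helper keeps the filter bodies identical across requests, which should give the filter cache a chance to help.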

clint

