Document's Field level boosting


(freddj) #1

Hi,

From an IRC discussion, I understood (sorry if I'm wrong) that
per-document field-level boosting (as described at
http://lucene.apache.org/java/3_0_0/scoring.html and
http://lucene.apache.org/java/3_0_0/api/core/org/apache/lucene/document/Fieldable.html#setBoost(float)
) is NOT currently available. The problem seems to be that it is not
easy to design a nice API for it over JSON.
Solr does not have this issue, since its XML API lets you decorate
any field with arbitrary metadata. So, to set a boost on a specific
field of a specific document, you just write:

<field name="office" boost="2.0">Bridgewater</field>

(from http://wiki.apache.org/solr/UpdateXmlMessages)

We are in the middle of switching from Solr to elasticsearch, but we
rely on this missing feature. Having it would help a lot; without it,
we may not be able to switch at all.

Apart from the API issue, the feature does not look difficult to
implement.
Can we start discussing the API here?

Here is my two-cent proposal:
In the mapping, specify that the field is of string type:
{
"office" : {
"type": "string"
}
}

And to specify the field boost in a document, do it this way:

{
"office" : {
"_value": "Bridgewater",
"_boost": 2.0
}
}
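
To make the proposal concrete, here is a minimal sketch of how an indexer could read a field written in this format. It is only an illustration: `split_boosted_value` and the `name` field are hypothetical, not part of elasticsearch, and the default boost of 1.0 is an assumption.

```python
from typing import Any, Tuple


def split_boosted_value(raw: Any, default_boost: float = 1.0) -> Tuple[Any, float]:
    """Return (value, boost) for a field written in the proposed format.

    Hypothetical helper: a plain value keeps the default boost, while an
    object carrying "_value" (and optionally "_boost") is unwrapped.
    """
    if isinstance(raw, dict) and "_value" in raw:
        return raw["_value"], float(raw.get("_boost", default_boost))
    return raw, default_boost


# "office" uses the proposed boosted form; "name" is a plain string field.
doc = {
    "office": {"_value": "Bridgewater", "_boost": 2.0},
    "name": "Main office",
}
parsed = {field: split_boosted_value(raw) for field, raw in doc.items()}
print(parsed)
```

Fields that do not use the `_value` wrapper pass through unchanged, so existing documents would keep working under this reading of the proposal.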

What do you think of this?

Thanks


(Shay Banon) #2

Sounds like a good format, but it will only work if the object/field is explicitly mapped as a string; otherwise, _boost will be indexed as a field. Care to open an issue for this?
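
The dependence on the mapping can be illustrated with a small toy model. This is a sketch of the ambiguity only, not elasticsearch's actual mapping logic; `interpret_field` and its output shape are hypothetical.

```python
def interpret_field(name, raw, mapping):
    """Toy model: how a JSON value could be read, depending on the mapping.

    `mapping` maps field names to {"type": ...}. Only an explicit "string"
    mapping lets the {"_value", "_boost"} object be read as a boosted value.
    """
    mapped_type = mapping.get(name, {}).get("type")
    if isinstance(raw, dict):
        if mapped_type == "string" and "_value" in raw:
            boost = float(raw.get("_boost", 1.0))
            return {name: {"value": raw["_value"], "boost": boost}}
        # No explicit string mapping: the object looks like any other
        # nested object, so each key becomes its own sub-field.
        return {
            f"{name}.{key}": {"value": value, "boost": 1.0}
            for key, value in raw.items()
        }
    return {name: {"value": raw, "boost": 1.0}}


raw_office = {"_value": "Bridgewater", "_boost": 2.0}
with_mapping = interpret_field("office", raw_office, {"office": {"type": "string"}})
without_mapping = interpret_field("office", raw_office, {})
print(with_mapping)
print(without_mapping)
```

Without the explicit mapping, `_boost` ends up as an ordinary sub-field (`office._boost` in this model) instead of affecting scoring, which is the failure mode described above.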
On Monday, May 9, 2011 at 7:44 PM, freddj wrote:



(freddj) #3

Sure, here it is:
https://github.com/elasticsearch/elasticsearch/issues/920


(Michel Conrad) #4

Hi,

I was wondering whether I could use field-level boosts to
produce differently ranked result lists for the same search.

If I have several ranking criteria for my documents at index time,
my initial idea was to store each ranking as a field boost on a
separate field, one field per ranking.
At search time I could then choose a ranking by choosing which of
those fields to include in the query. Is this
the right way to do it, and is it possible with elasticsearch
at the moment?

Thanks,
Michel

On Mon, May 9, 2011 at 10:08 PM, freddj fastphilg@gmail.com wrote:

sure, here it is:
https://github.com/elasticsearch/elasticsearch/issues/920


(Shay Banon) #5

Yes, this is certainly possible with how elasticsearch works now, though index-time boosting will use fewer resources.
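
As a rough sketch of the query-time approach, the search body below selects fields and weights them with the `field^boost` syntax of the `query_string` query. The field names, boost values, and the `build_ranked_query` helper are assumptions made up for illustration.

```python
import json


def build_ranked_query(text, field_boosts):
    """Build a search body that weights the chosen fields at query time.

    `field_boosts` maps a field name to its boost; each entry is rendered
    as "field^boost" in the query_string `fields` list.
    """
    fields = [f"{field}^{boost}" for field, boost in field_boosts.items()]
    return {"query": {"query_string": {"query": text, "fields": fields}}}


# Two hypothetical ranking profiles over the same documents.
title_first = build_ranked_query("bridgewater", {"title": 3.0, "body": 1.0})
body_first = build_ranked_query("bridgewater", {"title": 1.0, "body": 3.0})
print(json.dumps(title_first, indent=2))
```

Each ranking is then just a different request body, so no reindexing is needed to change the weights; the trade-off mentioned above is that the boosting work happens at search time.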
On Tuesday, May 10, 2011 at 4:16 PM, Michel Conrad wrote:


