Relevance in the range 0.0 to 1.0?

Is there a way to score documents so that the relevance score has a fixed
range, like from 0 to 1.0? The default scoring can return arbitrarily high
scores, depending on how many times the matching term appears in the
document.

It's tempting to want to normalize the score by the top-matching document,
but this is wrong since the top document isn't always a perfect match.
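
To make the problem concrete, here is a minimal sketch with made-up numbers.
The function name and the raw scores are hypothetical, not output from
Elasticsearch; the point is only that dividing by the top score makes every
result set look the same.

```python
def normalize_by_top(scores):
    """Divide every score by the highest score in the same result set."""
    top = max(scores)
    return [s / top for s in scores]


# Hypothetical raw scores for two different queries.
strong_matches = [12.0, 6.0, 3.0]    # top hit is a genuinely good match
weak_matches = [0.5, 0.25, 0.125]    # top hit barely matches at all

print(normalize_by_top(strong_matches))  # [1.0, 0.5, 0.25]
print(normalize_by_top(weak_matches))    # [1.0, 0.5, 0.25]
```

Both result sets come out identical, so the weak top hit gets 1.0 even though
it is a poor match.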

Are there other built-in scorers, or parameter settings that will do this?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/dbf1f137-b160-4e30-94b7-1cc9b8fb939e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I think you should read
this Log In - Apache Software Foundation

it might help you to make a point.

simon

On Wednesday, November 5, 2014 8:42:59 PM UTC+1, Dustin Boswell wrote:

Is there a way to score documents so that the relevance score has a fixed
range, like from 0 to 1.0? The default scoring can return arbitrarily high
scores, depending on how many times the matching term appears in the
document.

It's tempting to want to normalize the score by the top-matching document,
but this is wrong since the top document isn't always a perfect match.

Are there other built-in scorers, or parameter settings that will do this?

Glad to know lots of other people have been asking for it too :)

I agree that dividing the default relevance score by some constant (or some
number derived from the results) is a bad idea, for all the reasons that
article describes.

I was hoping there was a non-default scorer that is built to return scores
in the 0 to 1.0 range by design. At my company we have a home-grown search
engine that returns relevance scores in this range, and it works great.
(It's a pretty good algorithm; maybe I could discuss it further with the
team offline.) We're looking to use Elasticsearch for some of our
applications, and this feature would help.

I guess I could go down the road of writing a custom scoring algorithm (in
Java?), but I'm not sure how much of an undertaking that is.
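
Just to show the kind of thing I mean by a score that is bounded by design,
here is a rough sketch in Python. It is not our in-house algorithm and not
anything built into Elasticsearch; `saturate` and `pivot` are names I made up
for illustration. It squashes a non-negative raw score into the range 0 to 1
without looking at the other results.

```python
def saturate(raw_score, pivot=1.0):
    """Map a non-negative raw score into [0, 1) as raw / (raw + pivot).

    `pivot` is a hypothetical tuning constant: the raw score that maps to 0.5.
    """
    if raw_score < 0:
        raise ValueError("raw_score must be non-negative")
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    return raw_score / (raw_score + pivot)


print(saturate(0.0))   # 0.0
print(saturate(1.0))   # 0.5
print(saturate(3.0))   # 0.75
```

This keeps the ranking order and never reaches 1.0, but the choice of pivot
is arbitrary, and it does not by itself make scores from different queries
comparable.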

On Thursday, November 6, 2014 11:11:23 AM UTC-8, simonw wrote:

I think you should read this
Log In - Apache Software Foundation

it might help you to make a point.

simon
