Question about scoring behaviour


(Eric T) #1

Hello,

I'm running a test of my query and mapping, shown here:
https://gist.github.com/ewltang/9c00155525784b620ca9

I'm searching for "pauljones" in the uname field. In the results, the fifth
document, containing "pauljones10297", has a score of 16.027834, while the
sixth document, containing "PaulJones", has a score of 5.008698.
Why is the score for the fifth document so much higher than the sixth?

Regards,
Eric

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6dc50a09-1090-463d-b8d0-ac6186789509%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Brusic) #2

The difference is the fieldNorm. This field holds any boosts (both document
and field level) and any length normalization. It is only 1 byte, so it is
incredibly lossy. Did you apply an index time boost to either the field or
document?
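
To make the "1 byte, so it is incredibly lossy" point concrete, here is a rough Python sketch of the encoding. It is a port, from memory, of Lucene's SmallFloat.floatToByte315 / byte315ToFloat, which I believe is what DefaultSimilarity in Lucene 4.x uses to store norms; treat that as an assumption and check it against the Lucene source for your version. The helper name stored_norm and the 1/sqrt(number of terms) length normalization are part of the sketch, not something taken from Eric's gist.

```python
import math
import struct


def float_to_byte315(f):
    """Squeeze a float into one byte (sketch of Lucene's SmallFloat.floatToByte315)."""
    # Reinterpret the value as the raw bits of a 32-bit float.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> (24 - 3)
    if small <= ((63 - 15) << 3):
        # Zero, negative, or too small to represent.
        return 0 if bits <= 0 else 1
    if small >= ((63 - 15) << 3) + 0x100:
        # Too large to represent: clamp to the biggest byte value.
        return 255
    return small - ((63 - 15) << 3)


def byte315_to_float(b):
    """Expand the stored byte back into a float (sketch of byte315ToFloat)."""
    if b == 0:
        return 0.0
    bits = (b & 0xFF) << (24 - 3)
    bits += (63 - 15) << 24
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms, boost=1.0):
    """Hypothetical helper: the norm as it would read back after the 1-byte round trip.

    Assumes length normalization of boost / sqrt(num_terms).
    """
    exact = boost / math.sqrt(num_terms)
    return byte315_to_float(float_to_byte315(exact))


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 9, 14):
        print(n, 1.0 / math.sqrt(n), stored_norm(n))
```

Under these assumptions a field with 3 terms and a field with 4 terms read back the same norm (0.5), and a 2-term field reads back 0.625 rather than roughly 0.707. With ngram fields, where every value expands into many terms, that rounding can visibly move scores around.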

Have you tried disabling norms on ngram fields? Which version of
elasticsearch are you using? I noticed you used the old format
"omit_norms":true
instead of
"norms": { "enabled": false }

--
Ivan

On Thu, Mar 27, 2014 at 1:28 PM, Eric T ewltang@gmail.com wrote:

Hello,

I'm running a test of my query and mapping shown here:
https://gist.github.com/ewltang/9c00155525784b620ca9

I'm searching for "pauljones" in the uname field. In the results the fifth
document containing "pauljones10297" has a score of 16.027834, while the
6th document containing "PaulJones" has a score of 5.008698.
Why is the score for the 5th document so much higher than the 6th?

Regards,
Eric



(Eric T) #3

Hi Ivan,

No, I don't apply any boost at index time.

I have not disabled norms on the uname.autocomplete field; I will try that
and get back to you with the result. I'm using 0.90.2.

thanks
Eric



(Eric T) #4

I created a new index that includes both the old "autocomplete" multi-field
and a new multi-field called "autocompletenew" that sets omit_norms: true.

I did the same query on the two fields and the results are here

The scoring is consistent for both, but the query on the original field
seems to return results that make more sense to me. For example,
"PaulJones" is the first result, followed by PaulJones with one
numerical digit. The results of the second query are more random, with
"PaulJones" ranked second. The rest of the results contain longer
variations of PaulJones.

I was expecting the query on autocompletenew to return the results that
the query on the original field returns. I also didn't expect the first
query to return the results that I want, since that multi-field doesn't
have omit_norms: true. Is this the expected behaviour?


