Field-length norm fails on fields with 3 and 4 words


(Fil ES) #1

Hello,

I am experiencing a very annoying behaviour in Elasticsearch's score
calculation: the field-length norm fails to distinguish between fields
that contain 3 and 4 words. It always returns the same score for both.
Example:

LANCA HOTEL EXTREME and MASSIVE AMAZING HOTEL GROUP

both come back with the same field length and are given the same
field-length norm.

I tried using the BM25 similarity instead of the default one and adjusting
its parameters, but the output was always the same.

Does anybody have any idea why this happens? It is extremely annoying,
as most fields in each document contain about 3-4 words.

Thank you,
Fil

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a007c1fc-a5c4-45f5-9f83-7f414831170b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Ivan Brusic) #2

The field norm is computed at index time and is stored in a single byte,
which can lead to a loss in precision. This behavior might have changed
with newer versions of Lucene, but probably not.
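
To make the precision loss concrete, here is a rough Python sketch of the
one-byte encoding. It is a re-implementation from memory of Lucene 4.x's
SmallFloat.floatToByte315 / byte315ToFloat, and it assumes the default
similarity stores 1/sqrt(number of terms) with no index-time boost. The
function names below are my own, so please check the details against the
Lucene source for your version.

```python
import math
import struct


def float_to_byte315(f: float) -> int:
    """Pack a float into one byte (assumed port of SmallFloat.floatToByte315)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw IEEE-754 bits
    small = bits >> (24 - 3)          # drop the low-order mantissa bits
    zero_point = (63 - 15) << 3
    if small <= zero_point:
        return 0 if bits <= 0 else 1  # underflow: zero or smallest positive
    if small >= zero_point + 0x100:
        return 255                    # overflow: clamp to the largest byte
    return small - zero_point


def byte315_to_float(b: int) -> float:
    """Unpack the byte back into a float (assumed port of byte315ToFloat)."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms: int) -> float:
    """Norm as it would come back after the index-time round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for n in (1, 2, 3, 4, 5):
    print(n, 1.0 / math.sqrt(n), stored_norm(n))
# 3 terms (0.577...) and 4 terms (0.5) both decode to 0.5
```

If this sketch matches the real encoding, 3-word and 4-word fields land on
the same byte, so they cannot be told apart at query time. As far as I
recall, BM25 in the same Lucene versions reads the field length from the
same byte, which would explain why changing its parameters made no
difference.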

Ivan
On Apr 30, 2015 6:42 PM, "Fil ES" lisowski.filip91@gmail.com wrote:



