How does Elasticsearch calculate the field-length norm?

Youxu · March 26, 2015, 5:36am

According to "Theory Behind Relevance Scoring" (http://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html), Elasticsearch calculates the field-length norm as follows:

norm(d) = 1 / √numTerms

But in my testing, the values actually returned do not match this formula.

These are the documents I indexed:

{
  "title" : "quick brown fox"
}

{
  "title" : "quick fox"
}

Then I search for "fox" with the following query:
POST /vsmtest/test/_search?explain
{
  "query" : {
    "match" : {"title":"fox"}
  }
}

The resulting norm values are as follows:

doc 1:
{
  "value": 0.5,
  "description": "fieldNorm(doc=0)"
}
doc 2:
{
  "value": 0.625,
  "description": "fieldNorm(doc=0)"
}

Can anyone help me understand how 0.5 and 0.625 are calculated from the formula?
norm(d) = 1 / √numTerms
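
For reference, the unencoded values that the formula predicts can be checked with a short Python sketch. It assumes the analyzer keeps every word of each title, so numTerms is 3 and 2; raw_norm is just an illustrative helper name.

```python
import math

def raw_norm(num_terms: int) -> float:
    """Field-length norm exactly as the formula states it: 1 / sqrt(numTerms)."""
    return 1.0 / math.sqrt(num_terms)

print(round(raw_norm(3), 3))  # "quick brown fox" -> 0.577
print(round(raw_norm(2), 3))  # "quick fox"       -> 0.707
```

Neither value matches the 0.5 and 0.625 reported by explain, which is the discrepancy in question.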

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/11f6dcea-6704-4a56-93f7-21bf89840789%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

masaru · March 27, 2015, 3:02am

Hi,

I believe it's because the field norm is encoded in a single byte.
See http://lucene.apache.org/core/4_10_2/core/org/apache/lucene/search/similarities/DefaultSimilarity.html

Masaru
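
The single-byte round trip can be sketched in Python. This is a minimal sketch modeled on what Lucene 4.x's SmallFloat.floatToByte315 and byte315ToFloat appear to do; the Python function names are illustrative rather than a Lucene API, and the bit layout is an assumption that should be checked against the Lucene source.

```python
import math
import struct

ZERO_EXP_OFFSET = 63 - 15  # assumed offset used by the "315" encoding

def float_to_byte315(f: float) -> int:
    """Squeeze a 32-bit float into one byte (0..255), dropping low bits."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw float32 bits
    small = bits >> 21
    fzero = ZERO_EXP_OFFSET << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> 1
    if small >= fzero + 0x100:
        return 255                     # overflow -> largest byte value
    return small - fzero

def byte315_to_float(b: int) -> float:
    """Expand the byte back into a float; the dropped bits come back as zeros."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + (ZERO_EXP_OFFSET << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]

def field_norm(num_terms: int) -> float:
    """1 / sqrt(numTerms) after an encode/decode round trip through one byte."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))

print(field_norm(3))  # "quick brown fox" -> 0.5
print(field_norm(2))  # "quick fox"       -> 0.625
```

Under these assumptions, 1/√3 ≈ 0.577 is truncated to 0.5 and 1/√2 ≈ 0.707 to 0.625, which matches the values shown by explain.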

Youxu · March 30, 2015, 3:47am

Thanks Masaru, so it is precision loss from encoding the norm into a single byte and decoding it again. That makes sense.

