How is fieldNorm calculated in the example?


I saw that the field-length norm is explained as norm(d) = 1 / √numTerms, and the "put it together" section of the following page:

gives one example:
the doc (the only doc in the corpus) is "quick brown fox"
the query is "fox"

However, the fieldNorm in this case is 0.5. As I understand it, the norm should be 1/√3 ≈ 0.577, since "quick brown fox" contains 3 terms.

Can someone tell me where this 0.5 comes from?

Thanks in advance.

Hi @duo_yi,

Interesting question. :slight_smile:

You can see the formula directly in the Lucene source code.

If you calculate the value using the formula, you get norm("quick brown fox") = 1 / √3 = 0.57735

However, it would be far too wasteful to store this value in the index at full float precision. So Lucene uses a trick: it reduces the precision and encodes the norm in a single byte. You can see the respective method a few lines below.

If you try to encode and decode the value using the following snippet you get exactly 0.5 (which is a sufficient precision for this purpose):
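
A minimal sketch of that round trip, assuming the one-byte norm encoding that Lucene exposes as `SmallFloat.floatToByte315` / `SmallFloat.byte315ToFloat` in the Lucene versions this example refers to. The class and method names below (`NormPrecisionSketch`, `encode`, `decode`) are illustrative only; the logic is re-implemented in plain Java so it runs without Lucene on the classpath:

```java
public class NormPrecisionSketch {
    // Offset of the smallest representable exponent in the "315" byte format.
    private static final int FZERO = (63 - 15) << 3;

    /** Lossy: squeezes a 32-bit float into a single byte. */
    public static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - 3);            // drop the 21 low-order bits
        if (small <= FZERO) {
            // Zero or negative input maps to 0; tiny positive values map to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FZERO + 0x100) {
            return -1;                           // overflow: clamp to the largest byte value
        }
        return (byte) (small - FZERO);
    }

    /** Expands the byte back into a float; the dropped bits come back as zeros. */
    public static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float exact = (float) (1.0 / Math.sqrt(3)); // norm("quick brown fox")
        float stored = decode(encode(exact));
        System.out.println("exact norm:  " + exact);
        System.out.println("stored norm: " + stored);
    }
}
```

The encoding truncates rather than rounds, and the two representable values around 0.577 are 0.5 and 0.625, so the stored norm comes back as 0.5.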



Hi Denial,

Thanks so much for the reply.

I found the encode-decode explanation as well, but I forgot to withdraw the question.

Still, many thanks for your response. I hope it helps others who have the same question.

No worries @duo_yi. :slight_smile: