How is fieldNorm calculated in the example?


I saw the field norm explained as norm(d) = 1 / √numTerms, and the "put it together" section of the following page:

gives one example:
the doc (the only doc in the corpus) is "quick brown fox"
the query is "fox"

However, the fieldNorm in this case is 0.5. As I understand it, the norm should be 1/√3, since "quick brown fox" contains 3 terms.

Can someone tell me where this 0.5 comes from?

Hi @duo_yi,

You can see the formula directly in the Lucene source code.

If you calculate the value using the formula, you get norm("quick brown fox") = 1 / √3 = 0.57735

However, it would be far too wasteful to store this value in the index at full precision, so Lucene reduces the precision and encodes the norm into a single byte. You can see the respective method a few lines below.

If you encode and decode the value using the following snippet, you get exactly 0.5 (which is sufficient precision for this purpose):
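A minimal Python sketch of that encode/decode round trip. It assumes the classic TF-IDF similarity of older Lucene versions, which packs the norm into one byte via SmallFloat.floatToByte315 and unpacks it with byte315ToFloat. The function names below are my own Python ports written for illustration, not Lucene's API:

```python
import math
import struct

# Offset used by the 8-bit format ("zero exponent" of 15), as in Lucene's SmallFloat.
BIAS = (63 - 15) << 3


def float_to_byte315(f: float) -> int:
    """Port (assumed equivalent) of SmallFloat.floatToByte315; returns 0..255."""
    # Reinterpret the 32-bit float as a signed int, like Float.floatToRawIntBits.
    bits = struct.unpack("<i", struct.pack("<f", f))[0]
    small = bits >> (24 - 3)      # drop the low 21 bits; this is where precision is lost
    if small <= BIAS:             # underflow: zero/negative -> 0, tiny positive -> 1
        return 0 if bits <= 0 else 1
    if small >= BIAS + 0x100:     # overflow: clamp to the largest byte value
        return 255
    return small - BIAS


def byte315_to_float(b: int) -> float:
    """Port (assumed equivalent) of SmallFloat.byte315ToFloat."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << (24 - 3)) + ((63 - 15) << 24)
    return struct.unpack("<f", struct.pack("<i", bits))[0]


norm = 1 / math.sqrt(3)               # "quick brown fox" has 3 terms
encoded = float_to_byte315(norm)
decoded = byte315_to_float(encoded)
print(f"{norm:.5f}", encoded, decoded)  # prints: 0.57735 120 0.5
```

Because the low bits are simply truncated, every norm from 0.5 up to (but not including) 0.625 decodes to 0.5, so in this sketch a 3-term field and a 4-term field (1/√4 = 0.5) end up with the same stored norm.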



Hi Denial,

