b: Controls to what degree document length normalizes tf values. The default value is 0.75.
I think this is what I want.
Coming back to the "norms" setting approach: it seems it not only disables the field length norm, but also all other normalization factors. So if I just want to disable the field length norm, I should use a custom similarity with b=0.
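For reference, here is a rough sketch of what such a custom similarity could look like as an index-creation body, built as a Python dict so it can be dumped to JSON. The index-level "similarity" settings block and the per-field "similarity" mapping parameter are the ES 5 mechanisms; the names no_length_norm, doc and title are made up for illustration, and k1=1.2 is simply the BM25 default.

```python
import json


def build_index_body(b=0.0, k1=1.2):
    """Build a create-index body with a custom BM25 similarity.

    'no_length_norm', 'doc' and 'title' are hypothetical names chosen
    for this example; substitute your own.
    """
    return {
        "settings": {
            "index": {
                "similarity": {
                    # Custom similarity: BM25 with length normalization off (b=0)
                    "no_length_norm": {"type": "BM25", "b": b, "k1": k1}
                }
            }
        },
        "mappings": {
            "doc": {
                "properties": {
                    # Only this field uses the custom similarity
                    "title": {"type": "text", "similarity": "no_length_norm"}
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent when creating the index
    print(json.dumps(build_index_body(), indent=2))
```

Unlike "norms": false, this leaves norms in the index and only changes how the similarity uses them for that field.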
The ES 5 documentation does not want to be too specific. Similarity algorithms may choose to use norms very differently, not just field length based norms. But, in ES 5 with BM25, a field length norm is used.
In an ES 5 field mapping, you can disable norm generation for a field with
"norms": false
You can see this as a decision whether the field should contribute to the Similarity algorithm or not.
You are correct that BM25 uses not only field-based normalization but also document-based normalization.
Setting b to 0 switches length normalisation off, so document length no longer affects the final score. Setting b to 1 applies full length normalisation.
The factor b can be interpreted to control the strength of short documents being pushed to the top. 0.0 means do not push at all, 1.0 means full strength push.
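To make the effect of b concrete, here is a small sketch of the term-frequency part of BM25 in its usual textbook form, tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl)). It is not Lucene's actual code (Lucene stores field lengths in a lossy encoded form, so real scores differ slightly), and the frequencies and lengths below are made-up numbers.

```python
def bm25_tf_component(freq, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency part of BM25 (the IDF factor is left out).

    b scales how strongly doc_len / avg_doc_len influences the score:
    b=0 ignores length entirely, b=1 applies full length normalization.
    """
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return freq * (k1 + 1.0) / (freq + k1 * length_norm)


if __name__ == "__main__":
    # Same term frequency, one short and one long document (illustrative values)
    freq, avg_len = 2, 100
    for b in (0.0, 0.75, 1.0):
        short = bm25_tf_component(freq, doc_len=50, avg_doc_len=avg_len, b=b)
        long_ = bm25_tf_component(freq, doc_len=200, avg_doc_len=avg_len, b=b)
        print(f"b={b}: short={short:.4f} long={long_:.4f}")
```

With b=0 the short and the long document get the same value; as b grows towards 1 the short document is pushed further ahead of the long one.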