Field Norm is always equal to 1


(Rodya Mirov) #1

I have discovered that when doing an ngram search, the fieldnorm quantity is not useful, because it is always equal to one. Is this intentional? How can it be fixed?

In more detail: I have configured my index to use an ngram filter (say ngram length 1, so there aren't too many tokens). I search for the letter "a". I would expect "asian" to rank fairly high - a relatively large proportion of the tokens from this document (2/5 = 40%) are exactly equal to the search term. However, longer words like "Paragonimiasis", where a smaller proportion of the tokens (3/14 = 21.4%) match the search term, beat it out (in fact "asian" is not even in the top 5000 hits!).
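To make the arithmetic concrete, here is a small Python sketch of the proportions quoted above. It only mimics a length-1 ngram filter by splitting each word into lowercase characters; the helper name `unigram_match_ratio` is made up for illustration and is not part of Elasticsearch.

```python
def unigram_match_ratio(word: str, term: str) -> float:
    """Fraction of length-1 ngram tokens in `word` that equal `term`.

    Assumption: a length-1 ngram filter (after lowercasing) yields one
    token per character, which is what this function imitates.
    """
    tokens = list(word.lower())
    return tokens.count(term.lower()) / len(tokens)


for word in ("asian", "Paragonimiasis"):
    ratio = unigram_match_ratio(word, "a")
    print(f"{word}: {ratio:.1%} of tokens match 'a'")
```

This is the ordering I would expect length normalization to reward: the shorter document has the higher share of matching tokens.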

When I run the query with explain on, I see that every field norm is equal to one. I guess this is a side effect of the ngram filter - maybe elasticsearch just assumes there will be too many tokens and sets the fieldnorm to 1 without checking? When I try a different index without the ngram filter, fieldnorm works as intended. However, I do need ngrams. Is there a way to get the fieldnorm behavior to work as documented, even with an ngram filter?
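For reference, this is roughly what I expected the field norms to look like. The sketch assumes the classic TF-IDF length normalization, 1 / sqrt(number of tokens in the field), with no index-time boost; the function name `expected_field_norm` is hypothetical, and the values that explain reports would be coarser, since the norm is stored in a lossy one-byte encoding.

```python
import math


def expected_field_norm(num_tokens: int) -> float:
    """Length norm under the assumed classic formula: 1 / sqrt(num_tokens).

    Assumption: no index-time boost, and every ngram counts as one token.
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return 1.0 / math.sqrt(num_tokens)


# One token per character for a length-1 ngram filter (an assumption).
for word in ("asian", "Paragonimiasis"):
    print(f"{word}: {len(word)} tokens, norm ~ {expected_field_norm(len(word)):.3f}")
```

Under that assumption the shorter word should get the larger norm, and a norm of exactly 1 is what a field containing a single token would get.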

