I came across some pretty crazy scoring behavior recently, where
certain matches on a field boosted at index-time had enormously high
field norms. After some illuminating discussion on the #lucene
channel, I tracked it down to this little nugget from the Lucene Javadoc:
"The boost is multiplied by Document.getBoost() of the document
containing this field. If a document has multiple fields with the same
name, all such values are multiplied together. This product is then
used to compute the norm factor for the field."
So basically, if every value carries the same index-time boost, that
boost is raised to the power of the number of values in the field
(and then multiplied by the document boost)!
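
To make the arithmetic concrete, here is a rough sketch of the multiplication the quoted passage describes. It is not Lucene code: `effective_field_boost` is a made-up helper name, the boost values are arbitrary, and it only models the boost product, not anything else that goes into the norm.

```python
import math

def effective_field_boost(doc_boost, value_boosts):
    """Hypothetical helper: multiply the document boost by the boost
    of every same-named field instance, as the quoted docs describe."""
    return doc_boost * math.prod(value_boosts)

# One value boosted 5x: the field boost is 5, as intended.
single = effective_field_boost(1.0, [5.0])

# Four values, each boosted 5x: 5 * 5 * 5 * 5 = 625.
multi = effective_field_boost(1.0, [5.0] * 4)

print(single, multi)  # 5.0 625.0
```

So a boost that looks modest on a single-valued field grows exponentially with the number of values.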
Since the whole concept of multi-valued fields is more or less just
sugar in Lucene, might it make more sense for ES to concatenate the
values of a multi-valued field and pass them to Lucene as a single
value? The index-time boost would then be applied once, no matter how
many values the field has, and I don't really see a downside.
Just a thought!