Scoring variable length documents

(slushi) #1

I need to index documents with highly variable lengths. Some may be just a
dozen words, while others run to a few thousand. I am running into scoring
issues due to length norms. If I leave norms enabled, shorter documents
are scored too highly. If I omit them, longer documents that are less
relevant are scored too highly. Is there a clever solution here? I was
thinking of creating two separate fields ("short", "long") and setting
omitNorms on the short field. Then, at index time, I would put the text in
one of the two fields based on its length. It's not perfect by any means, so
I was wondering whether anyone else has experience with this kind of situation.
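
To make the idea concrete, here is a rough sketch of the index-time routing step in Python. The 100-word cutoff, the function name, and the whitespace-based word count are all assumptions for illustration; the mapping that disables norms on the "short" field is left out because the exact option depends on the Elasticsearch version.

```python
# Hypothetical cutoff: texts with at most this many words go to the "short" field.
SHORT_DOC_MAX_WORDS = 100


def build_document(text, max_short_words=SHORT_DOC_MAX_WORDS):
    """Return an index-ready dict with the text in either "short" or "long".

    Word count is approximated by splitting on whitespace, which is only a
    stand-in for whatever the real analyzer would produce.
    """
    word_count = len(text.split())
    field = "short" if word_count <= max_short_words else "long"
    return {field: text}


if __name__ == "__main__":
    print(build_document("only a dozen words or so"))
```

Queries would then have to search both fields, since any given document populates only one of them.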

(system) #2