Scoring variable-length documents


(slushi) #1

I need to index documents whose lengths vary widely: some are only a
dozen words, while others run to a few thousand. I am running into scoring
issues caused by length norms. If I leave norms enabled, shorter documents
score too highly. If I omit them, longer, less relevant documents
score too highly. Is there a clever solution here? I was thinking of
creating two separate fields ("short" and "long") with omitNorms set on the
short field, then at index time putting the text into one of the two fields
based on its length. It's not perfect by any means, so I was wondering whether
anyone else has experience with this kind of situation.
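
To make the idea concrete, here is a rough Python sketch of the index-time routing, plus a quick look at how large the length-norm gap is. It assumes the classic Lucene TF-IDF length norm of 1/sqrt(numTerms) and ignores boosts and the lossy encoding Lucene applies when storing norms. The cutoff of 50 tokens, the whitespace tokenization, and the function names are all made up for illustration. The mapping that disables norms on the "short" field is left out because the option name depends on the Elasticsearch version.

```python
import math

# Hypothetical cutoff in whitespace-separated tokens; tune it for your corpus.
SHORT_DOC_MAX_TOKENS = 50


def classic_length_norm(num_terms: int) -> float:
    """Length norm of the classic TF-IDF similarity: 1 / sqrt(numTerms).

    Simplification: index-time boosts and Lucene's lossy norm encoding
    are ignored here.
    """
    return 1.0 / math.sqrt(num_terms)


def route_by_length(text: str, cutoff: int = SHORT_DOC_MAX_TOKENS) -> dict:
    """Return a document body with the text in either 'short' or 'long'."""
    num_tokens = len(text.split())  # crude stand-in for the real analyzer
    field = "short" if num_tokens <= cutoff else "long"
    return {field: text}


if __name__ == "__main__":
    # A 12-word document gets roughly 15.8x the norm of a 3000-word one.
    print(round(classic_length_norm(12) / classic_length_norm(3000), 1))  # 15.8
    print(route_by_length("only a dozen words or so"))
```

A query would then have to search both fields, and scores from "short" (no norms) and "long" (with norms) are not directly comparable, which is the imperfection I mentioned.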



(system) #2