Relative score threshold

I'm not sure I'm thinking about my problem correctly, so please correct my
thinking if necessary.

We're working with census-style data, names and addresses. Queries coming
in are for parts of the data and we build boolean queries based on
recognising parts of the query (titles, zip codes, states etc).

Our queries work well in terms of ranking results, and also return a high
number of results because we're using several fields. Within the results we
see, for good searches, a pattern in the scores.

The first result, or the first few results, are good matches and have high
scores; after those first few high-scoring results the scores drop
dramatically for the remaining (many) results.

What would help us in this situation would be min_score, if the score could
be relative to the highest score. We prototyped this by doing two
searches, one to find the highest score, then a second search with
min_score = highest_score * 0.8, and that works really well for us. What we
get is essentially a result set where the last result is at least 80% of
the score of the best match.
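
For reference, the two-search prototype looks roughly like the sketch below. It is a simplification, not our actual code: `search` is a hypothetical callable that takes a search request body and returns the parsed JSON response (for example a thin wrapper around the client's search call), and `relative_min_score_search` and `ratio` are names made up for illustration. It relies only on the `min_score` request parameter and the `hits.max_score` field of the response.

```python
from typing import Any, Callable, Dict

# Hypothetical type: takes a search request body, returns the parsed response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def relative_min_score_search(
    search: SearchFn,
    query: Dict[str, Any],
    ratio: float = 0.8,
    size: int = 10,
) -> Dict[str, Any]:
    """Run two searches so every returned hit scores >= ratio * best score."""
    # First pass: fetch only the top hit to learn the highest score.
    probe = search({"query": query, "size": 1})
    max_score = probe["hits"]["max_score"]

    # No hits at all: max_score is null, so there is nothing to threshold.
    if max_score is None:
        return probe

    # Second pass: same query, with an absolute min_score derived from
    # the best score found in the first pass.
    body = {"query": query, "size": size, "min_score": max_score * ratio}
    return search(body)
```

The cost is one extra round trip per query. The two passes are also not atomic, so if the index changes between them the threshold can be slightly stale.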

I can't help thinking I might be framing the problem wrong.

Any opinions/thoughts?

rob
