2) I set a minimum score of 1 to cut off the long tail.
When I search for, say, 'first thir', it works like a charm. But if I
search for 'first th', there can be items that contain the exact word 'th',
so relevance breaks and every result gets a very low score. What I need is
to throw out items like 'th', 'third', etc., and to boost the score of
items that contain both 'first' and a word starting with 'th'. Matching the
whole title exactly against 'first th' wouldn't help, because I may also
need items like 'first second third'. I tried playing with slop, but that
doesn't solve the problem in general.
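
To make the requirement concrete, here is a minimal sketch of the matching
rule I am after, written in plain Python rather than against any search
library. The function names (tokenize, matches, search) and the sample
titles are made up for illustration, and it only filters; it does not
compute a real relevance score.

```python
def tokenize(text):
    # Naive whitespace tokenizer; a real analyzer would do more.
    return text.lower().split()


def matches(query, title):
    # Every query word except the last must appear as a whole word;
    # the last query word is treated as a prefix of some title word.
    terms = tokenize(query)
    if not terms:
        return False
    *complete, partial = terms
    words = tokenize(title)
    has_complete = all(term in words for term in complete)
    has_prefix = any(word.startswith(partial) for word in words)
    return has_complete and has_prefix


def search(query, titles):
    # Keep only titles that satisfy both conditions.
    return [title for title in titles if matches(query, title)]


titles = ["th", "third", "first th", "first second third", "first"]
results = search("first th", titles)
print(results)
```

With these sample titles, 'th', 'third' and 'first' are dropped, while
'first th' and 'first second third' are kept, which is the behaviour I
would like the real query to have.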