Is there any way to give documents with a set of keywords different
relevance scores depending on keyword order?
For instance, if I have two documents with keywords like "AAA BBB CCC"
and "CCC DDD EEE", then I'd like the second document to be scored
higher than the first one when searching for "CCC". This represents the
situation where the most important keywords are entered first and less
important ones come after them.
Another question, more out of pure curiosity: is it possible to
harm search performance by setting the 'boost' parameter in a query?
Potentially it could, as it involves an additional operation during score
computation, but maybe someone has had a chance to compare it in real use
cases?
This is possible, but it requires writing a specialized, somewhat smart analyzer
and probably a query to go with it. I can't think of another simple way to do it
(for now). I do something along those lines for the _all field, where
boosted fields retain their boost information when aggregated into the _all
field.
-shay.banon
On Thu, Jul 15, 2010 at 4:16 AM, Mykhailo Korbakov rmihael@gmail.com wrote:
Hi everyone!
Is there any way to give documents with a set of keywords different
relevance scores depending on keyword order?
For instance, if I have two documents with keywords like "AAA BBB CCC"
and "CCC DDD EEE", then I'd like the second document to be scored
higher than the first one when searching for "CCC". This represents the
situation where the most important keywords are entered first and less
important ones come after them.
Another question, more out of pure curiosity: is it possible to
harm search performance by setting the 'boost' parameter in a query?
Potentially it could, as it involves an additional operation during score
computation, but maybe someone has had a chance to compare it in real use
cases?
I've managed to work around this problem by adding a special unique
token to the beginning of my keyword list and then doing a phrase
search with a very large slop (I tried 1000). So my keyword list now looks
like this: "keywords": ["LONGUNIQUETOKENATLISTSTART", "AAA", "BBB",
"CCC", "DDD"]. After running a sloppy phrase search with "query":
"\"LONGUNIQUETOKENATLISTSTART CCC\"" I get results whose scores
depend on the position of CCC relative to the start of the list.
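For anyone who wants to try the same trick, here is a rough Python sketch of it. The "keywords" field and the marker token are the ones from my message; passing the slop through query_string's phrase_slop is only an assumption about how to set it, and the helper names (with_start_marker, build_query, approx_tf) are made up for illustration. The sloppy_freq function mirrors the 1 / (distance + 1) weighting that Lucene's classic default similarity applies to sloppy phrase matches; approx_tf is just a toy estimate that ignores idf and norms and assumes consecutive keywords sit at consecutive token positions (no position gap between array values).

```python
import json
import math

START_TOKEN = "LONGUNIQUETOKENATLISTSTART"  # marker token from the post


def with_start_marker(keywords):
    """Prepend the marker so every keyword's position is measured from it."""
    return [START_TOKEN] + list(keywords)


def build_query(keyword, slop=1000):
    """Request body for a sloppy phrase search: "<marker> <keyword>".

    Assumption: the slop is set through query_string's phrase_slop.
    """
    return {
        "query": {
            "query_string": {
                "default_field": "keywords",
                "query": f'"{START_TOKEN} {keyword}"',
                "phrase_slop": slop,
            }
        }
    }


def sloppy_freq(distance):
    """Weight of one sloppy phrase match in Lucene's classic similarity."""
    return 1.0 / (distance + 1)


def approx_tf(keywords, keyword):
    """Toy tf factor for `keyword`; assumes consecutive token positions."""
    tokens = with_start_marker(keywords)
    position = tokens.index(keyword)  # the marker sits at position 0
    distance = position - 1           # the phrase expects the keyword at position 1
    return math.sqrt(sloppy_freq(distance))


first = approx_tf(["AAA", "BBB", "CCC"], "CCC")   # CCC is the last keyword
second = approx_tf(["CCC", "DDD", "EEE"], "CCC")  # CCC is the first keyword
print(second > first)
print(json.dumps(build_query("CCC"), indent=2))
```

The further the keyword is from the marker, the larger the match distance and the smaller its contribution to the score, which is why the "CCC DDD EEE" document comes out ahead. The slop only has to be large enough to cover the longest keyword list.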