Terms order and boosting impact on performance


(Michael Korbakov) #1

Hi everyone!

Is there any way to give documents with a set of keywords different
relevance scores depending on keyword order?
For instance, if I have two documents with keywords like "AAA BBB CCC"
and "CCC DDD EEE", I'd like the second document to be scored
higher than the first one when searching for "CCC". This represents
the situation where the most important keywords are entered first and
less important ones come after them.

Another question, more out of pure curiosity: is it possible to
harm search performance by setting the 'boost' parameter in a query?
Potentially it could, as it involves an additional operation during score
computation, but maybe someone has had a chance to compare it in real use
cases?

-- Michael Korbakov


(Shay Banon) #2

This is possible, but it requires writing a specialized, somewhat smart analyzer
and probably a query for it. I can't think of another simple way to do it
(for now). I do something along those lines for the _all field, where
boosted fields retain their boost information when aggregated into the _all
field.

-shay.banon

On Thu, Jul 15, 2010 at 4:16 AM, Mykhailo Korbakov rmihael@gmail.com wrote:



(Michael Korbakov) #3

I've managed to work around this problem by adding a special unique
token to the beginning of my keyword list and then doing a phrase
search with a very large slop (I tried 1000). So my keyword list now looks
like this: "keywords": ["LONGUNIQUETOKENATLISTSTART", "AAA", "BBB",
"CCC", "DDD"]. After running a sloppy phrase search with "query":
"\"LONGUNIQUETOKENATLISTSTART CCC\"", I get results with scores that
depend on the position of CCC relative to the start of the list.
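
A minimal sketch of this workaround in Python, for anyone who wants to try it. It only builds the request bodies and does not talk to a cluster. The helper names (make_doc, make_query, sloppy_factor) are made up for illustration, and the match_phrase form with "slop" is an assumed equivalent of the quoted-phrase query above, not necessarily the exact query used. The sloppy_factor function is a toy model of the 1 / (distance + 1) phrase-frequency factor from Lucene's classic similarity. The real score also includes other factors (idf, norms), and it assumes keywords occupy consecutive positions, which may not hold if the field mapping adds a position gap between array values.

```python
import json

# Sentinel token from the post; it marks position 0 of every keyword list.
START = "LONGUNIQUETOKENATLISTSTART"


def make_doc(keywords):
    """Build a document body with the sentinel prepended (hypothetical helper)."""
    return {"keywords": [START] + list(keywords)}


def make_query(keyword, slop=1000):
    """Build a sloppy phrase query: sentinel followed by the keyword.

    Assumption: a match_phrase query with a large slop stands in for the
    quoted-phrase query string described in the post.
    """
    return {
        "query": {
            "match_phrase": {
                "keywords": {"query": f"{START} {keyword}", "slop": slop}
            }
        }
    }


def sloppy_factor(doc, keyword):
    """Toy model of the classic sloppy phrase factor, 1 / (distance + 1).

    Assumes one token per keyword at consecutive positions, so the distance
    is how far the keyword sits from the slot right after the sentinel.
    """
    position = doc["keywords"].index(keyword)  # sentinel is at position 0
    distance = position - 1                    # exact phrase expects position 1
    return 1.0 / (distance + 1)


doc_a = make_doc(["AAA", "BBB", "CCC"])
doc_b = make_doc(["CCC", "DDD", "EEE"])

print(json.dumps(make_query("CCC"), indent=2))
print("doc_a factor:", sloppy_factor(doc_a, "CCC"))
print("doc_b factor:", sloppy_factor(doc_b, "CCC"))
```

Under these assumptions the document with CCC right after the sentinel gets the larger factor, which matches the ordering asked for in the first post.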

On Aug 1, 2:18 pm, Shay Banon shay.ba...@elasticsearch.com wrote:


