Question about boost and scoring


(Evgeniy Galkin) #1

Hello.
I have the following problem.
I have n types of objects, each with several text fields.
I want to assign priorities to these fields, so that the search results
show all matches from the highest-priority field first, then matches
from the next field, and so on.

Example:
Let's imagine we have 2 types: type_a and type_b. We have fields
type_a.f1, type_a.f2 and type_b.f1.
I want my search results to list matches from type_a.f1 first, then
matches from type_a.f2, and finally matches from type_b.f1.

I've experimented with boost and found the following.

  1. Usually the _score value for a document without boosting is between
    0.1 and 10.
  2. As far as I understand, the _score of a boosted document is boost *
    _score.
  3. If the two statements above are right, the boost for a
    higher-priority field must be at least 100 times the boost for the
    next lower one to guarantee priority ordering. That means boost = 1
    for type_b.f1 (score between 0.1 and 10), boost = 100 for type_a.f2
    (score between 10 and 1000) and boost = 10000 for type_a.f1 (score
    between 1000 and 100000).
    But with 4 types of 4 fields each (16 priority levels), the boost for
    the highest-priority field would have to be 100^15 = 10^30.
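
The arithmetic in point 3 can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the post (unboosted scores between 0.1 and 10, so a ratio of 100 between adjacent priority levels); the function name boost_ladder and its parameters are hypothetical and not part of any Elasticsearch API. Note that with a factor of 100 per level, 16 levels come out at 100^15 = 10^30 for the top boost.

```python
def boost_ladder(levels, score_ratio=100):
    """Return one boost per priority level, lowest priority first.

    score_ratio is the assumed ratio between the largest and smallest
    unboosted score (10 / 0.1 = 100 in the post). Adjacent levels must
    differ by at least this factor for the score ranges not to overlap.
    """
    return [score_ratio ** i for i in range(levels)]


# 3 priority levels: type_b.f1, type_a.f2, type_a.f1
print(boost_ladder(3))

# 4 types with 4 fields each = 16 priority levels
print(boost_ladder(16)[-1])
```

The ladder grows exponentially with the number of levels, which is why the boost values become impractically large so quickly.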

Is this normal, or is there another way to order matches by the priority
of the type and field? What is the numeric range of the boost value?

P.S. Yes, I could run 4 or 16 queries instead of one (one query per
field), but that is inconvenient because I show all results as a
single paginated list (using the "size" and "from" API parameters).

P.S. 2. Sorry for my bad English, I hope that you understand me.


(Shay Banon) #2

Do you set the boost while indexing? That's a bit problematic since it
loses precision. Maybe you can boost while querying instead?
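
A minimal sketch of what boosting at query time might look like, written as a Python dict that would be sent as the search request body. It assumes an Elasticsearch version in which the `match` query is available and accepts a per-clause `boost`; the helper name prioritized_query, the field names and the boost values are placeholders taken from the question, not a recommended configuration. A `bool` query adds up the scores of all matching `should` clauses, so a document that matches several fields scores higher than the single-field arithmetic suggests.

```python
import json


def prioritized_query(text, field_boosts, start=0, size=10):
    """Build a search body with one boosted match clause per field.

    field_boosts is a list of (field_name, boost) pairs; both the
    names and the values here are illustrative assumptions.
    """
    should = [
        {"match": {field: {"query": text, "boost": boost}}}
        for field, boost in field_boosts
    ]
    # "from" and "size" give pagination over the single combined result list.
    return {"query": {"bool": {"should": should}}, "from": start, "size": size}


body = prioritized_query("some text", [("f1", 10000), ("f2", 100)])
print(json.dumps(body, indent=2))
```

Because the boosts live in the query rather than in the index, they can be changed without reindexing.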

On Fri, Apr 27, 2012 at 7:57 PM, Evgeniy Galkin evgeniy@parkflyer.ru wrote:


(system) #3