Tags with variable boost values

(Adam Zell) #1

I am trying to model a document that has a collection of tags, where each
tag's boost value depends on the total number of tags in that document. The
operating assumption is that the more tags a document has, the less boost
each matching tag should contribute to a query. Let's assume that a document
can have at most 5 tags. For a concrete example, document 1 has two tags:
"foo" and "bar". Calculating the boost value for each tag works as follows
in pseudo-Ruby:

  1. Calculate the total number of points for the overall boost in the

total, val = 0, 5

tags.each { |tag|
  total += val; val -= 1
}

For document 1, total = (5 + 4) = 9.

  2. Calculate each tag's boost from the tag's position and the total number
    of points

val = 5

tags.each { |tag|
  # to_f avoids integer division, which would give 0 for every tag
  tag.boost = val.to_f / total; val -= 1
}

"foo" has a boost of (5 / 9), while "bar" has (4 / 9).
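
Putting the two steps together, here is a minimal runnable sketch of the calculation. The helper name tag_boosts and its max_points parameter are made up for illustration; Rational is used only so the fractions stay exact.

```ruby
# Hypothetical helper: position-weighted boosts that sum to 1.
# max_points = 5 mirrors the "at most 5 tags" assumption above.
def tag_boosts(tags, max_points = 5)
  raise ArgumentError, "at most #{max_points} tags" if tags.size > max_points

  # Points per position: 5 for the first tag, 4 for the second, and so on.
  points = tags.each_index.map { |i| max_points - i }
  total = points.sum

  # Rational keeps exact fractions such as 5/9; call .to_f for a float boost.
  tags.zip(points).to_h { |tag, pts| [tag, Rational(pts, total)] }
end

boosts = tag_boosts(%w[foo bar])
boosts.each { |tag, boost| puts "#{tag}: #{boost} (#{boost.to_f.round(3)})" }
# foo: 5/9 (0.556)
# bar: 4/9 (0.444)
```

With three tags the same helper yields 5/12, 4/12 and 3/12, which is why the boost for a given tag position changes from document to document.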

Any other document where the number of tags is != 2 will have different
boost values per tag. Because of this, I don't think query-time boosting
is a good fit.

One option I have thought of is to create 5 new strings in the schema:
tag_1 to tag_5. Each string will contain a tag, and its associated boost
value set at index time. Then a query would match against tag_1 to tag_5
instead of the tags collection. Is there a cleaner way to do this?
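
To make the tag_1 to tag_5 idea concrete, here is a rough sketch of how a document's tags could be spread over positional slots before indexing. The tag_N_boost companion fields and the tag_slots helper are assumptions for illustration only; how the boost would actually be attached at index time is left open.

```ruby
# Hypothetical layout: one slot per tag position, each with its own boost.
# The tag_N / tag_N_boost names are illustrative, not an Elasticsearch convention.
def tag_slots(tags, max_tags = 5)
  # Same point scheme as in the post: 5, 4, 3, ... by position.
  points = tags.each_index.map { |i| max_tags - i }
  total = points.sum.to_f

  doc = {}
  tags.each_with_index do |tag, i|
    doc["tag_#{i + 1}"] = tag
    doc["tag_#{i + 1}_boost"] = points[i] / total
  end
  doc
end

doc = tag_slots(%w[foo bar])
doc.each { |field, value| puts "#{field}: #{value}" }
```

Slots beyond the document's last tag are simply absent, so a query over tag_1 to tag_5 would only see the fields that exist.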


(system) #2