Increasing relevance with additional matching terms, but with constant scores


(Jorge T) #1

Hi all,

I am working on an autocomplete implementation, and am trying to do the
following:

Given a user query: minneapolis hotel ivy

I want to be able to query one or more fields, and boost the relevancy for
each term that matches. Many are already jumping up and saying "the match
query already does that!", but the catch is, we rely heavily on popularity
for sorting, so 3 terms matching should always give the same base level of
relevance. Using match, the term frequencies come into play, and the score
for "minneapolis hotel ivy" will be different from "boston hotel ivy", with
3 terms matching in both, which is not what we want.

A simple solution is to tokenize these terms in the calling code and use a
bool query with one term clause per token, each wrapped in constant_score.
For example:

{
  "bool": {
    "should": [
      { "constant_score": { "query": { "term": { "name": "minneapolis" } }, "boost": 1 } },
      { "constant_score": { "query": { "term": { "name": "hotel" } }, "boost": 1 } },
      { "constant_score": { "query": { "term": { "name": "ivy" } }, "boost": 1 } }
    ]
  }
}
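
To make the calling-code side concrete, here is a minimal Python sketch of building that query from the raw user input. The function name build_constant_score_query is hypothetical, and lowercasing plus whitespace splitting is only an assumption standing in for whatever analysis the "name" field really uses. The "query" key inside constant_score follows the example above; newer Elasticsearch versions may expect "filter" there instead.

```python
import json


def build_constant_score_query(user_query, field="name", boost=1):
    """Build a bool query with one constant_score term clause per token.

    Hypothetical helper: tokenization here is a naive lowercase +
    whitespace split, which is an assumption, not what an analyzer does.
    """
    tokens = user_query.lower().split()
    should = [
        {"constant_score": {"query": {"term": {field: token}}, "boost": boost}}
        for token in tokens
    ]
    return {"bool": {"should": should}}


query = build_constant_score_query("minneapolis hotel ivy")
print(json.dumps(query, indent=2))
```

Because every clause carries the same fixed boost, the idea is that a document's base score depends on how many clauses match rather than on how rare each term is.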

This is fine for English and most Western languages, but becomes more
complicated and undesirable in other languages, primarily Chinese,
Japanese, and Korean. I want to be able to use custom tokenizers there,
and just leave the dirty work to ES.
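
A small Python sketch of why client-side splitting breaks down here. The helper whitespace_tokens is hypothetical, and the Chinese string is just an illustrative input ("Beijing hotel", written without spaces as is usual):

```python
def whitespace_tokens(text):
    """Naive client-side tokenization: split on whitespace only."""
    return text.split()


# English input has spaces between words, so each word becomes a token.
english = whitespace_tokens("minneapolis hotel ivy")

# Chinese input normally has no spaces, so the whole string stays one token.
chinese = whitespace_tokens("北京酒店")

print(len(english), english)
print(len(chinese), chinese)
```

The English query yields three tokens, while the Chinese one comes back as a single unsplit token, so the bool query would end up with only one clause. Getting sensible tokens there needs a language-aware tokenizer, which is exactly the work I would rather leave to ES.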

I'm fairly certain there isn't an easy solution here, but just wanted to
double check. Thanks in advance!


