Text Search with more preference to prefix matches

Started to work on elasticsearch recently and it has been an interesting
couple of months.

I have indexed a few documents in Elasticsearch. Each document has fields
such as username, biography, post, interests, etc. I want to run a text
search for 'Smith' across all the indexed fields, with documents whose
'username' field starts with Smith ranking higher than documents with a
partial match on 'username' or a match on other fields such as post or
interests.

I am using Grails and have given username a higher boost, but I want the
document with username 'Smith' ranked higher than the document with
username 'Joe Smith' or documents where other fields have prefix or partial
matches. How do I approach this? Thanks.

--

You should get this automatically unless you set omit_norms to true in your
mappings.

In general, if you want to boost certain types of matches without
affecting the set of results that you are getting back, you can use the
Custom Filters Score Query:
http://www.elasticsearch.org/guide/reference/query-dsl/custom-filters-score-query.html
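As a rough sketch of that idea, the request body below wraps an ordinary
multi-field search in a custom_filters_score query and multiplies the score
of documents whose username starts with the search term. The helper name
buildBoostedSearch and the boost value are made up for illustration, and the
username_exact field is assumed to be an untokenized, lowercased copy of
username. Later Elasticsearch releases replaced custom_filters_score with
function_score, so treat this as a period-specific example.

```javascript
// Builds a request body for the (0.x-era) custom_filters_score query.
// Hypothetical helper: the name, boost value and field list are illustrative.
function buildBoostedSearch(text) {
  return {
    query: {
      custom_filters_score: {
        // The inner query alone decides which documents match.
        query: {
          multi_match: {
            query: text,
            fields: ['username', 'biography', 'post', 'interests']
          }
        },
        // Filters only adjust the score of documents that already matched.
        filters: [
          {
            // Assumes username_exact holds the whole username, lowercased,
            // as a single term, so a prefix filter means "starts with".
            filter: { prefix: { username_exact: text.toLowerCase() } },
            boost: 3
          }
        ],
        // Use the boost of the first filter that matches.
        score_mode: 'first'
      }
    }
  };
}

console.log(JSON.stringify(buildBoostedSearch('Smith'), null, 2));
```

Because the filter never adds or removes hits, the result set stays the same
and only the ordering changes.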

(In reply to Sara's post of Thursday, January 17, 2013 5:24:37 AM UTC-5,
above.)

--

On Fri, 2013-01-18 at 09:35 -0800, Igor Motov wrote:

    You should get this automatically unless you set omit_norms to true in
    your mappings.

Unfortunately, the norm is stored in a single byte, and so loses
granularity when dealing with very short strings like a username. The
difference between "smith" and "john smith" is too small for norms to
make a difference.

clint
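To get a feel for how coarse a one-byte norm is, here is a small sketch that
reimplements, from memory, the float-to-byte encoding used by Lucene's
classic similarity at the time, applied to the usual 1/sqrt(number of terms)
length norm. The function names are made up for this illustration,
index-time boosts are ignored, and the exact encoding may differ between
Lucene versions.

```javascript
// Sketch of a one-byte norm: only the exponent and the top mantissa bits
// of a 32-bit float survive. Reimplemented for illustration, not copied
// from any library.
const view = new DataView(new ArrayBuffer(4));

function floatToBits(f) { view.setFloat32(0, f); return view.getInt32(0); }
function bitsToFloat(b) { view.setInt32(0, b); return view.getFloat32(0); }

const ZERO_POINT = (63 - 15) << 3; // offset that maps small floats to byte 0

function encodeNorm(f) {
  const bits = floatToBits(f);
  const small = bits >> 21;                    // drop the low 21 bits
  if (small <= ZERO_POINT) return bits <= 0 ? 0 : 1;   // underflow
  if (small >= ZERO_POINT + 0x100) return 255;         // overflow
  return small - ZERO_POINT;
}

function decodeNorm(b) {
  if (b === 0) return 0;
  return bitsToFloat(((b & 0xff) << 21) + ((63 - 15) << 24));
}

// Length norm for a field with n terms, after a round trip through one byte.
function storedLengthNorm(n) {
  return decodeNorm(encodeNorm(1 / Math.sqrt(n)));
}

for (let n = 1; n <= 5; n++) {
  console.log(n + ' term(s): raw ' + (1 / Math.sqrt(n)).toFixed(4) +
              ', stored ' + storedLengthNorm(n));
}
```

The stored values move in large steps (three-term and four-term fields end
up identical here), and the norm is only one of several factors multiplied
into the final score.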

--

Yes, Clint is right here. If you have a very large index-time boost set for
this field, it can really skew the norms, so it might also be useful to
switch to query-time boosting.
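A minimal sketch of what query-time boosting could look like as a request
body: the weight is attached to the field in the query itself, so nothing
about it is baked into the stored norms. The helper name
buildQueryTimeBoost and the boost of 4 are arbitrary choices for
illustration.

```javascript
// Query-time boosting: the weight lives in the request, not in the index.
// Hypothetical helper; the caller picks the boost value.
function buildQueryTimeBoost(text, usernameBoost) {
  return {
    query: {
      multi_match: {
        query: text,
        // "field^N" multiplies that field's score contribution by N.
        fields: ['username^' + usernameBoost, 'biography', 'post', 'interests']
      }
    }
  };
}

console.log(JSON.stringify(buildQueryTimeBoost('Smith', 4), null, 2));
```

Changing the boost then only means sending a different query, with no
reindexing.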

(In reply to Clinton Gormley's message of Friday, January 18, 2013
12:38:04 PM UTC-5, above.)

--

If I understand correctly, you want exact matches on the username field to
be ranked higher than partial matches on username or matches on any other
fields.

What about indexing username twice, using two different tokenizers? One
that splits on whitespace and one that does not. Then if you want to boost
exact matches on username you can use a DisMaxQuery to give a higher boost
when the username is an exact match.
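Indexing the username twice might be set up roughly as follows. This is a
sketch under several assumptions: 'string' is the field type from
Elasticsearch releases of that era (newer ones use 'text'), and the names
type1, username_whole and username_exact are illustrative. The keyword
tokenizer keeps the whole value as one token, and the lowercase filter makes
the comparison case-insensitive.

```javascript
// Index settings and mapping that store the username in two forms.
// All names here are illustrative, not taken from a real index.
const indexBody = {
  settings: {
    analysis: {
      analyzer: {
        username_whole: {
          type: 'custom',
          tokenizer: 'keyword',   // emits the entire value as one token
          filter: ['lowercase']   // so 'Smith' and 'smith' are the same term
        }
      }
    }
  },
  mappings: {
    type1: {
      properties: {
        // Split into words: 'Joe Smith' becomes 'joe' and 'smith'.
        username: { type: 'string', analyzer: 'standard' },
        // Kept whole: 'Joe Smith' becomes the single term 'joe smith'.
        username_exact: { type: 'string', analyzer: 'username_whole' }
      }
    }
  }
};

// The application sends the same value to both fields when indexing.
function toIndexedDoc(user) {
  return Object.assign({}, user, { username_exact: user.username });
}

console.log(JSON.stringify(toIndexedDoc({ username: 'Joe Smith', post: 'hello' })));
```

With this in place, a search for 'Smith' matches username for both 'Smith'
and 'Joe Smith', but matches username_exact only for 'Smith'.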

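// Search every field for 'Smith'. DisMaxQuery scores each document by its
// best-matching clause, so the boosted username clauses win when they match.
ejs.Request()
    .indices(['index1', 'index2'])
    .types(['type1', 'type2'])
    .query(
        ejs.DisMaxQuery()
            .queries(ejs.MatchQuery('post', 'Smith'))
            .queries(ejs.MatchQuery('biography', 'Smith'))
            .queries(ejs.MatchQuery('interests', 'Smith'))
            // tokenized username: matches 'Smith' and 'Joe Smith'
            .queries(ejs.MatchQuery('username', 'Smith').boost(2))
            // untokenized copy of username: matches only the whole value
            .queries(ejs.MatchQuery('username_exact', 'Smith').boost(4)))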

This example can be extended/modified if you want to boost prefix matches
(e.g., by using PrefixQuery or SpanFirstQuery rather than MatchQuery).
Ultimately, I think you'll need to leverage DisMaxQuery.
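One way the prefix variant could look, written as a plain request body
rather than through elastic.js. The helper name buildPrefixAwareSearch and
the boost values are arbitrary, and the sketch assumes username_exact is
indexed as a single lowercased token, since prefix and term queries are not
analyzed and the search text therefore has to be lowercased by hand.

```javascript
// dis_max over four clauses, from weakest to strongest signal.
// Hypothetical helper; field names follow the example above.
function buildPrefixAwareSearch(text) {
  const term = text.toLowerCase(); // prefix/term queries skip analysis
  return {
    query: {
      dis_max: {
        queries: [
          // Any mention in the free-text fields.
          { multi_match: { query: text, fields: ['post', 'biography', 'interests'] } },
          // Username contains the word, e.g. 'Joe Smith'.
          { match: { username: { query: text, boost: 2 } } },
          // Username starts with the text, e.g. 'Smithson' or 'Smith Jones'.
          { prefix: { username_exact: { value: term, boost: 3 } } },
          // Username is exactly the text.
          { term: { username_exact: { value: term, boost: 4 } } }
        ]
      }
    }
  };
}

console.log(JSON.stringify(buildPrefixAwareSearch('Smith'), null, 2));
```

The boosts only tilt the ranking. Actual scores still depend on term
statistics in the index, so the values may need tuning against real data.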

--