Scoring for a full text search with ngram filter

JeromeAvoustin · December 8, 2016, 4:30pm

Hi there!

I'm currently working on a full-text search (and I'm quite new to it).
I'd like to show the user the results that best fit the query string, updating them as the user types.
I've created a multi match query like this:

GET /streams/_search
{
  "query": {
    "multi_match": {
      "query": "event",
      "fields": [ "*_keywords", "name^2" ]
    }
  }
}

I've also created a custom autocomplete analyzer, exactly like the one described here (except that I used an ngram filter instead of an edge ngram filter):
https://www.elastic.co/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html

I have declared the analyzer in the index, and used it in the mapping of my document type, as described in the article.
When I run the search above, I get the results I'm looking for, but I'd like the scoring to be more accurate.
Let's say I create these documents:

PUT /streams/stream/123
{
  "name": "Event test"
}

PUT /streams/stream/456
{
  "name": "Eventually consistent"
}

PUT /streams/stream/789
{
  "name": "Another Event"
}

If I then search for the term "event", all three documents get the same score (with either the best_fields or the most_fields query type). I can understand that.
But I would like to improve the scoring so that the document named "Eventually consistent" gets a lower score than the others. Previously, I was using only the default analyzer (with no autocomplete) and the score varied depending on the weight the expression had in the field.
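
One way to see why the three names tie is to run a name through the analyzer with the _analyze API. This is only a minimal sketch: it assumes the analyzer is registered under the name autocomplete, as in the linked guide, and the body below would be sent to GET /streams/_analyze:

```json
{
  "analyzer": "autocomplete",
  "text": "Eventually consistent"
}
```

If the filter's gram lengths include 5 (the guide uses 1 to 20), the returned tokens should include the gram "event" for this name, just as they would for "Event test" and "Another Event". A query for "event" then finds the same term in all three documents, which is presumably why the analyzer alone cannot tell a whole-word match from a partial one.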

So my question is: how can I get a more accurate score in this situation?

I'm quite new to ES, so I might have missed something important...

Thanks!

dadoonet · December 9, 2016, 8:28am

I'd probably index the same field using different strategies (using multi fields):

standard (where it produces exact terms)
ngrams
whatever...
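
As a rough sketch of the multi-field idea, here is what the index definition could look like. It assumes Elasticsearch 5.x (on 2.x the field type would be string rather than text), the streams index and stream type from the question, and an autocomplete analyzer built like the one in the linked guide but with an ngram filter. The sub-field name autocomplete, the filter name and the gram lengths are illustrative choices, not something prescribed. This would be the body of PUT /streams when creating the index:

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase", "autocomplete_filter" ]
        }
      }
    }
  },
  "mappings": {
    "stream": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "autocomplete": {
              "type": "text",
              "analyzer": "autocomplete",
              "search_analyzer": "standard"
            }
          }
        }
      }
    }
  }
}
```

Here name keeps the exact terms from the standard analyzer, while name.autocomplete holds the ngrams of the same text.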

Then I'd use a bool query with 2 should clauses. The first one, with a boost of 3.0 for example, would use the "standard" strategy. The second clause would use ngrams with no boost, or with a boost of 0.5 for example.
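
Such a query could look roughly like this, sent as the body of GET /streams/_search. It assumes the exact terms are indexed in name and the ngrams in a sub-field called name.autocomplete (a hypothetical name here); the boost values are just the example figures mentioned above:

```json
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": { "query": "event", "boost": 3.0 } } },
        { "match": { "name.autocomplete": { "query": "event", "boost": 0.5 } } }
      ]
    }
  }
}
```

With the three documents from the question, the ngram clause should match all of them, but only "Event test" and "Another Event" also match the boosted exact-term clause, so they should rank above "Eventually consistent".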

Makes sense?

JeromeAvoustin · December 9, 2016, 8:53am

Yes, that might help!
I actually didn't know we could write such a query (I'm really a newbie).
I'll try that!

Thanks David!

JeromeAvoustin · December 9, 2016, 10:44am

So I created the multi-fields using standard and ngram strategies.
I didn't use a bool query, and kept a multi match query like this:

GET /streams/_search
{
  "query": {
    "multi_match": {
      "query": "event",
      "fields": [ "*_keywords^3", "name^6", "*_keywords.autocomplete", "name.autocomplete^2" ]
    }
  }
}

This gave a very interesting result, with more accurate scores.
I'll keep using it for some time and see how it behaves.

Thanks!

system · January 6, 2017, 10:44am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
