I am working on an index where I need to return results even for one-letter searches. The default analyzer currently uses an ngram tokenizer with a range of 1 to 5. My search query runs across multiple fields, and for some entries it scores documents higher when they match a bunch of unigrams and bigrams across multiple fields than documents matching longer ngrams (3, 4, 5). The margin is often small, but it is still higher.
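For context, this is a minimal sketch of my current setup (index and field names are placeholders; `index.max_ngram_diff` has to be raised because the min/max gap exceeds the default of 1):

```json
PUT /my_index
{
  "settings": {
    "index.max_ngram_diff": 4,
    "analysis": {
      "tokenizer": {
        "ngram_1_5": { "type": "ngram", "min_gram": 1, "max_gram": 5 }
      },
      "analyzer": {
        "ngram_analyzer": { "tokenizer": "ngram_1_5", "filter": ["lowercase"] }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "ngram_analyzer" }
    }
  }
}
```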
The nature of my search is that when looking for "eti", a document containing the contiguous substring, like "blablaetiblabla", should rank higher than one like "blaeibtiblaibe" that only matches scattered unigrams and bigrams across multiple fields (it is hard to provide a clean example, since the margin is often small and it only shows up for certain combinations of data).
How should I deal with this problem? I could add fields with shingles as well, but then I can only boost whole words. I was thinking about having the default field use ngrams 1-2 and a second field use ngrams 3-5, boosting the second one in the query, but this seems kind of hacky to me. Best would be a tokenizer with a minimum ngram of 3, combined with the ability to fall back to one- or two-character matching (only if no results are found for 3 and higher), but I am not sure if that is even possible.
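To make the two-field idea concrete, this is roughly what I have in mind (field names and boost values are made up; the subfield approach uses a standard multi-field mapping):

```json
PUT /my_index
{
  "settings": {
    "index.max_ngram_diff": 2,
    "analysis": {
      "tokenizer": {
        "ngram_1_2": { "type": "ngram", "min_gram": 1, "max_gram": 2 },
        "ngram_3_5": { "type": "ngram", "min_gram": 3, "max_gram": 5 }
      },
      "analyzer": {
        "short_grams": { "tokenizer": "ngram_1_2", "filter": ["lowercase"] },
        "long_grams":  { "tokenizer": "ngram_3_5", "filter": ["lowercase"] }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "short_grams",
        "fields": {
          "long": { "type": "text", "analyzer": "long_grams" }
        }
      }
    }
  }
}
```

and then at query time boost the long-ngram subfield:

```json
GET /my_index/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title":      { "query": "eti", "boost": 1 } } },
        { "match": { "title.long": { "query": "eti", "boost": 5 } } }
      ]
    }
  }
}
```

Is there a cleaner way to achieve the same ranking behavior?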