What is the best way to search for part of a term?


(Ngọc Phạm) #1

Hi, I'm a newbie in Elasticsearch and I have a question. Please help me. Thank you :)

I have a term with many letters. I want to find it when the query is missing some letters or has letters in the wrong position. What is the best way to do that? Fuzzy, wildcard, ngram, ...?

Example: I have a term like "collaboration". How can I find it by searching for "ollabor" or "ration"?

(I'm trying ngrams where each letter is one term, with min and max both set to 1. But in my opinion that is not good, because it returns many incorrect results.)

Thanks for your help!


(Ivan Brusic) #2

Perhaps if you share your current approach, others can comment on where you
potentially might have gone wrong. Are you setting both the min and max
ngram to 1? Can you share your mapping and query?

Using a min ngram of 1 will definitely lead to results that can be
construed as "not true", especially if not using edge ngrams. You are
increasing recall at the cost of precision. What is your max ngram
setting? If you are looking only for parts of a word, and do not care about
errors (deletions/insertions/transpositions), you can probably achieve
better scoring by applying ngrams only at index time and not at query
time.

Cheers,

Ivan
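
To make the index-time versus query-time distinction concrete, here is a minimal pure-Python sketch. It does not call Elasticsearch: the `ngrams` helper, the sample `docs` list, and both search functions are hypothetical stand-ins that only mimic how character n-grams behave, assuming lowercasing on both sides.

```python
def ngrams(text, min_gram, max_gram):
    """Return the set of lowercase character n-grams of `text`."""
    text = text.lower()
    return {
        text[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(text) - n + 1)
    }


# Hypothetical sample documents, one term each.
docs = ["collaboration", "Collaboration", "cocacola"]


def search_ngram_both(query, min_gram=1, max_gram=1):
    """N-grams at index AND query time: every query gram must occur in the document."""
    query_grams = ngrams(query, min_gram, max_gram)
    return [d for d in docs if query_grams <= ngrams(d, min_gram, max_gram)]


def search_ngram_index_only(query, min_gram=1, max_gram=15):
    """N-grams at index time only: the whole lowercased query must equal one indexed gram."""
    return [d for d in docs if query.lower() in ngrams(d, min_gram, max_gram)]


print(search_ngram_both("colla"))
print(search_ngram_index_only("colla"))
print(search_ngram_index_only("ollabor"))
```

With single-letter grams on both sides, any document that contains the letters c, o, l and a matches "colla", whatever their order. With n-grams applied only to the indexed text, the query has to appear as one contiguous piece, as long as it is not longer than the assumed `max_gram`.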


(Ngọc Phạm) #3

Thanks for your help.
It is like my example.
I have one term that appears as "collaboration" and "Collaboration" (maybe also "collaBorAtion" ...).
What is the best way to search with the key "colla" so that all of them appear?
Wildcard with colla*? I use a not analyzed field in this case.
With ngram I use min = max = 1 and search with a bool query on "c", "o", "l", "l", "a", or on the letters of "ollabor" or "ration". (Each letter is one bool clause, but this returns many incorrect results; for example "cocacola" can appear =)) )

What is the best way? Is there any way to do it?
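
As a rough illustration of the wildcard option on values that are stored exactly as written, the sketch below uses Python's `fnmatchcase` as a stand-in for wildcard matching. It is not Elasticsearch code; the `wildcard` helper and the `terms` list are hypothetical, and the only assumption is that `*` matches any run of characters and that matching is case-sensitive.

```python
from fnmatch import fnmatchcase

# The three spellings from the post, stored exactly as written (no analysis).
terms = ["collaboration", "Collaboration", "collaBorAtion"]


def wildcard(pattern, values):
    """Case-sensitive wildcard match, where `*` stands for any run of characters."""
    return [v for v in values if fnmatchcase(v, pattern)]


# Against the raw values, the capitalised spelling is missed.
prefix_hits = wildcard("colla*", terms)

# Lowercasing the stored values first makes every spelling match.
lowercased = [t.lower() for t in terms]
normalized_hits = wildcard("colla*", lowercased)

# An inner fragment such as "ollabor" needs a leading wildcard as well.
infix_hits = wildcard("*ollabor*", lowercased)

print(prefix_hits)
print(normalized_hits)
print(infix_hits)
```

So a plain `colla*` only covers the case variants if the stored value is normalised first, and fragments from the middle of the word need a leading `*`, which is usually the slow kind of wildcard pattern.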


(David Pilato) #4

Ngram is the way to go.

But you need to use an ngram analyzer at index time and a simple analyzer at search time.

If your user searches for Collab, then collab won't be split into c, o, l, l, a, b.

It will give more accurate results.
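
A possible shape for that setup, written as Python dictionaries that could be sent as JSON request bodies. This is only a sketch under assumptions: it targets a recent Elasticsearch version (typeless mappings, `index.max_ngram_diff`), and the field name `title`, the names `substring_tokenizer` and `substring_index`, and the gram sizes 3 to 15 are all made up for illustration.

```python
import json

index_body = {
    "settings": {
        # Recent versions limit max_gram - min_gram; raise the limit to fit 3..15.
        "index": {"max_ngram_diff": 12},
        "analysis": {
            "tokenizer": {
                "substring_tokenizer": {  # hypothetical name
                    "type": "ngram",
                    "min_gram": 3,
                    "max_gram": 15,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "substring_index": {  # hypothetical name
                    "type": "custom",
                    "tokenizer": "substring_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        },
    },
    "mappings": {
        "properties": {
            "title": {  # hypothetical field
                "type": "text",
                # N-grams are produced only when documents are indexed.
                "analyzer": "substring_index",
                # The built-in "simple" analyzer lowercases and splits on
                # non-letters, so "Collab" stays one token: "collab".
                "search_analyzer": "simple",
            }
        }
    },
}

# A match query; the field's search_analyzer is applied to "Collab".
search_body = {"query": {"match": {"title": {"query": "Collab", "operator": "and"}}}}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

With these assumed sizes, a query shorter than 3 or longer than 15 characters will not match any indexed gram, so `min_gram` and `max_gram` should be chosen to fit the fragments users are expected to type.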

