Multi Match query with one field of edge n-grams and another field of stemmed terms

Hi All,

I’ve been doing a lot of research and I think I may have a solution, but I’m looking for some “best practice” advice!

I have a document with two fields (name and description). Both fields use the standard tokeniser but differ in their token filters. The name field uses edge n-grams to provide search-as-you-type results. The description field uses the Porter stemmer to reduce terms to their roots. Let’s assume any other filters are common to both fields.
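
To make the setup concrete, here is a minimal sketch of the index definition, written as a Python dict that serialises to the JSON request body. The analyser and filter names (`name_index_analyzer`, `description_analyzer`, `name_edge_ngrams`), the gram sizes and the shared `lowercase` filter are illustrative assumptions, and the typeless mapping assumes Elasticsearch 7 or later.

```python
import json

# All names and gram sizes here are hypothetical, chosen for illustration.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Edge n-grams for search-as-you-type on the name field.
                "name_edge_ngrams": {
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20,
                }
            },
            "analyzer": {
                "name_index_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "name_edge_ngrams"],
                },
                "description_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "porter_stem"],
                },
            },
        }
    },
    "mappings": {
        "properties": {
            "name": {"type": "text", "analyzer": "name_index_analyzer"},
            "description": {"type": "text", "analyzer": "description_analyzer"},
        }
    },
}

# Print the JSON body that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```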

I want to use a multi_match query to search across both fields (i.e. a cross_fields search). So I need to choose one search analyser that suits both fields (to avoid the best_fields/most_fields scoring problems).
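
The query in question might look roughly like the sketch below. The `analyzer` parameter of multi_match overrides the search analyser for every listed field; `shared_search_analyzer` is a placeholder name for whichever analyser ends up being chosen, and the `and` operator is an assumption about the desired matching behaviour.

```python
import json

def cross_fields_query(text, search_analyzer):
    """Build a cross_fields multi_match body that uses one search analyser."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "fields": ["name", "description"],
                "operator": "and",
                # Same analyser applied to the query text for both fields.
                "analyzer": search_analyzer,
            }
        }
    }

# "shared_search_analyzer" is a hypothetical analyser name.
query = cross_fields_query("I like cats", "shared_search_analyzer")
print(json.dumps(query, indent=2))
```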

So do I stem in the search analyser? I guess I have to, otherwise I won’t find the stemmed terms in the description field. Is this compatible with the edge n-grams in the name field? Logically I think it is, because the stemmed term is (probably?) a prefix of the original and so will match in both fields. Does this differ per language?
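
The prefix assumption can be checked with a small simulation. This sketch does not call Elasticsearch: `edge_ngrams` is a hypothetical helper that mimics what an edge n-gram filter emits for one token, and the stems are hardcoded from the Porter algorithm's documented rules (for example, its step that rewrites a trailing "y" to "i") rather than computed.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes an edge n-gram filter would emit for one token."""
    upper = min(len(token), max_gram)
    return {token[:n] for n in range(min_gram, upper + 1)}

# (original term, Porter stem) pairs, hardcoded rather than computed.
PORTER_EXAMPLES = [("cats", "cat"), ("running", "run"), ("happy", "happi")]

def stem_matches_name_field(original, stem, min_gram=1, max_gram=20):
    """True if the stemmed query term equals one of the indexed edge n-grams."""
    return stem in edge_ngrams(original, min_gram, max_gram)

results = {
    original: stem_matches_name_field(original, stem)
    for original, stem in PORTER_EXAMPLES
}
print(results)
```

In this sketch "cats" and "running" still match in the name field, but "happy" does not, because its stem "happi" is not a prefix of the original. A stem longer than `max_gram` would also miss. The same check can be repeated with stems produced by other languages' stemmers.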

An alternative approach might be to use the keyword_repeat token filter at search time and OR the terms that share a position:

“I like cats” would become “I AND like AND (cats OR cat)”.
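
A sketch of what such a search analyser could look like, together with a tiny simulation of the token stream it is expected to produce. The analyser name `stem_and_keep_original` is hypothetical. The filter chain (keyword_repeat, then the stemmer, then remove_duplicates to drop tokens the stemmer left unchanged) uses built-in filters, but the simulation only approximates their behaviour with a hardcoded stem table and whitespace splitting.

```python
# Hypothetical search analyser: emits each token and its stem at the same position.
analysis_settings = {
    "analysis": {
        "analyzer": {
            "stem_and_keep_original": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": [
                    "lowercase",
                    "keyword_repeat",
                    "porter_stem",
                    "remove_duplicates",
                ],
            }
        }
    }
}

# Hardcoded stems for the example sentence, not computed by a real stemmer.
STEMS = {"i": "i", "like": "like", "cats": "cat"}

def simulate_keyword_repeat(text):
    """Return (position, variants) pairs approximating the analyser output."""
    stream = []
    for position, token in enumerate(text.lower().split()):
        variants = [token]
        stem = STEMS.get(token, token)
        if stem != token:  # remove_duplicates drops an identical stem
            variants.append(stem)
        stream.append((position, variants))
    return stream

tokens = simulate_keyword_repeat("I like cats")
print(tokens)
```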

Does Elasticsearch support that kind of query?
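
For reference, the boolean form can at least be written out by hand with a bool query. The sketch below builds it for a single field from per-position term variants. `grouped_bool_query` is a hypothetical helper, and because term queries skip analysis, the variants must already be in the form the field has indexed. Whether a match or multi_match query collapses same-position tokens into this shape automatically is my understanding of how such tokens are treated (as synonyms), so it is worth confirming with the validate API's explain output.

```python
import json

def grouped_bool_query(field, positions):
    """AND across positions, OR across the variants within one position."""
    must = []
    for variants in positions:
        if len(variants) == 1:
            must.append({"term": {field: variants[0]}})
        else:
            must.append({
                "bool": {
                    "should": [{"term": {field: v}} for v in variants],
                    "minimum_should_match": 1,
                }
            })
    return {"query": {"bool": {"must": must}}}

# "I AND like AND (cats OR cat)" against the description field.
bool_query = grouped_bool_query("description", [["i"], ["like"], ["cats", "cat"]])
print(json.dumps(bool_query, indent=2))
```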

Any advice would be great!

