What is Elasticsearch's best feature to achieve what I need?

Hi, I'm relatively new to Elasticsearch and I'm not sure what the best approach is to achieve what I want, but this is what I've been trying:

I have a list of dog breeds in an index and, given the input "0293 Corki USA", I want Elasticsearch to return the breed "Corki": just the single best match.

  1. The input breed can have multiple words, e.g. German Shepherd.
  2. The input may contain some junk, like "0293".
  3. The input may be misspelled, e.g. Corik instead of Corki.
  4. I want to avoid false positives as much as possible. For example, the input "English Terrier" should not return "English Setter".

I've been focusing on the Match Phrase Query and getting mixed results. Right now I'm exploring the Phrase Suggester and trying to use its suggestions as possible results.

Do I have better options to accomplish this?


@Nuno_Oliveira Finding relevant results is quite a complex topic and requires a lot of experimentation. I found the book Relevant Search by Doug Turnbull to be very useful.


Hi @Nuno_Oliveira Welcome to the community.

In addition to @mayya's excellent suggestion, if you are new to search in general, perhaps you should take a look at Elastic App Search (there is a free / basic version). It has some pretty powerful search and relevance tuning capabilities that are fairly easy to use through an interface, so it might be a good way to learn about search and meet your objectives.

Here is a quick start training to App Search to get you up and running quickly.

If you would rather just start with the docs, here are the App Search Docs.


A common technique is to stack multiple interpretations of the user’s search input string in different query clauses within one request to Elasticsearch. Only one clause has to match, but the more that match, the better. The clauses differ in strictness, and a boost can be given to each clause so that matches on stricter clauses rank highest. A “bool” query holds the various clauses in a “should” array.
The clauses you can use, in order of strictness, are:

  • match_phrase - all words have to match in the right order
  • match with AND - all words must be somewhere in the document
  • match with OR - at least one of the words must be in the document (more matching words = more score)
  • fuzzy or “ngram” matching - match on bits of words rather than whole words.
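
As a rough illustration of the stacked clauses above, here is a minimal sketch in Python that builds such a request body. The field name `breed`, the helper name `build_breed_query`, and the boost values are assumptions made up for this example; they are not from the original question and would need tuning against real data. The fuzzy clause uses a `match` query with `fuzziness` set to `AUTO`, which is one way to tolerate misspellings (an ngram-analyzed subfield would be another).

```python
import json


def build_breed_query(user_input: str, field: str = "breed") -> dict:
    """Build a bool/should query that stacks clauses from strict to loose.

    `field` and the boost values are hypothetical and only illustrate
    the idea that stricter clauses get a higher boost.
    """
    should_clauses = [
        # Strictest: all words must appear, in the same order.
        {"match_phrase": {field: {"query": user_input, "boost": 4.0}}},
        # All words must appear somewhere in the field.
        {"match": {field: {"query": user_input, "operator": "and", "boost": 3.0}}},
        # At least one word must appear; more matching words score higher.
        {"match": {field: {"query": user_input, "operator": "or", "boost": 2.0}}},
        # Loosest: tolerate small misspellings such as "Corik" for "Corki".
        {"match": {field: {"query": user_input, "fuzziness": "AUTO", "boost": 1.0}}},
    ]
    return {
        "size": 1,  # only the single best match is wanted
        "query": {
            "bool": {
                "should": should_clauses,
                "minimum_should_match": 1,  # at least one clause has to match
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the index's _search endpoint.
    print(json.dumps(build_breed_query("0293 Corki USA"), indent=2))
```

For an input with junk around the breed, such as "0293 Corki USA", the phrase and AND clauses would likely not match a document containing only "Corki", so the OR and fuzzy clauses do the work; for a clean input like "German Shepherd", the stricter clauses also match and push the exact breed to the top. To limit false positives, one option is to reject hits whose score falls below a threshold chosen by experimentation.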
