How do I identify and remove words from a search phrase that don't exist in my Elasticsearch index using suggesters?

Richard_Mann · October 31, 2019, 7:58am

Hello, I'm trying to get support on the following:

In summary, please assume the following for my situation:

I am trying to create a 'did you mean' feature using the ES suggesters.
My goal is to turn this "greem dress banana" into this "green dress", where:
- "greem" is a badly spelt word that should be corrected to the word "green" as "green" exists in my index.
- "dress" is a correct word and should be left alone and present in the suggestion as it exists in the index.
- "banana" is a word/term that does not exist anywhere in my index and so should be removed completely.

What I CAN do:

Correcting "greem" to "green" works fine.

What I cannot do:

Identify that "banana" does not exist in the index at all and so decide to remove it.

OK, so when I call the suggester service with the phrase "greem dress banana" like so:

{"suggest":{"text":"greem+dress+banana","correction-1":{"term":{"field":"_docs.product_name","suggest_mode":"always"}}}}

Note: I have various fields to check for suggestions as you can see above.

This returns something like this (note that my code converts this to an array, but this is just the same as the JSON returned from ES):

The key issue I have with this response is as follows.

In the two entries below, I cannot tell the difference between a word that is correct and exists in the index (and so has no suggestions) and a word that does not exist at all (and so also has no suggestions).

                            [1] => Array
                                (
                                    [text] => dress
                                    [offset] => 6
                                    [length] => 5
                                    [options] => Array
                                        (
                                        )

                                )

                            [2] => Array
                                (
                                    [text] => banana
                                    [offset] => 12
                                    [length] => 6
                                    [options] => Array
                                        (
                                        )

                                )

I really feel like this is a missing feature. ES must be able to tell that "dress" has no suggestions because it is correct and was found, so why can it not simply indicate this in its response and save me from having to add complex extra logic (collate, additional searches to check each individual word for existence, etc.)?

Have I missed something? Is there a way to tell the term suggester to let me know the difference between terms without suggestions due to not existing vs. simply being correct?
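
For illustration, the "additional logic" mentioned above could be sketched roughly as follows in Python. This is a workaround under stated assumptions, not a built-in suggester feature: the suggest entries are assumed to be already parsed into dicts shaped like the response shown earlier, the names rebuild_phrase, count_query_body and indexed_terms are hypothetical, and the score/freq values in the sample data are made up. In real use, term_exists would issue a _count request per token (body as built by count_query_body) and test whether the count is greater than zero; here it is stubbed with a set so the example runs without a cluster. The sketch also assumes that any entry with options should be corrected, which may not hold for every word when suggest_mode is "always".

```python
from typing import Callable, Dict, List


def count_query_body(field: str, token: str) -> Dict:
    """Body for a _count request checking whether `token` occurs in `field`.

    Assumes `token` is already in analyzed form, as the suggest entries are.
    """
    return {"query": {"term": {field: token}}}


def rebuild_phrase(entries: List[Dict], term_exists: Callable[[str], bool]) -> str:
    """Build a 'did you mean' phrase from term-suggester entries.

    - entry has options       -> use the highest-scoring suggestion
    - no options, term exists -> keep the original word
    - no options, term absent -> drop the word
    """
    kept = []
    for entry in entries:
        options = entry.get("options", [])
        if options:
            best = max(options, key=lambda o: o.get("score", 0.0))
            kept.append(best["text"])
        elif term_exists(entry["text"]):
            kept.append(entry["text"])
        # otherwise: nothing to suggest and not in the index, so skip it
    return " ".join(kept)


# Sample entries mirroring the response above; score and freq are invented.
sample_entries = [
    {"text": "greem", "offset": 0, "length": 5,
     "options": [{"text": "green", "score": 0.8, "freq": 12}]},
    {"text": "dress", "offset": 6, "length": 5, "options": []},
    {"text": "banana", "offset": 12, "length": 6, "options": []},
]

# Stand-in for the index lookup (hypothetical); replace with a real _count call.
indexed_terms = {"green", "dress"}

print(rebuild_phrase(sample_entries, lambda t: t in indexed_terms))
# → green dress
```

Only tokens that come back with empty options need the extra lookup, so the number of additional requests is at most the number of such words in the phrase.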

system · November 28, 2019, 7:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
