How should you handle the tokenization and indexing of phrases such as 'gluten-free', 'no gluten', and 'without dairy'? Presumably the method chosen also has consequences for the querying side?
What you're bringing up here is one of the classic problems of natural language processing. Negation has many variants, and those variants are language-dependent (and sometimes region-dependent). "Gluten-free" and "no gluten" are the two examples you give, but "not gluten-free" is also a reasonable permutation that acts as an allowable double negative, while "not for gluten-sensitive consumers" means the opposite. Unfortunately, there is no simple answer to this type of problem.

The most comprehensive approach is to write language processors (per language, and potentially per region) that decompose text into some kind of parse tree. If you dive deeply into this, you'll quickly end up in computational linguistics territory, where people discuss how "not" heads a NegP. That isn't inherently bad, but it's a level of depth many people asking this type of question aren't ready to take on.
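Short of a full parse tree, a common middle ground is to normalize known surface patterns into structured facets at index time, so that "gluten-free", "no gluten", and "not gluten-free" all map to an explicit (ingredient, contains?) pair that can be queried consistently. Here is a minimal sketch of that idea; the patterns, ingredient list, and function names are illustrative assumptions, not an exhaustive or production-ready grammar:

```python
import re

# Illustrative ingredient vocabulary (an assumption for this sketch).
INGREDIENTS = {"gluten", "dairy", "nuts"}

# Ordered patterns: the more specific double-negation form must be
# tried first, so "not gluten-free" is not misread as "gluten-free".
PATTERNS = [
    (re.compile(r"\bnot\s+(\w+)-free\b"), True),         # "not gluten-free" -> contains it
    (re.compile(r"\b(\w+)-free\b"), False),              # "gluten-free" -> does not contain it
    (re.compile(r"\b(?:no|without)\s+(\w+)\b"), False),  # "no gluten", "without dairy"
    (re.compile(r"\bcontains\s+(\w+)\b"), True),         # "contains nuts"
]

def extract_facets(text):
    """Return a {ingredient: contains?} mapping for patterns found in text."""
    facets = {}
    remaining = text.lower()
    for pattern, contains in PATTERNS:
        for match in pattern.finditer(remaining):
            ingredient = match.group(1)
            if ingredient in INGREDIENTS and ingredient not in facets:
                facets[ingredient] = contains
        # Blank out consumed matches so a later, less specific pattern
        # cannot re-match part of an already-handled phrase.
        remaining = pattern.sub(" ", remaining)
    return facets
```

At query time you then match on the facet rather than the raw text, which sidesteps the tokenization question entirely for the patterns you recognize. The obvious limitation is the one described above: every unanticipated phrasing ("not for gluten-sensitive consumers") silently falls through.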
Realistically, this type of parsing gets very complex very quickly. Many practitioners avoid that complexity and assume a user will read the text rather than trying to interpret it computationally for them. In many cases it is easier, and actually less error-prone, to ask the text submitter for this information explicitly.
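Asking the submitter explicitly usually means capturing the claims as structured fields alongside the free text, so nothing has to be inferred from phrasing at all. A minimal sketch, with hypothetical field names chosen for this example:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical submission record: dietary claims are declared directly
# by the submitter instead of being parsed out of the description text.
@dataclass
class DietaryClaims:
    contains_gluten: Optional[bool] = None  # None means "not declared"
    contains_dairy: Optional[bool] = None
    contains_nuts: Optional[bool] = None

# The submitter ticks "gluten-free" on the form; dairy stays undeclared.
claim = DietaryClaims(contains_gluten=False)
```

The three-valued field (True/False/None) matters here: an undeclared claim is not the same as a declared negative, and keeping that distinction avoids reintroducing the ambiguity you were trying to parse your way out of.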