Splitting/tokenizing a Query based on content in an Index (for autosuggest/typeahead/query disambiguation)

Hi All:
I have this problem I am trying to solve:

Split a text query into multiple segments based on matches in an index. I have an index whose content is used for typeahead autosuggestions. Each entry in this index consists of a text value and an associated type. The query string would either match a single entity, wholly or partially, or be made up of parts where each part matches, partially or wholly, an entity of a different type. The analysis needs to happen on each keystroke, so that the results can drive better autosuggestions for constructing the textual query.

For example:
Query text: John Smith could return a set of documents where john smith matches the text value.

Consider the query:
john law enforcement - This could have 3 different interpretations:

  1. John is an entity value and law enforcement is the other entity value.
  2. John law is an entity and enforcement is the other entity.
  3. John law enforcement is the whole entity.
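
To make the interpretations above concrete, here is a minimal sketch of how the candidate segmentations could be enumerated before anything is looked up in the index. It assumes whitespace tokenization and lowercasing, and the function name `segmentations` is made up for illustration. Note that it also produces a fourth candidate, where all three words are separate entities, which is not in the list above.

```python
from itertools import combinations

def segmentations(query):
    """Yield every way to split the query's tokens into contiguous segments."""
    # Assumption: tokens are whitespace-separated and matching is case-insensitive.
    tokens = query.lower().split()
    n = len(tokens)
    if n == 0:
        return
    # Any subset of the n-1 gaps between tokens can serve as segment boundaries.
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [" ".join(tokens[a:b]) for a, b in zip(bounds, bounds[1:])]

for candidate in segmentations("john law enforcement"):
    print(candidate)
```

A query of n tokens yields 2^(n-1) candidates, so this stays cheap for typical typeahead queries but grows quickly for long ones.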

Considering this example, I need to eliminate any option whose values do not exist in ES, either wholly or partially.
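
The elimination step could be sketched as follows. This is only an illustration under stated assumptions: a plain Python dict stands in for the ES index, "partial match" is taken to mean "the segment is a prefix of an indexed text value", and the sample values, types, and function names are all hypothetical.

```python
# Hypothetical stand-in for the index: text value -> associated type.
INDEX = {
    "john smith": "person",
    "john law": "person",
    "law enforcement": "topic",
}

# The three interpretations of "john law enforcement" from the example.
CANDIDATES = [
    ["john", "law enforcement"],
    ["john law", "enforcement"],
    ["john law enforcement"],
]

def matches(segment, index):
    """True if the segment equals an indexed value or is a prefix of one."""
    # Assumption: a partial match means a prefix match (equality is a special case).
    return any(value.startswith(segment) for value in index)

def viable(candidates, index):
    """Keep only the interpretations in which every segment matches something."""
    return [c for c in candidates if all(matches(s, index) for s in c)]

print(viable(CANDIDATES, INDEX))
```

With this sample data only the first interpretation survives, because nothing in the stand-in index starts with "enforcement" or with "john law enforcement". Against a real index each `matches` call would be a query instead of a loop, and a typeahead might require whole matches for every segment except the last; neither is shown here.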

Is it possible to run some analyzer/analysis to obtain this behavior, so that I can reliably determine the various possible entities?
