From my research, autocompletion in Elasticsearch (ES) can be implemented in at least five different ways
- Edge Ngrams
- Completion Suggesters
- Prefix queries and Filters
- Phrase Prefix matchers
- Wildcard queries (the worst idea, I know)
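To make the first option concrete, here is a minimal pure-Python sketch of the idea behind edge n-grams. It is not Elasticsearch code: the function names, the `max_gram` default and the sample values are all made up for illustration. The point is only that every lowercased prefix (spaces included) is stored at index time, so a lookup becomes an exact term match instead of a scan.

```python
from collections import defaultdict


def edge_ngrams(text, min_gram=1, max_gram=20):
    """Return the lowercased leading-edge n-grams of text (spaces are kept)."""
    lowered = text.lower()
    upper = min(max_gram, len(lowered))
    return [lowered[:n] for n in range(min_gram, upper + 1)]


def build_index(values, max_gram=20):
    """Map every edge n-gram to the original values that start with it."""
    index = defaultdict(set)
    for value in values:
        for gram in edge_ngrams(value, max_gram=max_gram):
            index[gram].add(value)
    return index


def suggest(index, query):
    """Exact lookup of the lowercased query; no scanning at query time."""
    return sorted(index.get(query.lower(), set()))


# Hypothetical sample data
index = build_index(["New York", "New Delhi", "Newark"])
print(suggest(index, "NEW "))  # values whose lowercased form starts with "new "
```

This trades disk space for query speed, since each value is stored once per prefix. It only covers prefix matching; suffix matching would presumably need the same trick applied to the reversed string, which I have left out.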
I understand that each strategy has its own pros and cons.
Given the following requirements, which would be the best strategy?
Must-have features
- Search should be case-insensitive
- Should not ignore spaces in the search text
- Should be able to search on, and return results for, even a single space character
- Should return all possible values of the field if the search text is an empty string
- Maintain strings as not_analyzed
- Prefix and Suffix matching
- Multi-word query support
Good to have
- Fuzzy matching
- Ignore spelling mistakes
We're currently using a lowercase filter and wildcard queries ($searchterm) on not_analyzed fields, which delivers all the must-have features I mentioned above. This is a bad idea, though, and the queries are getting slower day by day.
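For reference, the current setup looks roughly like the sketch below, written as Python dicts that mirror the JSON request bodies. The analyzer name `lowercase_keyword` and the field name `city` are placeholders, and wrapping the term in `*` on both sides is an assumption based on the prefix and suffix requirement, not a copy of our actual query.

```python
import json

# Assumed index settings: keep the whole value as a single token
# (keyword tokenizer) and lowercase it, so matching is case-insensitive.
SETTINGS = {
    "analysis": {
        "analyzer": {
            "lowercase_keyword": {  # placeholder name
                "type": "custom",
                "tokenizer": "keyword",
                "filter": ["lowercase"],
            }
        }
    }
}


def wildcard_query(field, search_term):
    """Build a wildcard query body for the given field.

    Assumption: the term is lowercased on the client and wrapped in '*'
    on both sides, so an empty term becomes '**' and matches every value.
    """
    pattern = "*" + search_term.lower() + "*"
    return {"query": {"wildcard": {field: {"value": pattern}}}}


print(json.dumps(wildcard_query("city", "New Y"), indent=2))
```

As far as I understand, the leading `*` is the expensive part, because the pattern cannot be narrowed down by a term prefix and many terms have to be examined for every request.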
Our primary objective is performance, and we can sacrifice some disk space for it. Also, most of the fields in a document have a finite set of possible values.
These fields will also be used for terms aggregations and some sorting. Is it possible to keep the fields not_analyzed (keyword) without using "fields" and still achieve autocompletion with all the features I mentioned above?
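To be clear about what I mean by "fields": the multi-field layout I would like to avoid looks roughly like the sketch below, again as Python dicts mirroring the JSON. All names (`autocomplete_edge`, `autocomplete_index`, `autocomplete_search`, the `autocomplete` sub-field) are hypothetical, and it assumes a version recent enough to have the `keyword` type; the exact nesting of the mapping varies between versions.

```python
def autocomplete_settings(max_gram=20):
    """Analysis settings: index-time edge n-grams, plain lowercase at search time."""
    return {
        "analysis": {
            "filter": {
                "autocomplete_edge": {  # hypothetical filter name
                    "type": "edge_ngram",
                    "min_gram": 1,
                    "max_gram": max_gram,
                }
            },
            "analyzer": {
                "autocomplete_index": {
                    "type": "custom",
                    "tokenizer": "keyword",  # whole value as one token, spaces kept
                    "filter": ["lowercase", "autocomplete_edge"],
                },
                "autocomplete_search": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                },
            },
        }
    }


def multi_field_mapping(field):
    """Keyword main field (aggregations, sorting) plus an analyzed sub-field."""
    return {
        "properties": {
            field: {
                "type": "keyword",
                "fields": {
                    "autocomplete": {
                        "type": "text",
                        "analyzer": "autocomplete_index",
                        "search_analyzer": "autocomplete_search",
                    }
                },
            }
        }
    }
```

With this layout the main field stays untouched for aggregations and sorting, while searches would go against the sub-field (for example `city.autocomplete`).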