I have this requirement: build a search-as-you-type solution with Spring and Elasticsearch. There will be 10 million documents in Elasticsearch. Each document is a really simple JSON object like {"id": 9999, "FullName": "..."}. Once the user has typed the third letter, I have to start filtering. The user can keep typing, and the result list should shrink.

According to https://www.elastic.co/guide/en/elasticsearch/reference/7.6/analysis-edgengram-tokenizer.html: "… When you need search-as-you-type for text which has a widely known order, such as movie or song titles, the completion suggester is a much more efficient choice than edge N-grams. Edge N-grams have the advantage when trying to autocomplete words that can appear in any order…"

As far as I can see, my requirement matches the case of "… trying to autocomplete words that can appear in any order", so I would conclude that edge N-grams are my best bet.

Nevertheless, I also read in https://www.elastic.co/guide/en/elasticsearch/reference/7.6/search-as-you-type.html: "… The search_as_you_type field type is a text-like field that is optimized to provide out-of-the-box support for queries that serve an as-you-type completion use case."

That said, I am confused. Based on my business requirement, which approach would you favor? Is the search_as_you_type field datatype complementary to the edge N-gram tokenizer? Can I use both approaches, or should I pick one of them? Would you even favor a third approach? Any tip will be highly appreciated.
To summarize, what is the recommended rule of thumb for this case:
- The user types three letters. I have to return all names that contain those three letters, no matter whether they appear at the beginning, in the middle, or at the end of the name. As the user keeps typing, the result list gets shorter.
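
To make the requirement concrete, here is a minimal sketch in plain Java of how I understand the difference between the two kinds of tokens. It does not use Elasticsearch at all. The class and method names (`NGramSketch`, `ngrams`, `edgeNgrams`) are hypothetical, and as a simplification it treats the whole name as a single lowercase token, which is not exactly how the real tokenizers split text. It only illustrates why I suspect that prefix-only tokens may not cover the "middle or end of the name" part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class NGramSketch {

    // Every substring of length min..max, starting at any position.
    // This mimics what a plain n-gram tokenizer would emit for one token.
    public static List<String> ngrams(String text, int min, int max) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        for (int start = 0; start < s.length(); start++) {
            for (int len = min; len <= max && start + len <= s.length(); len++) {
                out.add(s.substring(start, start + len));
            }
        }
        return out;
    }

    // Only the prefixes of length min..max.
    // This mimics what an edge n-gram tokenizer would emit for one token.
    public static List<String> edgeNgrams(String text, int min, int max) {
        String s = text.toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        for (int len = min; len <= max && len <= s.length(); len++) {
            out.add(s.substring(0, len));
        }
        return out;
    }

    public static void main(String[] args) {
        String name = "Fernando"; // made-up sample value, not real data

        // The user typed "nan", which sits in the middle of the name.
        System.out.println(ngrams(name, 3, 3).contains("nan"));     // true
        System.out.println(edgeNgrams(name, 3, 5).contains("nan")); // false
    }
}
```

If this understanding is right, edge N-grams would only match when the typed letters start a word, while my requirement also needs matches inside a word.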