If there are only a few possible values for a field, say at most 20, is it a good idea to use the edge_ngram tokenizer for autocompletion?
I'm just curious whether the edge_ngram tokenizer has any advantage over other tokenizers, especially when the set of possible values is finite.
I assume that since the values are finite, the n-grams won't consume much disk space regardless of the number of documents, and the index can easily fit in memory (assuming a limited number of fields per document). Is this true?
If there are only 20 values in total, I would probably use a prefix query directly, since the prefix query will be rewritten into a bool query with at most 20 terms, which is very reasonable.
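To make the rewrite concrete, here is a small sketch. The field name `status` and its values are hypothetical, purely for illustration; the `rewrite_prefix` helper simulates how a prefix query over a finite term dictionary expands into a query over only the matching terms:

```python
# Hypothetical value set for a keyword field named "status".
values = ["active", "archived", "approved", "abandoned", "closed"]

def rewrite_prefix(prefix, terms):
    """Simulate the rewrite: a prefix query over a finite term
    dictionary expands to just the terms matching the prefix."""
    return [t for t in terms if t.startswith(prefix)]

# The query body you would send to Elasticsearch:
query = {"query": {"prefix": {"status": {"value": "ar"}}}}

# Conceptually, this expands to a terms query over the matches,
# so its cost is bounded by the number of distinct values (<= 20):
expanded = {"query": {"terms": {"status": rewrite_prefix("ar", values)}}}
print(expanded["query"]["terms"]["status"])  # ['archived']
```

Because the term dictionary is tiny, the expansion can never produce more clauses than there are distinct values, which is why the rewrite stays cheap here.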
Your assumption is correct: the cost of edge n-grams stays low when the number of unique values is small, since the index size depends on the distinct terms rather than on the number of documents.
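A quick sketch of what edge n-gram generation produces for a single term helps show why the index cost is bounded by the number of unique values rather than the number of documents (the `min_gram`/`max_gram` defaults below are illustrative, not the tokenizer's actual defaults):

```python
def edge_ngrams(term, min_gram=1, max_gram=10):
    """Generate edge n-grams (prefixes) of a term, the way an
    edge_ngram tokenizer would, from min_gram to max_gram chars."""
    return [term[:i] for i in range(min_gram, min(len(term), max_gram) + 1)]

print(edge_ngrams("hadoop"))
# ['h', 'ha', 'had', 'hado', 'hadoo', 'hadoop']

# With at most 20 distinct field values, the total number of indexed
# grams is the sum over those 20 terms only; adding more documents
# that reuse the same values adds postings, not new terms.
total_grams = sum(len(edge_ngrams(v)) for v in ["active", "archived"])
print(total_grams)  # 6 + 8 = 14
```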