When I use an analyzer with an edge ngram filter and a synonym filter
at index time, for synonyms defined as "word => synonym", "word"
is not indexed at all.
The behavior differs depending on the order in which the filters are
defined. If the filter list is ["standard", "lowercase", "ngrams",
"synonym"], "word" is indexed as "w", "wo", "wor", "synonym". If
the order of "ngrams" and "synonym" is reversed, the indexed tokens
are: "s", "sy", "syn", ... "synony", "word".
What exactly are you trying to do? Have ngrams applied to the synonyms as
well? In that case it probably makes sense to reverse the order: put the
synonym filter first, and then apply the ngram filter to its output.
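
For illustration, here is a minimal sketch of index settings with the synonym filter placed before the edge ngram filter, written as a Python dict that can be serialized to JSON. The analyzer name "synonym_then_ngrams", the helper build_index_settings, and the min_gram/max_gram values are assumptions made up for this example; the filter names "synonym" and "ngrams" follow the ones used in the question. The "standard" token filter from the original list is left out here, and the type name "edge_ngram" may need to be spelled "edgeNGram" depending on the version in use.

```python
import json


def build_index_settings(min_gram=1, max_gram=10):
    """Return analysis settings whose analyzer runs synonyms before edge ngrams.

    min_gram and max_gram are illustrative values, not taken from the thread.
    """
    return {
        "analysis": {
            "filter": {
                # Rewrites "word" to "synonym" before any ngrams are produced.
                "synonym": {
                    "type": "synonym",
                    "synonyms": ["word => synonym"],
                },
                # Edge ngrams are then generated from the synonym filter's output.
                "ngrams": {
                    "type": "edge_ngram",
                    "min_gram": min_gram,
                    "max_gram": max_gram,
                },
            },
            "analyzer": {
                # Hypothetical analyzer name, chosen for this example only.
                "synonym_then_ngrams": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "synonym", "ngrams"],
                }
            },
        }
    }


settings = build_index_settings()
print(json.dumps(settings, indent=2))
```

The printed JSON is meant to be used as the "settings" part of an index creation request. The only point of the sketch is the order of the "filter" list, where "synonym" comes before "ngrams".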