I am using the shingle and stop token filters, together with trim and filler_token="".
When I run the analyze API on the string "Word1 Word2 StopWord1 StopWord2 Word3 Word4", I get the following shingles, which seem to be correct:
i) "Word1 Word2"
ii) "Word2"
iii) "Word3"
iv) "Word3 Word4"
Is there any way to suppress shingles ii) and iii)?
I have filtered shingles ii) and iii).
When I query using a bool must clause with the same string, I would expect hits that match either or both of shingles i) and iv) only.
Is this expectation correct? Right now I also see hits that match shingle ii).
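
To make the question concrete, here is a minimal Python sketch of how I understand the four shingles to arise. It is a simplified model, not Lucene's actual shingle filter: the function names (`analyze`, `shingles`), the stop-word set, and the rule that a window made only of removed stop words emits nothing are all assumptions chosen to reproduce the output above.

```python
# Simplified model of: stop filter -> shingle (size 2, filler_token="") -> trim.
# Everything here is hypothetical illustration code, not Elasticsearch/Lucene API.

STOP_WORDS = {"stopword1", "stopword2"}  # assumed stop-word list


def analyze(text):
    """Split on whitespace and drop stop words, keeping original positions."""
    return [
        (pos, term)
        for pos, term in enumerate(text.split())
        if term.lower() not in STOP_WORDS
    ]


def shingles(tokens, size=2, filler=""):
    """Return (shingle_text, touches_gap) pairs for fixed-size shingles.

    A removed stop word leaves a position gap; the gap is rendered with
    `filler`, and the result is trimmed, which is how a bare "Word2" appears.
    """
    by_pos = dict(tokens)
    stream = [by_pos.get(p) for p in range(max(by_pos) + 1)]  # None marks a gap
    result = []
    for start in range(len(stream) - size + 1):
        window = stream[start:start + size]
        if all(term is None for term in window):
            continue  # assumption: a window of gaps only emits nothing
        text = " ".join(filler if term is None else term for term in window)
        result.append((text.strip(), any(term is None for term in window)))
    return result


def without_gap_shingles(pairs):
    """Keep only shingles built entirely from real tokens."""
    return [text for text, touches_gap in pairs if not touches_gap]


pairs = shingles(analyze("Word1 Word2 StopWord1 StopWord2 Word3 Word4"))
print([text for text, _ in pairs])   # all four shingles, i) to iv)
print(without_gap_shingles(pairs))   # only the shingles I want to keep
```

In this model, shingles ii) and iii) are exactly the ones whose window overlaps a removed stop word, so dropping every shingle that touches a gap would leave only i) and iv). With filler_token="" followed by trim, that information is no longer visible in the token text itself, which may be why they are hard to filter afterwards.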