Of course, the regular expressions will be used to match the token
separators, not the tokens themselves.
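
To illustrate what matching separators (rather than tokens) means, here is a minimal Python sketch. It is not Elasticsearch code. The separator pattern and the `tokenize` helper are assumptions made up for the example: the pattern treats only whitespace and commas as separators, so characters such as "+" and "@" stay inside the tokens.

```python
import re

# Hypothetical separator pattern: only whitespace and commas split tokens.
# Anything not matched here (for example "+" or "@") remains part of a token.
SEPARATORS = re.compile(r"[\s,]+")


def tokenize(text):
    """Split text wherever the separator pattern matches; drop empty strings."""
    return [tok for tok in SEPARATORS.split(text) if tok]


print(tokenize("c++ user@example.com, foo"))
```

The point of the design is that you list the characters you want to break on; everything else is kept, which is usually easier than trying to describe every valid token.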
And, of course, it does involve regular expressions, which are the tool of
very last resort. But in this case, it seems like the only option you
have. To make your life easier, I highly recommend that you find a tool
that can analyze an arbitrary string. This would let you change and
fine-tune the analyzer. I wrote my own command-line interface and use it to
develop and to regression-test custom analyzers, and it has been very
helpful.
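
As a rough idea of what such a tool could look like, here is a small stand-alone Python sketch. It is not the command-line interface mentioned above and it does not talk to Elasticsearch; it only applies a separator regex locally. The names `analyze`, `run_regression`, `DEFAULT_PATTERN`, and the regression cases are all hypothetical and chosen for illustration.

```python
import argparse
import re

# Hypothetical default: split on whitespace and commas only.
DEFAULT_PATTERN = r"[\s,]+"

# Hypothetical regression cases: input string -> expected tokens.
REGRESSION_CASES = {
    "c++ and c#": ["c++", "and", "c#"],
    "mail user@example.com, please": ["mail", "user@example.com", "please"],
}


def analyze(text, pattern=DEFAULT_PATTERN):
    """Split text wherever the separator pattern matches; drop empty tokens."""
    return [tok for tok in re.split(pattern, text) if tok]


def run_regression(pattern=DEFAULT_PATTERN):
    """Return (input, expected, actual) for every regression case that fails."""
    failures = []
    for text, expected in REGRESSION_CASES.items():
        actual = analyze(text, pattern)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Try a separator pattern on an arbitrary string.")
    parser.add_argument("text", nargs="?",
                        help="string to analyze; omit to run the regression cases")
    parser.add_argument("--pattern", default=DEFAULT_PATTERN,
                        help="regular expression that matches token separators")
    args = parser.parse_args(argv)

    if args.text is not None:
        print(analyze(args.text, args.pattern))
        return 0

    failures = run_regression(args.pattern)
    for text, expected, actual in failures:
        print(f"FAIL {text!r}: expected {expected}, got {actual}")
    passed = len(REGRESSION_CASES) - len(failures)
    print(f"{passed} of {len(REGRESSION_CASES)} cases passed")
    return 1 if failures else 0


if __name__ == "__main__":
    main()
```

Saved under a name of your choosing, it can be run with a string argument to see the tokens, or with only `--pattern` to check a changed pattern against the saved cases. Avoid capturing groups in the pattern, because `re.split` would then return the separators as well.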
Brian
On Monday, August 26, 2013 7:27:04 AM UTC-4, Ankit Jain wrote:
Hi All,
We need the list of characters (like "+","@",..) that the standard analyzer
uses to tokenize the input value.